UBC-MDS / 522-Group_30-Rockstars

DSCI 522 Data Science Workflows team project
MIT License
0 stars 8 forks source link

Peer Review Feedback (from Rahul Kuriyedath) #53

Closed rahulkuriyedath closed 3 years ago

rahulkuriyedath commented 3 years ago

Hi Group 30,

Please find below my feedback for the work you have done until milestone 3. :

  1. I faced the below error when I ran ‘make all’ in the terminal:
    
    rahulkuriyedath@Rahuls-MBP:~/Desktop/Labs/522-Group_30-Rockstars$ make allpython src/download_data.py --url=https://data.strathcona.ca/api/views/c9fr-ivqf/rows.csv?accessType=DOWNLOAD \
    --out_file=data/2018_Property_Tax_Assessment.csv
    python src/data_cleaning.py --in_file=data/2018_Property_Tax_Assessment.csv \
    --out_file1=data/2018_Property_Tax_Assessment_clean.csv \
    --out_file2=data/2018_Property_Tax_Assessment_clean_train.csv \
    --out_file3=data/2018_Property_Tax_Assessment_clean_test.csv
    python src/eda_script.py --in_file1=data/2018_Property_Tax_Assessment_clean.csv \
    --in_file2=data/2018_Property_Tax_Assessment_clean_train.csv --output_file=results/ 
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 25605 entries, 0 to 25604
    Data columns (total 12 columns):
    #   Column      Non-Null Count  Dtype  
    ---  ------      --------------  -----  
    0   YEAR_BUILT  25597 non-null  float64
    1   ASSESSCLAS  25605 non-null  object 
    2   BLDG_DESC   25597 non-null  object 
    3   BLDG_FEET   25605 non-null  int64  
    4   GARAGE      25605 non-null  object 
    5   FIREPLACE   25605 non-null  object 
    6   BASEMENT    25605 non-null  object 
    7   BSMTDEVL    25605 non-null  object 
    8   ASSESSMENT  25605 non-null  int64  
    9   LATITUDE    25605 non-null  float64
    10  LONGITUDE   25605 non-null  float64
    11  AGE         25597 non-null  float64
    dtypes: float64(4), int64(2), object(6)
    memory usage: 2.3+ MB
    SyntaxError: Unexpected token ']'
    at Function (<anonymous>)
    at Object.field (/Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-util/build/vega-util.js:110:7)
    at Object.getField [as parse] (/Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-runtime/build/vega-runtime.js:132:47)
    at parseParameter (/Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-runtime/build/vega-runtime.js:80:18)
    at /Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-runtime/build/vega-runtime.js:65:42
    at Array.map (<anonymous>)
    at parseParameters (/Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-runtime/build/vega-runtime.js:65:17)
    at parseOperatorParameters (/Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-runtime/build/vega-runtime.js:220:9)
    at /Users/rahulkuriyedath/opt/miniconda3/envs/571/lib/vega-cli/node_modules/vega-runtime/build/vega-runtime.js:320:7
    at Array.forEach (<anonymous>)
    python src/housing_assessment_prediction.py \
    --in_file1=data/2018_Property_Tax_Assessment_clean_train.csv \
    --in_file2=data/2018_Property_Tax_Assessment_clean_test.csv \
    --out_file1=results/validation_table.csv --out_file2=results/test_score.csv --out_file3=results/coefficients_table.csv
    Rscript -e "rmarkdown::render('doc/strathcona_housing_price_predict_report.Rmd')"

processing file: strathcona_housing_price_predict_report.Rmd |.... | 5% ordinary text without R code

|....... | 11% label: setup (with options) List of 1 $ include: logi FALSE

Quitting from lines 12-16 (strathcona_housing_price_predict_report.Rmd) Error in library(reticulate) : there is no package called 'reticulate' Calls: ... withCallingHandlers -> withVisible -> eval -> eval -> library

Execution halted make: *** [doc/strathcona_housing_price_predict_report.Rmd] Error 1



2. In the readme.MD file in the root directory, in the ‘about’ section the first paragraph seems to be talking about breast cancer prediction while your project title says housing price prediction. Probably this was just a miss, and you could easily correct it :) 

3. In the dependency section, it might be nice to include all the dependencies in a yaml file as it will make the process of reproducing the environment easier.

4. There is no section in the readme that tells how I should run the make file commands. You could add a line in the readme something like - ‘to reproduce this project, run the command `make all` in the terminal ‘.

5. For the plots scatter.png and barchart.png, the render looked a little funny in my local system (it had a dark background with dark text which made it a little difficult to read). But I previewed the same on the github repo and it looked good, just thought I will give you a heads up in case you want to look into it.

6. In the strathcona_housing_price_predict_report.md, I see an empty references section but it is not populated with any details

I hope the above points help you in polishing what otherwise looks like a great project! Please let me know if you need any clarifications. 
danielon-5 commented 3 years ago

Hi Rahul,

We appreciate your great feedback and kind words. I was also struggling with the reticulate library. It should be all fixed with the new docker image. We will also address all your other comments.

Thank you for your time and for collaborating to this project! =)