Glenn032787 opened 1 year ago
Feedback implemented:
- 1.2 Split functions: multiple changes made. One example: link to commit
- 1.3 Function documentation added: link to commit
- 3.1 Instructions updated in README: link to commit
- Links provided in submission
- Rmd knit to HTML: link to commit
Congratulations on finishing milestone 2! We can see you put a lot of work into this project, nice work! Below we list some specific feedback you can use to improve your project.
We provide tick boxes for you to use in the future as you address these concerns to improve the final grade of your project. If anything is unclear, please feel free to ask questions in this issue thread.
1. Analysis code is abstracted to four (or more) scripts
1.1 Analysis code is abstracted to four (or more) scripts which take command line arguments and save analysis artifacts (e.g., figures and data for tables).
Good Job!
1.2 Report is written in a literate code document, and the code in the literate code document primarily functions to source analysis artifacts and format the document.
[x] proposal.md file has been created and moved to the doc directory. -2 mechanics Comment: The proposal from milestone 1 should be put into a proposal.md file in doc.
1.3 Code is abstracted to functions with good function design
[x] There are cases of functions that clearly do more than one thing. -1 reasoning
[x] There are cases of functions that are too big and could be improved by defining utility functions. -1 reasoning
Comment: You should separate the code in main() into smaller utility functions to make the code more readable. One function is currently doing too much.
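As a rough illustration of the refactor being asked for, the sketch below splits a monolithic main() into single-purpose helpers. The function and field names (load_rows, drop_missing, mean_of) are hypothetical, not taken from the actual repository:

```python
# Hypothetical sketch: breaking a monolithic main() into small utilities.
# Each helper does one thing and can be unit-tested in isolation.

import csv


def load_rows(path):
    """Read a CSV file into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def drop_missing(rows):
    """Drop rows where any field is empty or the '?' placeholder."""
    return [r for r in rows if all(v not in ("", "?") for v in r.values())]


def mean_of(rows, column):
    """Return the mean of a numeric column."""
    values = [float(r[column]) for r in rows]
    return sum(values) / len(values)


def main(path, column):
    """Orchestrate the pipeline; main() only wires the steps together."""
    rows = drop_missing(load_rows(path))
    return mean_of(rows, column)
```

With this shape, main() reads as a summary of the analysis, and each step can be documented and tested separately.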
1.4 Functions are well documented
Good Job!
1.5 Code is called in the analysis
Good Job!
(challenging) 1.6 Tests are written for each function, and work as expected - Total 7 points if there are tests for all the functions and the tests run
From these 7 points, remove points if:
[x] There are cases of functions with tests that fail. -3 accuracy
[x] There is no documentation. It is not clear what the tests are testing. -3 quality
[x] There is commented out code. -1 quality
Comment: No tests were written.
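To address the missing tests and the "not clear what the tests are testing" deduction in one go, each test can carry a docstring stating the behaviour it checks. A hedged pytest-style sketch, where drop_missing is a hypothetical stand-in for one of the project's real helpers:

```python
# Hypothetical pytest-style tests; drop_missing is an illustrative stand-in
# for a real project function, not code from the actual repository.


def drop_missing(rows):
    """Drop rows where any field is empty or the '?' placeholder."""
    return [r for r in rows if all(v not in ("", "?") for v in r.values())]


def test_drop_missing_removes_placeholder_rows():
    """Rows containing '?' or empty strings should be filtered out."""
    rows = [{"a": "1"}, {"a": "?"}, {"a": ""}]
    assert drop_missing(rows) == [{"a": "1"}]


def test_drop_missing_keeps_complete_rows():
    """Rows with all fields present should pass through unchanged."""
    rows = [{"a": "1", "b": "2"}]
    assert drop_missing(rows) == rows
```

Files named test_*.py are picked up automatically by pytest, so running `pytest` from the repository root would execute all such tests.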
1.9 Dependencies have been updated (so they include the test libraries)
Good Job!
2. Version control
2.1 GitHub Issues
Good Job!
2.2 Follows GitHub flow version control workflow
Good Job!
3. Project file and directory structure organization
Good Job!
3.1 Analysis can be run reproducibly using documentation provided in the README
[x] Clearly specify usage instructions to the point that a user could copy and paste them and they would work. If a specific value needs to be inserted, suggest the most reasonable value and provide the list (or a link to a list) of alternative options. -5 mechanics Comment: You should include the file paths you used in your analysis so the user can copy and paste the commands into a terminal to reproduce it (i.e., python src/download_data.py --url="https://archive.ics.uci.edu/ml/machine-learning-databases/credit-screening/crx.data" --out_path=data --filename=crx.csv) rather than using placeholders. This lets the user see how you generated the data/crx.csv file.
[x] Could not reproducibly run the analysis because the code fails. -5 accuracy Comment: Missing "python" in front of the commands in the usage (i.e., python src/pre_process_crx.py). I think there is a mistake in the Test best model file in the usage (it should be model_test_script.py rather than best_model_credit_card.py). I also had issues running src/pre_process_crx.py, as it runs into an error with the header. Make sure to rerun the whole analysis from scratch to confirm you documented everything you did.
4. Submission expectations
[x] https://github.com/UBC-MDS/REPOSITORY/tree/VERSION (replacing REPOSITORY and VERSION with the correct ones for your project version for each milestone). This makes it a lot easier for someone to just visit the repository as it was at that point in the project's history. -3 mechanics
Other comments