UBC-MDS / Credit_Card_Default_Prediction_Group13

Group 13 of DSCI 522 in the year 2022-23
https://ubc-mds.github.io/Credit_Card_Default_Prediction_Group13/doc/report.html

Milestone 2 - Feedback #56

Closed · danielramandi closed this issue 1 year ago

danielramandi commented 1 year ago

Congratulations on finishing milestone 2! We can see you put a lot of work into this project, nice work! Below we list some specific feedback you can use to improve your project. We provide tick boxes for you to use in the future as you address these concerns to improve the final grade of your project. If anything is unclear, please feel free to ask questions in this issue thread.

1. Analysis code and scripts

Comment: The usage section does not provide the proper path to the scripts. It should be “/src/script.py”. Also, is there a reason you are specifically suggesting running the scripts with python3? That can cause issues across platforms. Either clarify why, or just use python. (I do not have python3 as a cmdlet.)

Comment: Overall, please try to add more documentation to your code. The author and date are necessary.

Comment: Refer to the Milestone 2 submission guidelines for details on the proposal.md file.

Comment: The reports in the repository do not have figure/table captions or titles. Also, the Rmd file cannot be rendered. In the Rmd file, the figures have captions, but the tables do not.

Comment: Use inline variables to report the scores in the text of the report.

Comment: Add appropriate exceptions and checks for input values. Some parts of the functions are unclear; add documentation at the beginning stating exactly what each function does. For example, the function that trains an ML algorithm (and the script containing it) needs some documentation at the beginning of the code mentioning which models are being trained.

Comment: Some functions do not have proper tests or are not documented. You can use unit testing to solve this. Here’s a nice article on unit testing. But simply put, you can check for input type discrepancies and raise proper errors manually.

2. Analysis can be run reproducibly using documentation provided in the README

Comment: vl-convert was not part of the reported dependencies, but was required. Also, the names of the packages should be exactly the same as what would be needed to install them, e.g., docopt-ng, pandas-profiling.

Comment: There is no mention of how to render the final report. I tried the Rscript command to render it, but it raised an error.

3. Submission expectations

hcwang24 commented 1 year ago

@flor14 @danielramandi

On behalf of all authors, thank you for your detailed evaluation and valuable feedback on our project “Credit_Card_Default_Prediction_Group13”, release 0.1.0 (https://github.com/UBC-MDS/Credit_Card_Default_Prediction_Group13/tree/v0.1.0). It was an amazing experience building our first team project together on GitHub, and we hope to make it something we can put on our CVs or describe to an interviewer. We have carefully reviewed your comments, but we felt discouraged by the marking. In particular, since our milestone 1 feedback arrived 5 days past the MS2 deadline, some issues from MS1 carried over to this release and had marks deducted again. Below, we provide a point-by-point response to explain some of the choices we made and the circumstances:

“Congratulations on finishing milestone 2! We can see you put a lot of work into this project, nice work! Below we list some specific feedback you can use to improve your project. We provide tick boxes for you to use in the future as you address these concerns to improve the final grade of your project. If anything is unclear, please feel free to ask questions in this issue thread.”

Thank you very much for the words of encouragement. As stated above, we really were interested in carrying out this project.

1. Analysis code and scripts

Comment: The usage section does not provide the proper path to the scripts. It should be “/src/script.py”. Also, is there a reason you are specifically suggesting running the scripts with python3? That can cause issues across platforms. Either clarify why, or just use python. (I do not have python3 as a cmdlet.)

We have changed the script paths to proper relative paths for our newest release. At the time of the first release, the documentation said to run the code from inside the src folder (though we admit this was not very clear). Also, this issue was already deducted in the milestone 1 feedback, which was given to us 5 days past the deadline for the milestone 2 submission. We feel this mark deduction is unfair since we were not able to make the corresponding changes before the submission deadline (https://github.com/UBC-MDS/Credit_Card_Default_Prediction_Group13/issues/44).

To answer the specific use of python3: Python 2 has been end-of-life since 2020, and because Python 3 and Python 2 do not share exactly the same syntax, many popular packages are now developed for Python 3 only. Hence, we wanted to be specific about the Python version to avoid additional problems during installation, since the environment was a required part of this release. We have included a sentence in the README about the use of Python 3. We are wondering if the marks lost on this part can be restored?
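To make this requirement explicit in the code as well, here is a minimal sketch of a version guard that could sit at the top of each script (the error message wording is just an example, not the exact text in our repository):

```python
import sys

# Fail fast with a clear message if the script is run under Python 2.
if sys.version_info < (3, 0):
    raise RuntimeError(
        "This script requires Python 3. "
        "Please invoke it with an interpreter where `python` points to Python 3."
    )
```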

Comment: Overall, please try to add more documentation to your code. The author and date are necessary.

We have added the author, date, and a general description to all our scripts. Thank you.
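For concreteness, the headers follow this general shape (the author, date, and description below are placeholders, not the exact text in the repository):

```python
# Author: Group 13 (placeholder)
# Date: 2022-11-26 (placeholder)
"""Cleans and splits the raw credit card default data.

A short module-level docstring like this states what the script does
before any code appears.
"""
```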

Comment: Refer to the Milestone 2 submission guidelines for details on the proposal.md file.

We feel that this mark deduction is unfair since the proposal.md file is optional according to the milestone 2 expectations. To quote the rubric: “Yes, we are asking you to overwrite your proposal here, and so if you want to keep that in a more findable place than in the Git history, do create a new file in the doc directory to archive it there (e.g., proposal.md).”

Comment: The reports in the repository do not have figure/table captions or titles. Also, the Rmd file cannot be rendered. In the Rmd file, the figures have captions, but the tables do not.

The report rendered on three of the four computers in our group. We have since identified and fixed the bug causing the rendering problem (importing a correlation matrix and formatting it with kable()). All figures and tables include captions; this can be observed at line 125 of the Rmd file (https://github.com/UBC-MDS/Credit_Card_Default_Prediction_Group13/blob/v0.1.0/doc/report.Rmd).

Comment: Use inline variables to report the scores in the text of the report.

Thank you for catching this problem. We have included inline variables in our most recent release.

We missed this requirement. Could you kindly point us to where it is stated that tests must be included? Nevertheless, we have added tests using assert in our current release.
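To illustrate the pattern we used (the module, function, and file names here are hypothetical, not the exact ones in our repository):

```python
from src.preprocessing import clean_data  # hypothetical module and function

def test_clean_data_drops_missing_values():
    # The cleaned table should be returned and contain no missing values.
    result = clean_data("data/raw/sample.csv")
    assert result is not None, "clean_data should return a value"
    assert result.isna().sum().sum() == 0, "cleaned data should have no missing values"
```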

Comment: Add appropriate exceptions and checks for input values. Some parts of the functions are unclear; add documentation at the beginning stating exactly what each function does. For example, the function that trains an ML algorithm (and the script containing it) needs some documentation at the beginning of the code mentioning which models are being trained.

Thank you for pointing this out. We had already been deducted marks for this in one of your earlier comments. Also, regarding function annotations, we followed the style in the example repo by Prof. Timbers (https://github.com/ttimbers/breast_cancer_predictor/tree/v2.0/src) and annotated each major step of the script. We are wondering if this mark deduction can be reversed?
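As a sketch of the kind of input checking and up-front documentation we added to the model-training code (the function name, example models, and messages are illustrative, not the exact code in the script):

```python
def fit_models(X_train, y_train, models):
    """Fit a dictionary of candidate classifiers on the training data.

    Each entry in `models` maps a name (e.g., "dummy", "logreg") to an
    unfitted scikit-learn estimator; the fitted estimators are returned
    under the same names.
    """
    if not isinstance(models, dict):
        raise TypeError("models must be a dict mapping names to estimators")
    if len(X_train) != len(y_train):
        raise ValueError("X_train and y_train must have the same number of rows")
    return {name: model.fit(X_train, y_train) for name, model in models.items()}
```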

Comment: Some functions do not have proper tests or are not documented. You can use unit testing to solve this. Here’s a nice article on unit testing. But simply put, you can check for input type discrepancies and raise proper errors manually.

As noted in our response to one of the earlier comments, we have added tests using assert in our current release. We are wondering if this mark deduction can be reversed, since marks were already deducted for this in your earlier comment?

2. Analysis can be run reproducibly using documentation provided in the README

Comment: vl-convert was not part of the reported dependencies, but was required. Also, the names of the packages should be exactly the same as what would be needed to install them, e.g., docopt-ng, pandas-profiling.

Thank you for mentioning this. We have added an environment.yaml file for easy installation in this release. However, we feel this mark deduction is unfair since the same issue was already deducted in milestone 1, and that feedback arrived 5 days past the deadline for milestone 2 (https://github.com/UBC-MDS/Credit_Card_Default_Prediction_Group13/issues/44).

Comment: There is no mention of how to render the final report. I tried the Rscript command to render it, but it raised an error.

We have fixed the bug that was causing kable() to render one of the tables improperly. Thank you for pointing this out. However, we are wondering how many points in total are assigned to this requirement, and whether a 5-point deduction is too much?

3. Submission expectations

Thank you for pointing this out. We have fixed this in our recent submission. We were confused by the requirement and thought it was asking us to submit the release notes page. Release 0.1.0 can be accessed here: https://github.com/UBC-MDS/Credit_Card_Default_Prediction_Group13/tree/v0.1.0