THF-d8 / Test_repo

a #1

Open THF-d8 opened 1 year ago

THF-d8 commented 1 year ago

a

THF-d8 commented 1 year ago

Data analysis review checklist

Reviewer:

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 1.5

Review Comments:

  1. The project is very well done, with an excellent background introduction, a well-thought-out research approach, and a beautifully written analysis report.
  2. I particularly like that the threshold was selected to maximize recall and keep false negatives low, since false negatives are especially harmful when predicting cervical cancer!
  3. The main function in the model_training.py script is a little long. It might be better to split it into several smaller functions, one per model, and call those functions from main(). Alternatively, the training of each model could live in its own script. There is nothing wrong with the current layout; my suggestion is only meant to improve readability.
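  4. There are some error messages in the cervical_cancer_data_eda.ipynb file saying that background_gradient requires matplotlib. pandas raises this error when matplotlib, an optional dependency of its Styler, cannot be imported, so matplotlib is potentially missing from the environment used to run the notebook.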
  5. It might be good to have some subdirectories in the results folder to keep the files more organized, e.g., one for PR curve files, one for threshold files, etc.
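
To illustrate comment 3, here is a minimal sketch of what a split-up main() could look like. It is not the project's code: the function names, the `TRAINERS` registry, and the toy "models" are all hypothetical stand-ins, written with the standard library only so the structure is easy to see. The real per-model functions would wrap whatever estimators model_training.py currently fits.

```python
from typing import Callable, Dict, List, Tuple

# Toy dataset type: a feature matrix and binary labels (an assumption for this sketch).
Dataset = Tuple[List[List[float]], List[int]]


def train_majority_baseline(data: Dataset) -> dict:
    """Hypothetical stand-in for one model: always predict the most common label."""
    _, y = data
    majority = max(set(y), key=y.count)
    return {"model": "majority_baseline", "prediction": majority}


def train_threshold_rule(data: Dataset) -> dict:
    """Hypothetical stand-in for a second model: cut the first feature at its mean."""
    X, _ = data
    cutoff = sum(row[0] for row in X) / len(X)
    return {"model": "threshold_rule", "cutoff": cutoff}


# One entry per model; adding a model means adding one function and one line here.
TRAINERS: Dict[str, Callable[[Dataset], dict]] = {
    "majority_baseline": train_majority_baseline,
    "threshold_rule": train_threshold_rule,
}


def main(data: Dataset) -> Dict[str, dict]:
    """Short driver: delegate to each per-model function and collect the results."""
    return {name: trainer(data) for name, trainer in TRAINERS.items()}
```

With this layout, main() stays a few lines long regardless of how many models are trained, and each model's logic can be read and tested on its own.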

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.