anasm-17 opened this issue 4 years ago
Please check off boxes as applicable, and elaborate in the comments below. Your review is not limited to these topics, as described in the reviewer guide.
The package includes all the following forms of documentation, with installation instructions in the setup.py file or elsewhere.
Readme requirements
The package meets the readme requirements below. The README should include the required items, from top to bottom.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a paper.md matching JOSS's requirements.
Estimated hours spent reviewing: 1
Hey folks, good job! Your package works, as far as I can tell. Here is some feedback.
Your online documentation is broken. For instance, the index page is empty and the model index page does not exist.
The model_comparison_table function is a bit limiting. Currently, it only supports the default scoring method built into scikit-learn ML models, but in many cases the user will want to use other scoring methods for different ML projects. Please consider supporting those so that the function is useful in more scenarios. In addition, although the description mentions that the function takes in scikit-learn ML models, please also specify that in the argument descriptions.
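For illustration, here is a minimal sketch of one way the function could accept arbitrary scikit-learn scorers via a `scoring` argument; the signature and argument names are hypothetical, not pymlviz's actual API.

```python
# Hypothetical sketch: let the user choose any scikit-learn scorer instead of
# each model's default .score(). Signature and names are illustrative only.
import pandas as pd
from sklearn.metrics import get_scorer


def model_comparison_table(X_train, y_train, X_valid, y_valid,
                           scoring="accuracy", **fitted_models):
    scorer = get_scorer(scoring)  # e.g. "f1", "roc_auc", "neg_log_loss"
    rows = []
    for name, model in fitted_models.items():
        rows.append({
            "model": name,
            f"train_{scoring}": scorer(model, X_train, y_train),
            f"valid_{scoring}": scorer(model, X_valid, y_valid),
        })
    return pd.DataFrame(rows)
```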
The plot_roc function example in the README could be misleading. The first argument takes a fitted model, but the example, plot_roc(SVC(), X_valid, y_valid), suggests otherwise: anyone who runs that code will hit an error because the SVC instance has not been fitted.
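For example, fitting the classifier before passing it avoids the error. This is a minimal sketch assuming the README's plot_roc(model, X_valid, y_valid) signature; the dataset, variable names, and import path are chosen for illustration.

```python
# Minimal corrected-usage sketch: fit the SVC first, then pass the fitted model.
# Dataset and variable names are illustrative; the import path is assumed.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

from pymlviz import plot_roc  # assumed import path

X, y = load_breast_cancer(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=42)

svc = SVC(probability=True)  # probability=True in case plot_roc needs predict_proba
svc.fit(X_train, y_train)

plot_roc(svc, X_valid, y_valid)  # first argument is now a fitted model
```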
The plot_train_valid_error function should take a model instance instead of a model name, so that users can pass in non-default hyperparameters rather than being limited to the models allowed by the function. In its current state, the function has a lot of unrealized potential. Hyperparameters interact with each other, meaning one hyperparameter value may produce a different result with different values of another hyperparameter. If we are only allowed to tune a single hyperparameter on a default model, the function has very limited practicality.
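A hypothetical illustration of the suggested interface: accept a configured estimator and the name of one hyperparameter to sweep, so the user's other hyperparameter choices are preserved. The signature below is a sketch, not pymlviz's actual API.

```python
# Hypothetical sketch of the suggested interface; not pymlviz's implementation.
import matplotlib.pyplot as plt
from sklearn.base import clone


def plot_train_valid_error(model, X_train, y_train, X_valid, y_valid,
                           param_name, param_range):
    """Sweep one hyperparameter on a user-configured model instance."""
    train_err, valid_err = [], []
    for value in param_range:
        m = clone(model).set_params(**{param_name: value})  # other settings untouched
        m.fit(X_train, y_train)
        train_err.append(1 - m.score(X_train, y_train))
        valid_err.append(1 - m.score(X_valid, y_valid))

    fig, ax = plt.subplots()
    ax.plot(param_range, train_err, label="train error")
    ax.plot(param_range, valid_err, label="validation error")
    ax.set_xlabel(param_name)
    ax.set_ylabel("error")
    ax.legend()
    return ax
```

With an interface like this, plot_train_valid_error(RandomForestClassifier(max_depth=5), ..., param_name="n_estimators", param_range=[10, 50, 100, 200]) would vary n_estimators while keeping max_depth=5.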
For the plot functions, perhaps you could allow a basic level of customization of the graphs. For instance, it might be useful to change a graph's size and colors. Passing these options straight through to the plotting method should be fairly easy to do.
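One possible pattern, sketched below, is to expose a couple of figure options and forward any remaining keyword arguments to the underlying matplotlib call; this is illustrative, not the package's current implementation.

```python
# Hypothetical sketch of a plot function that forwards basic styling options
# (figure size, colour, line style, ...) to matplotlib; illustrative only.
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve


def plot_roc(model, X_valid, y_valid, figsize=(8, 6), color="C0", **line_kwargs):
    if hasattr(model, "predict_proba"):
        scores = model.predict_proba(X_valid)[:, 1]
    else:
        scores = model.decision_function(X_valid)

    fpr, tpr, _ = roc_curve(y_valid, scores)
    fig, ax = plt.subplots(figsize=figsize)        # user-controlled size
    ax.plot(fpr, tpr, color=color, **line_kwargs)  # user-controlled style
    ax.plot([0, 1], [0, 1], linestyle="--", color="grey")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.set_title(f"ROC curve (AUC = {auc(fpr, tpr):.2f})")
    return ax
```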
Please check off boxes as applicable, and elaborate in the comments below. Your review is not limited to these topics, as described in the reviewer guide.
The package includes all the following forms of documentation, with installation instructions in the setup.py file or elsewhere.
Readme requirements
The package meets the readme requirements below. The README should include the required items, from top to bottom.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a paper.md matching JOSS's requirements.
Estimated hours spent reviewing: 3 hours
Overall the documentation is clear and complete; however, the docstrings for the model_comparison_table function could be made clearer. The function's examples use terms like lr_model and lr_trained, which do not tell the user exactly what to pass into the function.
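For illustration, a docstring example could spell out what those names refer to, e.g. a fitted LogisticRegression. The call pattern below follows the hypothetical keyword-argument interface sketched earlier in this thread and is not necessarily pymlviz's actual signature.

```python
# Illustrative docstring-style example: lr_trained is simply a fitted
# LogisticRegression; the model_comparison_table signature is hypothetical.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=42)

lr_trained = LogisticRegression(max_iter=1000).fit(X_train, y_train)
svc_trained = SVC().fit(X_train, y_train)

# model_comparison_table as sketched earlier (or imported from pymlviz)
model_comparison_table(X_train, y_train, X_valid, y_valid,
                       lr_model=lr_trained, svc_model=svc_trained)
```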
The unit tests are well thought out and cover different edge scenarios.
I could not install the package from the documentation provided in the README. The package did not install with numpy 2.0.0 and required numpy 1.18.1; however, the dependencies in the README say the package should run with any numpy greater than 1.18.1. Similarly, the package did not install with pandas 2.0.0. It would be good to have the exact dependencies in the README. Ultimately I cloned the package so that I could get it going and test the functions.
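For illustration, one way to make the working versions explicit is to pin them in setup.py and echo the same pins in the README; everything below except the numpy pin is a placeholder, since this review only states which numpy version worked.

```python
# Hypothetical setup.py excerpt; only the numpy pin comes from this review,
# the remaining entries are placeholders.
from setuptools import find_packages, setup

setup(
    name="pymlviz",
    packages=find_packages(),
    install_requires=[
        "numpy==1.18.1",  # version the package installed with, per this review
        "pandas<2.0.0",   # placeholder bound; the working pandas version is not stated
        "scikit-learn",   # assumed dependency, version not discussed
        "matplotlib",     # assumed dependency, version not discussed
    ],
)
```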
I tried the package with the Iris dataset and the results are commendable. All the functions do the job as specified in the documentation. Even though the installation gave me some heartburn, overall it is an impressive package.
Thank you @jackyho112 and @ManishPJoshi for your comments. We have implemented the following changes from your comments:
- Fixed the plot_roc example, addressing @jackyho112's comment.
- Made the docstrings for model_comparison_table clearer, addressing @ManishPJoshi's comment.

Thank you both for taking the time to write meaningful reviews for our package. Although we could not implement all the suggested changes, we tried to implement what was feasible 😄
The updated version of the package can be found here: https://github.com/UBC-MDS/pymlviz/tree/1.2.0
Submitting Authors: Anas Muhammad (@anasm-17), Fanli Zhou (@flizhou), Tao Huang (@taohuang-ubc), and Mike Chen (@miketianchen)
Package Name: pymlviz
One-Line Description of Package: Visualize and compare the performance of machine learning models.
Repository Link: https://github.com/UBC-MDS/pymlviz
Version submitted: 1.1.13
Editor: Varada Kolhatkar (@kvarada)
Reviewer 1: Jacky Ho (@jackyho112)
Reviewer 2: Manish Joshi (@ManishPJoshi)
Archive: TBD
Version accepted: TBD
Description
This package contains four functions that allow users to conveniently plot various visualizations and compare the performance of different classifier models. The purpose of this package is to reduce the time spent developing visualizations and comparing models, speeding up the model creation process for data scientists.
Scope
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.
Target audience: machine learning model developers and data scientists.
@tag the editor you contacted:
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements): "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser, text-based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
Editor and review templates can be found here