UBC-MDS / software-review

MDS Software Peer Review of MDS-created packages

Submission: pyb4model (Python) #28

Open andrealee011 opened 4 years ago

andrealee011 commented 4 years ago

Submitting Author: Andrea Lee (@andrealee011), Jaekeun Lee (@agdal1125), Sakariya Aynashe (@eyrakas), Xinwen Wang (@xiw315)
Package Name: pyb4model
One-Line Description of Package: A Python package to preprocess data and conduct machine learning
Repository Link: pyb4model
Version submitted: v1.1.0
Editor: Varada Kolhatkar (@kvarada)
Reviewer 1: Cheng Min (@marvinmin)
Reviewer 2: Ke Xin Zhao (@Margaret8521)
Archive: TBD
Version accepted: TBD


Description

Scope

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication options

JOSS Checks

- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements): "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option allows reviewers to open smaller issues that can then be linked to PRs, rather than submitting a denser text-based review. It also lets you demonstrate that an issue has been addressed via PR links.

Code of conduct

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

marvinmin commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Functionality

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 2 hours

Review Comments

General Comments: I think you have done a good job with this package, and it addresses a specific need. There are some issues with the package's documentation and functionality:

  1. The continuous integration badge is missing from the README. Would you please add it?
  2. There is no citation or credits information at the end of the README. If there is any, please include it.
  3. The docstring styles are not consistent across functions. For example, in the docstring for the missing_val function, the style for parameters is:

     image

     But in the docstring for the ForSelect function, the style for parameters is:

     image

  4. The README examples do not import all the packages they use, so the examples for the fit_and_report and ForSelect functions fail, as shown in the following screenshots:

     image image image

  5. Some variables used in the README examples are not defined, so the examples for the missing_val, ForSelect and feature_splitter functions fail, as shown in the following screenshots:

     image image image

  6. The error message raised by the missing_val function is not consistent with its code, as shown in the following screenshot:

     image
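To make the docstring-style and error-message points above concrete, here is a minimal sketch of what I have in mind. It assumes NumPy-style docstrings (any single style used everywhere would do), and it builds the error message from the same values the validity check uses, so the two cannot drift apart. The function `fill_missing`, its parameters, and `ALLOWED_METHODS` are hypothetical and are not pyb4model's actual API.

```python
from statistics import mean

# Hypothetical list of supported strategies, used by both the check and the message.
ALLOWED_METHODS = ("delete", "mean")


def fill_missing(values, method):
    """Handle missing entries (None) in a list of numbers.

    Parameters
    ----------
    values : list of float or None
        Input data; None marks a missing entry.
    method : str
        One of "delete" or "mean".

    Returns
    -------
    list of float
        Data with missing entries removed or replaced by the mean.

    Raises
    ------
    TypeError
        If `values` is not a list.
    ValueError
        If `method` is not an allowed method, or if "mean" is requested
        and every entry is missing.
    """
    if not isinstance(values, list):
        raise TypeError("values must be a list")
    if method not in ALLOWED_METHODS:
        # The message is generated from the tuple that the check uses.
        raise ValueError(f"method must be one of {ALLOWED_METHODS}, got {method!r}")

    present = [v for v in values if v is not None]
    if method == "delete":
        return present
    if not present:
        raise ValueError("cannot compute a mean when every entry is missing")
    fill = mean(present)
    return [fill if v is None else v for v in values]
```

With this layout, every function documents its parameters the same way, and an invalid `method` always produces a message that lists the methods the code actually accepts.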

If it is possible, please address these issues and I'll review the package again. Thanks.

Margaret8521 commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Functionality

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 3 hours

Review Comments

Compliments:

Suggestions:

1) It might be useful to add usage results in the README.md file. This will help users to see the results from your functions directly and enhance their understanding of your package.

2) For function missing_val:

[Screenshot 2020-03-21, 3:54:18 PM] [Screenshot 2020-03-21, 3:55:57 PM]

3) For function ForSelect:

[Screenshot 2020-03-21, 4:00:28 PM] [Screenshot 2020-03-21, 4:01:30 PM] [Screenshot 2020-03-21, 4:02:57 PM]

4) For function feature_splitter:

[Screenshot 2020-03-21, 4:05:30 PM]

5) Some minor things:

[Screenshot 2020-03-21, 4:07:31 PM]
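On suggestion 1, one possible way to show usage results and keep them correct is to write each example as a doctest, with its own data definition and expected output, and run it automatically. The sketch below assumes this approach; `count_missing` is a hypothetical stand-in and not a pyb4model function.

```python
import doctest


def count_missing(values):
    """Count missing entries (None) in a list.

    Examples
    --------
    >>> data = [4.0, None, 7.5, None]
    >>> count_missing(data)
    2
    """
    return sum(v is None for v in values)


# Collect and run the examples embedded in the docstring.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(count_missing, name="count_missing",
                        globs={"count_missing": count_missing}):
    runner.run(test)

# Summary of how many examples ran and how many failed.
results = runner.summarize(verbose=False)
```

Because the example defines `data` itself and states its output, a reader can copy it as-is, and a failing example is caught before release.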

Great work! I hope these suggestions help. It was fun trying out your package!

andrealee011 commented 4 years ago

Thank you for the feedback! It was extremely helpful! Because of time constraints, we were unable to address all of it. Here is a list of the changes we did make:

Please find our new release here: https://github.com/UBC-MDS/pyb4model/tree/v1.1.1

eyrakas commented 4 years ago

Thank you, @marvinmin and @Margaret8521, for your valuable feedback. We have made the changes outlined above by @andrealee011 and @agdal1125. I want to add one more comment about feature_splitter and the use of one-hot encoding (OHE). The function is intended to help users see what kind of data they are dealing with, i.e. which features are numerical and which are categorical. It is meant as a quick EDA step, not necessarily as full data preprocessing. We have, however, changed its output to a data frame, which is more presentable to users.
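As a rough illustration of that intent, the sketch below sorts column names into numerical and categorical groups. It is not the package's implementation: `split_features` is a hypothetical name, and it takes a plain dict of column name to values and returns a dict (rather than working with data frames) so that it runs without any dependencies.

```python
from numbers import Number


def split_features(columns):
    """Group column names by whether their values look numerical or categorical.

    `columns` maps a column name to a list of values; None marks a missing value.
    """
    numerical, categorical = [], []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        # Treat a column as numerical only if every present value is a number.
        # bool is excluded because it is a subclass of int in Python.
        is_numeric = bool(present) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in present
        )
        (numerical if is_numeric else categorical).append(name)
    return {"numerical": numerical, "categorical": categorical}


example = {
    "age": [31, 45, None],
    "city": ["Vancouver", "Toronto", "Vancouver"],
    "income": [52000.0, 61000.5, 58000.0],
}
summary = split_features(example)
```

Here `summary` lists `age` and `income` as numerical and `city` as categorical, which is the kind of quick overview described above; any encoding of the categorical columns would be a separate step.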