Open zzhzoe opened 2 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
setup.py file or elsewhere.
Readme requirements
The package meets the readme requirements below:
The README should include, from top to bottom:
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best practices. In general please consider:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md matching JOSS's requirements with:
Estimated hours spent reviewing: 1.5hrs
This is an amazing package; if developed to its full potential, I am sure it can save a lot of time for data scientists and business analysts. It is stated that around 60% of the time of any project goes into cleaning the data. If this time is saved, people can focus on building models and making predictions/inferences. I see huge potential in this package; however, I have a few concerns:
Overall, I was able to install the package and use it on a few toy datasets. All the functions within the package work well and as expected. Kudos to the team for coming up with and building this package. I am sure that, if developed to the fullest, this package can rock the data-science world!!
Estimated hours spent reviewing: 1 hr
Congratulations, team, on coming up with a very useful package idea and successfully delivering it!
Good job, team! Thank you for coming up with such a useful package; I can't wait to use it in a real project later. However, I still have a few suggestions to make this function more versatile:
Estimated hours spent reviewing: 1 hour
Great work! You created a package with so many functions. Some of them could be useful for actual EDA work in the future. Some thoughts on potential improvements:
StandardScaler() or OneHotEncoder()
Great work with defensive programming. All functions were very well thought out in terms of edge cases and which warnings/errors should be raised.
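To make the defensive-programming point concrete, here is a minimal sketch of the kind of up-front input checking being praised. It is not taken from simplefit: the helper name `check_fit_inputs`, its parameters, and the list-of-dicts data layout are all hypothetical, chosen only so the example runs with the standard library.

```python
def check_fit_inputs(rows, target):
    """Validate inputs before any modelling starts.

    Hypothetical helper for illustration only; not part of simplefit.
    `rows` is assumed to be a list of dicts, one dict per observation.
    """
    # Wrong types are reported as TypeError with a message naming the argument.
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        raise TypeError("rows must be a list of dicts (one dict per observation)")
    if not isinstance(target, str):
        raise TypeError("target must be a string column name")

    # Right type but unusable values are reported as ValueError.
    if not rows:
        raise ValueError("rows is empty; nothing to fit")
    missing = [i for i, r in enumerate(rows) if target not in r]
    if missing:
        raise ValueError(f"target column {target!r} is missing in rows {missing}")
    return True


# Valid input passes; an edge case fails early with a readable message.
check_fit_inputs([{"age": 31, "churn": 0}, {"age": 45, "churn": 1}], "churn")
try:
    check_fit_inputs([{"age": 31}], "churn")
except ValueError as err:
    print(f"Rejected: {err}")
```

Separating `TypeError` (wrong kind of argument) from `ValueError` (right kind, unusable content) keeps the error messages actionable for users who call the function from a notebook.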
Submitting Author Name: Zihan Zhou @zzhzoe
Package Name: simplefit
Editor: Mohammadreza Mirzazadeh @rezam747 Navya Dahiya @nd265 Sanchit Singh @Sanchit120496
One-Line Description of Package: A python package that cleans the data, does basic EDA and returns scores for basic classification and regression models.
Repository Link: https://github.com/UBC-MDS/simplefit
Version submitted: v0.1.4
Editor: TBD
Reviewer 1: Abhiket Gaurav @Abhiket
Reviewer 2: Sufang Tan @Kendy-Tan
Reviewer 3: Lakshmi Santosha Valli Akella @valli180
Reviewer 4: Pavel Levchenko @plevchen
Archive: TBD
Version accepted: TBD
Description
This package helps data scientists clean their data, perform basic EDA, visualize the results, and analyse the performance of a baseline model and of basic classification or regression models, namely Logistic Regression and Ridge, on their data.
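As a rough illustration of the workflow described above, the sketch below compares a baseline model with Logistic Regression using scikit-learn directly. It does not use or reproduce simplefit's own API: the `compare_models` helper, the synthetic data, and the choice of 5-fold cross-validation are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a cleaned, all-numeric training set.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

models = {
    # Baseline: always predicts the most frequent class.
    "baseline": DummyClassifier(strategy="most_frequent"),
    # Scaling and the classifier are bundled so scaling is refit per fold.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}


def compare_models(models, X, y, cv=5):
    """Return mean train and validation accuracy per model (hypothetical helper)."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, return_train_score=True)
        results[name] = {
            "train_score": float(scores["train_score"].mean()),
            "test_score": float(scores["test_score"].mean()),
        }
    return results


results = compare_models(models, X, y)
for name, scores in results.items():
    print(name, scores)
```

Reporting the baseline next to the fitted model makes it easy to see whether the model has learned anything beyond the class frequencies.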
Scope
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.
This package offers utility functions that provide basic EDA visualizations, including histograms, a SPLOM plot and a correlation plot. We provide a package that saves 90% of the time spent writing the same code for different graphs and comparing the scores of models.
Any entry-level data professional who would like to conduct a quick exploratory data analysis. A data scientist spends a lot of time writing the same boilerplate code for data processing, transformations, fitting models and comparing their performance.
Of course, EDA is not a new topic for data scientists. There are quite a few packages on PyPI that do similar work. However, most of them offer only limited functionality, such as descriptive statistics alone. Our package helps data scientists clean their data, perform basic EDA, visualize the results, and analyse the performance of a baseline model and of basic classification or regression models, namely Logistic Regression and Ridge, on their data.
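For context, the numbers behind a correlation plot can be computed in a few lines of pandas. This is a generic sketch on made-up toy data, not simplefit's implementation; the column names and the long-format reshaping step are assumptions for illustration.

```python
import pandas as pd

# Toy numeric data: "y" is exactly 2 * "x"; "z" tends to fall as "x" rises.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "z": [5.0, 3.0, 4.0, 1.0, 2.0],
})

# Pairwise Pearson correlations between the numeric columns.
corr = df.select_dtypes("number").corr()

# Long format (one row per column pair) is what most heatmap tools expect.
corr_long = corr.stack().rename("correlation").reset_index()
corr_long.columns = ["var1", "var2", "correlation"]

print(corr.round(2))
```

The wide matrix is handy for reading values directly, while the long table can be passed to a plotting library of choice to draw the heatmap.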
@tag the editor you contacted:
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser text-based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
Editor and review templates can be found here