Open cmmclaug opened 3 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
setup.py
file or elsewhere.Readme requirements The package meets the readme requirements below:
The README should include, from top to bottom:
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md
matching JOSS's requirements with:
Estimated hours spent reviewing: 2 hours
Hi aridanalysis team,
Congratulations on completing your package. I think the team did a great job there and I really like the idea behind your package and how this can make data scientist life easy with this package. I enjoyed reviewing your work. Below you can find my suggestions which I hope they maybe helpful for you guys:
Some suggestions I found in the package:
docs
badge into the read the Docs
websiteaa.aridanalysis.arid_eda()
Congrats again and well done.
Zhiyong
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
setup.py
file or elsewhere.Readme requirements The package meets the readme requirements below:
The README should include, from top to bottom:
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md
matching JOSS's requirements with:
Estimated hours spent reviewing: 1.5h
Hi aridanalysis team,
Congratulations on completing your package! I really enjoy reviewing the package. Below you can see some suggestions which I hope will be useful for further improvement:
from aridanalysis import aridanalysis as aa
rather than import aridanalysis as aa
. It might be better if you can clear it in the usage
instructionusage
: in usage
session in README.md, there is a single quote is missing in line 5, a blank is redundant in the last df, and some inputs should be a string rather than a variable. Also, it might be better if you can put the expected output in usage
session. arid_linreg
function, an error was reported: name 'y' is not defined
. The error is because we should input a list of str here. It might be better if you can use a correct example to replace the old one in usage
and add proper error message to indicate this type error.alt.HConcatChart(...)
Hi @wiwang, thank you for taking the time to help us review our package! I will attempt to address your comments below:
The package functions are elaborate and easy to understand, in particular I love to see your team's motivation to create this package
I'm glad you found our descriptions and motivation clear! It was certainly important to clearly define our packages role within the Python regression ecosystem!
It is always good to see related packages with yours, so I can see the difference, and pros and cons.
Thanks! Again with a crowded Python regression package ecosystem we thought it was important to delineate our package niche within the landscape.
I don't see a docs when I click the docs badge into the read the Docs website
Great catch! This is an important one, I've created an issue to get to the bottom of this: https://github.com/UBC-MDS/aridanalysis_py/issues/89
In the usage section, actually I can not run through all sample codes after I install this package:
- maybe you need to call functions like aa.aridanalysis.arid_eda()
- house_prices data set are not imported
- maybe it is better to show the output of all sample function executions like: display the EDA plots and show the regression results
Thanks for helping iron out these discrepancies. You're right, it looks like from the other reviewer we need to update our vignette import statement! Issue created to resolve this here: https://github.com/UBC-MDS/aridanalysis_py/issues/90. We should also add a line to import the house_prices dataset, I've appended to that same issue.
the comments in tests are not as detailed as source code
This is a good point and something to consider as a future improvement to our test suite!
Thanks again for helping us improve aridanalysis_py! If you have any further questions about our package, don't be shy!
Hello @yhchen20, thank you for reviewing our package! I'll try to respond to your insights below:
I follow the instruction in the README.md to install the package and run the examples first, but the function reported an error that "module 'aridanalysis' has no attribute 'arid_eda'". When we use the pacakge, we should use from aridanalysis import aridanalysis as aa rather than import aridanalysis as aa. It might be better if you can clear it in the usage instruction
Agreed! We have an issue here to fix this: https://github.com/UBC-MDS/aridanalysis_py/issues/90
It might be better if you can test the workflow in usage: in usage session in README.md, there is a single quote is missing in line 5, a blank is redundant in the last df, and some inputs should be a string rather than a variable. Also, it might be better if you can put the expected output in usage session.
Thanks for including such detailed analysis! I've added the vignette suggestion fixes to the previous issue!
When I use the example to test arid_linreg function, an error was reported: name 'y' is not defined. The error is because we should input a list of str here. It might be better if you can use a correct example to replace the old one in usage and add proper error message to indicate this type error.
The vignette seems to be out of date! We're going to combine all of the vignette resolution fixes into https://github.com/UBC-MDS/aridanalysis_py/issues/90.
Great job with setting up so many tests and have a coverage rate higher than 90%! However, it might be better to add more tests of input test in the main function. If the input is incorrect, the function could stop early.
Maybe some clarification on this comment is necessary. Are you suggesting adding more defensive checks to the functions?
It might be better if the output plot could show automatically - the current output is alt.HConcatChart(...)
We will have to discuss this to get to the bottom of the reasoning behind this; this is certainly a valid critique!
Thank you again for your efforts in providing feedback. We will use your input to help polish our package!
Submitting Authors:
Package Name: aridanalysis One-Line Description of Package: DRY out your regression analysis! Repository Link: https://github.com/UBC-MDS/aridanalysis_py Version submitted: 0.4.2 Editor: Tiffany Timbers (@ttimbers) Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
Version accepted: TBD
Description
As Data Scientists, being able to perform Exploratory Data Analysis as well as Regression Analysis are paramount to the process of analyzing trends in data. Moreover, following the DRY (Do Not Repeat Yourself) principle is regarded as a majority priority for maximizing code quality. Yet, often times Data Scientists facing these tasks will start the entire process from scratch, wasting both time and effort while compromising code quality. The aridanalysis package strives to remedy this problem by giving users an easy-to-implement EDA function alongside 3 robust statistical tests that will simplify these analytical processes and produce an easy to read interpretation of the input data. Users will no longer have to write many lines of code to explore their data effectively.
Scope
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.
Explain how the and why the package falls under these categories (briefly, 1-2 sentences):
arid_eda
function to produce a number of data visualizations given the data.Who is the target audience and what are scientific applications of this package?
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or
@tag
the editor you contacted:Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: Do not submit your package separately to JOSS*Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
P.S. *Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
Editor and review templates can be found here