AndyYang80 opened this issue 2 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

The package includes all the following forms of documentation, including installation instructions in the `setup.py` file or elsewhere.

Readme requirements: The package meets the readme requirements below; the README should include the expected sections, from top to bottom.

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a `paper.md` matching JOSS's requirements, with a high-level description in the package root or in `inst/`.
Estimated hours spent reviewing: 1.5hr
The name `LR_assumption_test` is used, while the README refers to the package by the abbreviation `lrasm`. A consistent package name would definitely make it easier for others to refer to and learn about your package. […] `homoscedasticity_test` would be super helpful.
```python
from lrasm.homoscedasticity_tst import homoscedasticity_test
from lrasm.multicollinearity_tst import multicollinearity_test
from lrasm.normality_tst import normality_test
from sklearn import datasets
import pandas as pd
import matplotlib.pyplot as plt

# Load the iris data and split it into features and a response column
data = datasets.load_iris()
iris_df = pd.DataFrame(data=data.data, columns=data.feature_names)
X = iris_df.drop("sepal width (cm)", axis=1)
y = iris_df["petal width (cm)"]

# Run the three assumption checks
p_value, res = normality_test(X, y)
corr_df, plot = homoscedasticity_test(X, y)
vif_df = multicollinearity_test(X, VIF_thresh=10)
```
2.1. There is a small typo: `djustments` should be `adjustments`, and there is no full stop at the end of the sentence; and
2.2. I would suggest classifying your messages into two levels, say "warning" and "error", or "moderate" and "severe". You already distinguish `possible` multicollinearity from `definite` multicollinearity; perhaps that distinction could serve as the title of the message so that it is more readable;
3. While a short package name has its advantages, it is a bit cryptic and not easy to remember. Perhaps it could have been called something like `lr-regressor-testing`, just my two cents;
4. The function `homoscedasticity_test()` should take an argument for the significance level "alpha". Also, the p-value is currently shown to only 1 decimal place; it should be shown to at least 3 decimal places for common hypothesis-testing purposes (see the first sketch after this list);
5. For the function `normality_test()`, it is good to have such a function for easy testing, but I would suggest that the threshold (optional argument) should be shown in the example as well;
6. `multicollinearity_test` is beautifully done; the summary data frame of all the VIFs is good. As an idea for future development, perhaps the function could offer to simulate removing the feature with the highest VIF and recalculating the VIFs for the remaining features (see the second sketch after this list);
7. Under the `Contributing` section of the documentation on readthedocs.io, it is mentioned that "email" is a way to contact the authors, but I do not see any email address posted anywhere. If creating an issue on the GitHub repo is the main channel, I believe there is no need to mention "email";
8. In the `CONTRIBUTING.md` file, the package is called `LR_assumption_test` instead of `lrasm`, which is a bit confusing; and
9. Last but not least, I like the 3 functions. They are really useful.
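To make item 4 concrete, here is a minimal sketch of what an explicit `alpha` argument and a three-decimal p-value report could look like. The function name, the internal check (correlating absolute residuals with fitted values), and the defaults are illustrative assumptions, not lrasm's actual implementation:

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.linear_model import LinearRegression

def homoscedasticity_test_sketch(X: pd.DataFrame, y: pd.Series, alpha: float = 0.05) -> float:
    """Crude constant-variance check: correlate |residuals| with fitted values (illustrative only)."""
    fitted = LinearRegression().fit(X, y).predict(X)
    residuals = y.to_numpy() - fitted
    r, p_value = stats.pearsonr(fitted, np.abs(residuals))
    verdict = "heteroscedasticity suspected" if p_value < alpha else "no evidence against constant variance"
    # Report the p-value to three decimal places, per the review comment above.
    print(f"correlation = {r:.3f}, p-value = {p_value:.3f}, alpha = {alpha}: {verdict}")
    return p_value

# Example call (using the X and y from the usage example above):
# homoscedasticity_test_sketch(X, y, alpha=0.01)
```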
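And a rough sketch of the future-development idea in item 6, simulating removal of the highest-VIF feature and recomputing the VIFs. The helper name and the use of statsmodels' `variance_inflation_factor` are assumptions for illustration; this is not part of lrasm:

```python
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def iterative_vif_elimination(X: pd.DataFrame, vif_thresh: float = 10.0) -> pd.DataFrame:
    """Drop the highest-VIF feature until every remaining VIF is at or below vif_thresh."""
    features = list(X.columns)
    while True:
        exog = X[features].to_numpy()
        vifs = pd.Series(
            [variance_inflation_factor(exog, i) for i in range(len(features))],
            index=features,
        )
        worst = vifs.idxmax()
        # Stop when nothing exceeds the threshold or only two features remain.
        if vifs[worst] <= vif_thresh or len(features) <= 2:
            return vifs.to_frame(name="VIF")
        # Simulate removing the most collinear feature, then recompute.
        features.remove(worst)

# Example call: iterative_vif_elimination(X, vif_thresh=10)
```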
Submitting Authors:
Package Name: lrasm
One-Line Description of Package: Package for testing linear and multiple linear regression assumptions
Repository Link: https://github.com/UBC-MDS/lrasm
Version submitted: 1.0.0
Editor: TBD
Reviewers:
Archive: TBD
Version accepted: TBD
Description
This package provides functions to quickly and easily test the linear regression assumptions before performing linear regression or multiple linear regression on a specified dataset. It contains three functions: one for checking multicollinearity, one for checking constant variance, and one for checking normality of the residuals.
Scope
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
Our package extracts validity insights about linear regression assumptions from existing dataframes.
Who is the target audience and what are scientific applications of this package?
We are targeting this package at data scientists, students, and working professionals who need to check the validity of their linear regression models.
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
There are no other Python packages that provide the same functionality.
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
P.S. *Have feedback/comments about our review process? Leave a comment here*
Editor and Review Templates
Editor and review templates can be found here