Open charlesxch opened 10 months ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
pyproject.toml
file or elsewhere.Readme file requirements The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md
matching JOSS's requirements with:
Estimated hours spent reviewing: 2
Overall, nicely designed package with especially detailed usage instructions in the vignette. It not only demonstrates how to use the functions, but also is being quite technical especially when it comes to the metrics. Just a few things:
.fit()
method is missing.pyproject.toml
.README.md
needs more badges.README.md
. It is currently lacking demonstration of usage of the functions or a link to the vignette.README.md
):
CONTRIBUTING.md
. This instruction works, but should be reflected in the installation instructions in README.md
as well since using pip to install does not seem to work at the moment.ci-cd.yml
file is missing.pytest-cov
in the pyproject.toml
file.**Note:
Comment: refers to general comments / review feedback.
Fix: refers to a concern for the team to fix (or consider fixing) for this milestone.**
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
Comment: Included in ReadMe.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
Comment: Included in ReadMe.
[ ] Vignette(s) demonstrating major functionality that runs successfully locally.
Comment: Your vignette is excellent – it’s very well written and informative. It’s clear that you put a lot of work into this. Great work!!
Fix: Your latex equations for the R^2 at the end of your vignette did not render properly in ReadTheDocs.
Fix: Cell 6 is quite a large code block for a beginner to understand. Recommend adding a comment before each step to link with the list of steps in the markdown cell above and improve readability.
(ex/ “# 1. Extract categorical features:
categorical_features = ['Brand', 'UsedOrNew', 'Transmission', 'DriveType', 'FuelType', 'BodyType', 'ExteriorColour']...”
Fix: Several spelling errors (recommend passing through an editing tool).
[ ] Function Documentation: for all user-facing functions.
Comment: Excellent use of error statements for exception handling in cross_validate.py.
Fix: cross_validate.py - The code is difficult to read (minimal spacing between lines or comments to explain what the code is doing in each step ex/ exception handling vs. scoring / timing the model).
Fix: regressor.py - docstrings are required for all class methods including init() and fit().
Comment: The comments are well done in the regressor.py script and it is easy to read.
[ ] Examples for all user-facing functions.
Fix: cross_validate.py does not have an example of the function inputs/ outputs in the docstring.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml
file or elsewhere.
Readme file requirements
The package meets the readme requirements below:
The README should include, from top to bottom:
[x] The package name
[ ] Badges for:
Fix: Does not include code coverage – need to add the badge for this.
[x] Docs building (if you have a documentation website),
[ ] A repostatus.org badge,
Fix: Does not include repo status – need to add the badge for this.
Fix: Does not include supported python versions – need to add the badge for this.
Fix: Does not include current package versions – need to add the badge for this.
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
[x] Short description of package goals.
[x] Package installation instructions
[x] Any additional setup required to use the package (authentication tokens, etc.)
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
Fix: Include a usage section in the ReadMe.
[x] Link to your documentation website.
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[x] Citation information
Comment: If any citations are required, make sure to include them.
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:
[x] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[ ] All functions have documentation and associated examples for use
Comment: see above comments in the function documentation and example sections.
[ ] The package is easy to install
Fix: I was not able to install the package - more details below.
[ ] Installation: Installation succeeds as documented.
Fix: Following your installation instructions on the ReadMe throws an error (i.e. pip install ols_regressor doesn’t work). I assume this is because the package is not yet on pypi? Also, there are no commands to install a virtual environment to run the code / install the dependencies.
Fix: No instructions on how to run pytest.
[ ] Functionality: Any functional claims of the software been confirmed.
[ ] Performance: Any performance claims of the software been confirmed.
[ ] Automated tests:
Fix: I couldn’t run the tests, as I couldn’t install the package from the ReadMe instructions. Recommend to double check that your tests are working.
Comment: test scripts have good coverage.
[ ] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
Fix: CI workflow has not been set up in Github actions.
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
A few notable highlights to look at:
[x] Package supports modern versions of Python and not End of life versions.
[x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)
Estimated hours spent reviewing:
2 hours.
Overall, the package is very well done, especially your vignette and test scripts! I have no major concerns, except that I was not able to install the package and run the tests/scripts locally. Let me know if you need help testing this as you work on this week’s milestone. 😊
All suggested fixes are available for your review in the relevant sections above (see Fix:). If you have any questions or concerns, please feel free to reach out!
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
pyproject.toml
file or elsewhere.Readme file requirements The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md
matching JOSS's requirements with:
Estimated hours spent reviewing:
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
pyproject.toml
file or elsewhere.Readme file requirements The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
Estimated hours spent reviewing: 1 hour 45 minutes
The package is exceptionally well-made, with its functions, vignettes, and test scripts standing out. The effort put into developing the functions is evident. Additionally, the package's overall design is very well done. The vignette is particularly informative, offering comprehensive usage instructions and effectively highlighting the features' functionality with a focus on technical metrics. However, there are some areas that could benefit from further attention:
The file lacks an example for the cross_validate function. It would be helpful to include this for completeness. Also, the file is missing the docstring for the fit function.
The vignette is very well done, as it clearly demonstrates the functions' capabilities and their practical real-world applications. The definitions/explanations of different feature types are particularly helpful for beginners. However, the final section of the vignette needs revision since the LaTeX equations for R^2 are not displaying correctly on ReadTheDocs. Additionally, a thorough review of the text for any grammatical or spelling errors is advisable.
While I successfully installed the package, the installation instructions could be more beginner-friendly. It would be beneficial to start with guidance on cloning the repository using Git, followed by steps for installing Poetry, and ending with the pip install process. Further, adding a usage section in your README would enhance clarity and provide better guidance.
After installing the package and running Pytest, although all tests passed, a warning message appeared. This may be due to the lack of instructions for setting up a virtual environment and installing necessary dependencies. Including a section on dependencies in the README would be advantageous.
Hi @jy1909 . Thank you for your review. We made the several changes based on your suggestions.
- The docstring for the
.fit()
method is missing.
We added the docstring to the method. (https://github.com/UBC-MDS/OLS_regressor/issues/67)
- I recommend adding examples to the docstrings.
All docstrings now contain examples to illustrate. (https://github.com/UBC-MDS/OLS_regressor/issues/68)
- I would recommend including metadata for all authors in
pyproject.toml
.
All author names are included in the toml
file right now. (https://github.com/UBC-MDS/OLS_regressor/issues/73)
- According to the checklist,
README.md
needs more badges.
We have more badges righe now in the README.md
. (https://github.com/UBC-MDS/OLS_regressor/issues/70)
- I would recommend adding a usage section in in
README.md
. It is currently lacking demonstration of usage of the functions or a link to the vignette.
A usage section is added, with some simple illustration on the functions we defined. (https://github.com/UBC-MDS/OLS_regressor/issues/66)
I am encountering errors when pip installing package (following the instruction on
README.md
):
- ERROR: Could not find a version that satisfies the requirement ols_regressor (from versions: none)
- ERROR: No matching distribution found for ols_regressor
- I see that you have the installation instruction of using poetry in
CONTRIBUTING.md
. This instruction works, but should be reflected in the installation instructions inREADME.md
as well since using pip to install does not seem to work at the moment.
The error is bucause we did not publish the package onto PyPI at the time you were tring do intall. It is now available.
- Please set up CI-CD. The
ci-cd.yml
file is missing.
Done with it. (https://github.com/UBC-MDS/OLS_regressor/issues/65)
- I would recommend test for code coverage if this has not been done already since I do not see
pytest-cov
in thepyproject.toml
file.
It is now added to the toml
file. (https://github.com/UBC-MDS/OLS_regressor/issues/74)
Hi @AlysenTownsley . Thank you for your review. We made the several changes based on your suggestions.
Fix: Your latex equations for the R^2 at the end of your vignette did not render properly in ReadTheDocs.
We removed the latex in the notebook. It seems that the Sphinx cannot read our latex.
Fix: Cell 6 is quite a large code block for a beginner to understand. Recommend adding a comment before each step to link with the list of steps in the markdown cell above and improve readability.
Some break-down comments are added. (https://github.com/UBC-MDS/OLS_regressor/issues/80)
Fix: Several spelling errors (recommend passing through an editing tool).
We passed all words to Grammarly and then fix the spelling errors. (https://github.com/UBC-MDS/OLS_regressor/issues/79)
Fix: cross_validate.py - The code is difficult to read (minimal spacing between lines or comments to explain what the code is doing in each step ex/ exception handling vs. scoring / timing the model).
More space and comments are added to the file. (https://github.com/UBC-MDS/OLS_regressor/issues/69)
Fix: regressor.py - docstrings are required for all class methods including init() and fit(). Fix: cross_validate.py does not have an example of the function inputs/ outputs in the docstring.
Docstrings and examples are added. (https://github.com/UBC-MDS/OLS_regressor/issues/67)
Fix: Does not include code coverage – need to add the badge for this. Fix: Does not include repo status – need to add the badge for this. Fix: Does not include supported python versions – need to add the badge for this. Fix: Does not include current package versions – need to add the badge for this.
Badges are added. (https://github.com/UBC-MDS/OLS_regressor/issues/70)
Fix: Include a usage section in the ReadMe.
A usage section is added, with some simple illustration on the functions we defined. (UBC-MDS/OLS_regressor#66)
Fix: I was not able to install the package - more details below. Fix: I couldn’t run the tests, as I couldn’t install the package from the ReadMe instructions. Recommend to double check that your tests are working.
The error is bucause we did not publish the package onto PyPI at the time you were tring do intall. It is now available.
Fix: No instructions on how to run pytest.
Done with it. (https://github.com/UBC-MDS/OLS_regressor/issues/81)
Fix: CI workflow has not been set up in Github actions.
Done with it. (UBC-MDS/OLS_regressor#65)
Hi @weiranzhao97 . Thank you for your review. We made the several changes based on your suggestions.
It is preferred to also include example of usage code in README
A usage section is added, with some simple illustration on the functions we defined. (https://github.com/UBC-MDS/OLS_regressor/issues/66)
The example document is easy to understand and in detail. Personally It would be better to simplify the titles as name of the functions (e.g. Scoring() rather than 'Scoring the Fitted Model'; Predict() instead of 'Predicting with the Fitted Model'). So that I can get to the corresponding func that I have problem with instead of looking through the whole example.
Thank you for your thoughtful suggestion on simplifying the titles in our example document. We've opted to retain the more descriptive titles to support a story-telling approach that enriches the learning experience, believing it provides clear and comprehensive guidance. However, we appreciate your input and will consider ways to enhance navigability in future updates to accommodate quicker reference needs.
It is preferred to include a one-line docstring for test functions.
The docstrings are added now. (https://github.com/UBC-MDS/OLS_regressor/issues/83)
Hi @zgarciaj . Thank you for your review. We made the several changes based on your suggestions.
The file lacks an example for the cross_validate function. It would be helpful to include this for completeness. Also, the file is missing the docstring for the fit function.
Docstrings and examples are added to the functions. (https://github.com/UBC-MDS/OLS_regressor/issues/67, https://github.com/UBC-MDS/OLS_regressor/issues/68)
The vignette is very well done, as it clearly demonstrates the functions' capabilities and their practical real-world applications. The definitions/explanations of different feature types are particularly helpful for beginners. However, the final section of the vignette needs revision since the LaTeX equations for R^2 are not displaying correctly on ReadTheDocs. Additionally, a thorough review of the text for any grammatical or spelling errors is advisable.
We removed the latex in the notebook. It seems that the Sphinx cannot read our latex. We passed all words to Grammarly and then fix the spelling errors. (https://github.com/UBC-MDS/OLS_regressor/issues/79)
While I successfully installed the package, the installation instructions could be more beginner-friendly. It would be beneficial to start with guidance on cloning the repository using Git, followed by steps for installing Poetry, and ending with the pip install process. Further, adding a usage section in your README would enhance clarity and provide better guidance.
We did not publish our package to PyPI at the time of review, making it a bit complicated to do the package installation. Right now the package is available through pip insall
so we believe that the installation is much simpler. Besides, a usage section is added, with some simple illustration on the functions we defined. (https://github.com/UBC-MDS/OLS_regressor/issues/66)
After installing the package and running Pytest, although all tests passed, a warning message appeared. This may be due to the lack of instructions for setting up a virtual environment and installing necessary dependencies. Including a section on dependencies in the README would be advantageous.
We are still working on this issues. We will keep updating our progress with you.
Submitting Author: Charles Xu (@charlesxch) All current maintainers: Yimeng Xia, Sifan Zhang, Waleed Mahmood(@yimengxia, @sifanzzz, @WaleedMahmood1) Package Name: OLS Regressor One-Line Description of Package: The OLS Regression Package is a Python library designed to streamline the process of performing Ordinary Least Squares (OLS) regression analysis. Repository Link: https://github.com/UBC-MDS/OLS_regressor Version submitted: 0.3.0 Editor: Tiffany Timbers Reviewer 1: Weiran Zhao Reviewer 2: Alysen Townsley
Reviewer 3: Jerry Yu Reviewer 4: Zeily Garcia
Archive: TBD JOSS DOI: TBD Version accepted: TBD Date accepted (month/day/year): TBD
Code of Conduct & Commitment to Maintain Package
Description
Scope
Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
Domain Specific & Community Partnerships
Community Partnerships
If your package is associated with an existing community please check below:
For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
Who is the target audience and what are scientific applications of this package?
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or
@tag
the editor you contacted:Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication Options
JOSS Checks
- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Confirm each of the following by checking the box.
Please fill out our survey
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
The editor template can be found here.
The review template can be found here.