UBC-MDS / software-review-2024

0 stars 0 forks source link

Group 18 - OLS Regressor #21

Open charlesxch opened 10 months ago

charlesxch commented 10 months ago

Submitting Author: Charles Xu (@charlesxch) All current maintainers: Yimeng Xia, Sifan Zhang, Waleed Mahmood(@yimengxia, @sifanzzz, @WaleedMahmood1) Package Name: OLS Regressor One-Line Description of Package: The OLS Regression Package is a Python library designed to streamline the process of performing Ordinary Least Squares (OLS) regression analysis. Repository Link: https://github.com/UBC-MDS/OLS_regressor Version submitted: 0.3.0 Editor: Tiffany Timbers Reviewer 1: Weiran Zhao Reviewer 2: Alysen Townsley
Reviewer 3: Jerry Yu Reviewer 4: Zeily Garcia
Archive: TBD JOSS DOI: TBD Version accepted: TBD Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

Description

Scope

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an existing community please check below:

[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication Options

JOSS Checks - [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

Confirm each of the following by checking the box.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

jy1909 commented 10 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 2


Review Comments

Overall, nicely designed package with especially detailed usage instructions in the vignette. It not only demonstrates how to use the functions, but also is being quite technical especially when it comes to the metrics. Just a few things:

AlysenTownsley commented 10 months ago

Package Review

**Note:

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

(ex/ “# 1. Extract categorical features:

categorical_features = ['Brand', 'UsedOrNew', 'Transmission', 'DriveType', 'FuelType', 'BodyType', 'ExteriorColour']...”

Readme file requirements

The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.

Package structure should follow general community best-practices. In general please consider whether:

Functionality

Final approval (post-review)

Estimated hours spent reviewing:

2 hours.

Review Comments

Overall, the package is very well done, especially your vignette and test scripts! I have no major concerns, except that I was not able to install the package and run the tests/scripts locally. Let me know if you need help testing this as you work on this week’s milestone. 😊

All suggested fixes are available for your review in the relevant sections above (see Fix:). If you have any questions or concerns, please feel free to reach out!

weiranzhao97 commented 10 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing:


Review Comments

  1. Could include example at the end of docstring of functions for clarity and guidance
  2. There is lack of instructions over installation and testing in README
  3. It is preferred to also include example of usage code in README
  4. The example document is easy to understand and in detail. Personally It would be better to simplify the titles as name of the functions (e.g. Scoring() rather than 'Scoring the Fitted Model'; Predict() instead of 'Predicting with the Fitted Model'). So that I can get to the corresponding func that I have problem with instead of looking through the whole example.
  5. It is preferred to include a one-line docstring for test functions.
zgarciaj commented 10 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 1 hour 45 minutes


Review Comments

The package is exceptionally well-made, with its functions, vignettes, and test scripts standing out. The effort put into developing the functions is evident. Additionally, the package's overall design is very well done. The vignette is particularly informative, offering comprehensive usage instructions and effectively highlighting the features' functionality with a focus on technical metrics. However, there are some areas that could benefit from further attention:

  1. The file lacks an example for the cross_validate function. It would be helpful to include this for completeness. Also, the file is missing the docstring for the fit function.

  2. The vignette is very well done, as it clearly demonstrates the functions' capabilities and their practical real-world applications. The definitions/explanations of different feature types are particularly helpful for beginners. However, the final section of the vignette needs revision since the LaTeX equations for R^2 are not displaying correctly on ReadTheDocs. Additionally, a thorough review of the text for any grammatical or spelling errors is advisable.

  3. While I successfully installed the package, the installation instructions could be more beginner-friendly. It would be beneficial to start with guidance on cloning the repository using Git, followed by steps for installing Poetry, and ending with the pip install process. Further, adding a usage section in your README would enhance clarity and provide better guidance.

  4. After installing the package and running Pytest, although all tests passed, a warning message appeared. This may be due to the lack of instructions for setting up a virtual environment and installing necessary dependencies. Including a section on dependencies in the README would be advantageous.

charlesxch commented 9 months ago

Hi @jy1909 . Thank you for your review. We made the several changes based on your suggestions.

  • The docstring for the .fit() method is missing.

We added the docstring to the method. (https://github.com/UBC-MDS/OLS_regressor/issues/67)

  • I recommend adding examples to the docstrings.

All docstrings now contain examples to illustrate. (https://github.com/UBC-MDS/OLS_regressor/issues/68)

  • I would recommend including metadata for all authors in pyproject.toml.

All author names are included in the toml file right now. (https://github.com/UBC-MDS/OLS_regressor/issues/73)

  • According to the checklist, README.md needs more badges.

We have more badges righe now in the README.md. (https://github.com/UBC-MDS/OLS_regressor/issues/70)

  • I would recommend adding a usage section in inREADME.md. It is currently lacking demonstration of usage of the functions or a link to the vignette.

A usage section is added, with some simple illustration on the functions we defined. (https://github.com/UBC-MDS/OLS_regressor/issues/66)

  • I am encountering errors when pip installing package (following the instruction on README.md):

    • ERROR: Could not find a version that satisfies the requirement ols_regressor (from versions: none)
    • ERROR: No matching distribution found for ols_regressor
    • I see that you have the installation instruction of using poetry in CONTRIBUTING.md. This instruction works, but should be reflected in the installation instructions in README.md as well since using pip to install does not seem to work at the moment.

The error is bucause we did not publish the package onto PyPI at the time you were tring do intall. It is now available.

  • Please set up CI-CD. The ci-cd.yml file is missing.

Done with it. (https://github.com/UBC-MDS/OLS_regressor/issues/65)

  • I would recommend test for code coverage if this has not been done already since I do not see pytest-cov in the pyproject.toml file.

It is now added to the toml file. (https://github.com/UBC-MDS/OLS_regressor/issues/74)

charlesxch commented 9 months ago

Hi @AlysenTownsley . Thank you for your review. We made the several changes based on your suggestions.

Fix: Your latex equations for the R^2 at the end of your vignette did not render properly in ReadTheDocs.

We removed the latex in the notebook. It seems that the Sphinx cannot read our latex.

Fix: Cell 6 is quite a large code block for a beginner to understand. Recommend adding a comment before each step to link with the list of steps in the markdown cell above and improve readability.

Some break-down comments are added. (https://github.com/UBC-MDS/OLS_regressor/issues/80)

Fix: Several spelling errors (recommend passing through an editing tool).

We passed all words to Grammarly and then fix the spelling errors. (https://github.com/UBC-MDS/OLS_regressor/issues/79)

Fix: cross_validate.py - The code is difficult to read (minimal spacing between lines or comments to explain what the code is doing in each step ex/ exception handling vs. scoring / timing the model).

More space and comments are added to the file. (https://github.com/UBC-MDS/OLS_regressor/issues/69)

Fix: regressor.py - docstrings are required for all class methods including init() and fit(). Fix: cross_validate.py does not have an example of the function inputs/ outputs in the docstring.

Docstrings and examples are added. (https://github.com/UBC-MDS/OLS_regressor/issues/67)

Fix: Does not include code coverage – need to add the badge for this. Fix: Does not include repo status – need to add the badge for this. Fix: Does not include supported python versions – need to add the badge for this. Fix: Does not include current package versions – need to add the badge for this.

Badges are added. (https://github.com/UBC-MDS/OLS_regressor/issues/70)

Fix: Include a usage section in the ReadMe.

A usage section is added, with some simple illustration on the functions we defined. (UBC-MDS/OLS_regressor#66)

Fix: I was not able to install the package - more details below. Fix: I couldn’t run the tests, as I couldn’t install the package from the ReadMe instructions. Recommend to double check that your tests are working.

The error is bucause we did not publish the package onto PyPI at the time you were tring do intall. It is now available.

Fix: No instructions on how to run pytest.

Done with it. (https://github.com/UBC-MDS/OLS_regressor/issues/81)

Fix: CI workflow has not been set up in Github actions.

Done with it. (UBC-MDS/OLS_regressor#65)

charlesxch commented 9 months ago

Hi @weiranzhao97 . Thank you for your review. We made the several changes based on your suggestions.

It is preferred to also include example of usage code in README

A usage section is added, with some simple illustration on the functions we defined. (https://github.com/UBC-MDS/OLS_regressor/issues/66)

The example document is easy to understand and in detail. Personally It would be better to simplify the titles as name of the functions (e.g. Scoring() rather than 'Scoring the Fitted Model'; Predict() instead of 'Predicting with the Fitted Model'). So that I can get to the corresponding func that I have problem with instead of looking through the whole example.

Thank you for your thoughtful suggestion on simplifying the titles in our example document. We've opted to retain the more descriptive titles to support a story-telling approach that enriches the learning experience, believing it provides clear and comprehensive guidance. However, we appreciate your input and will consider ways to enhance navigability in future updates to accommodate quicker reference needs.

It is preferred to include a one-line docstring for test functions.

The docstrings are added now. (https://github.com/UBC-MDS/OLS_regressor/issues/83)

charlesxch commented 9 months ago

Hi @zgarciaj . Thank you for your review. We made the several changes based on your suggestions.

The file lacks an example for the cross_validate function. It would be helpful to include this for completeness. Also, the file is missing the docstring for the fit function.

Docstrings and examples are added to the functions. (https://github.com/UBC-MDS/OLS_regressor/issues/67, https://github.com/UBC-MDS/OLS_regressor/issues/68)

The vignette is very well done, as it clearly demonstrates the functions' capabilities and their practical real-world applications. The definitions/explanations of different feature types are particularly helpful for beginners. However, the final section of the vignette needs revision since the LaTeX equations for R^2 are not displaying correctly on ReadTheDocs. Additionally, a thorough review of the text for any grammatical or spelling errors is advisable.

We removed the latex in the notebook. It seems that the Sphinx cannot read our latex. We passed all words to Grammarly and then fix the spelling errors. (https://github.com/UBC-MDS/OLS_regressor/issues/79)

While I successfully installed the package, the installation instructions could be more beginner-friendly. It would be beneficial to start with guidance on cloning the repository using Git, followed by steps for installing Poetry, and ending with the pip install process. Further, adding a usage section in your README would enhance clarity and provide better guidance.

We did not publish our package to PyPI at the time of review, making it a bit complicated to do the package installation. Right now the package is available through pip insall so we believe that the installation is much simpler. Besides, a usage section is added, with some simple illustration on the functions we defined. (https://github.com/UBC-MDS/OLS_regressor/issues/66)

After installing the package and running Pytest, although all tests passed, a warning message appeared. This may be due to the lack of instructions for setting up a virtual environment and installing necessary dependencies. Including a section on dependencies in the README would be advantageous.

We are still working on this issues. We will keep updating our progress with you.