UBC-MDS / software-review-2024

0 stars 0 forks source link

Group 16 - Knn model #23

Open billwan96 opened 5 months ago

billwan96 commented 5 months ago

Submitting Author: Bill Wan(@billwan96) All current maintainers: (@zhang-shizhe, @sho-i98, @weiranzhao97) Package Name: pyknnmodel: A KNN model for binary classification Repository Link: https://github.com/UBC-MDS/Group16_knnmodel Version submitted: v2.0.0 Editor: @ttimbers Reviewer 1: Hayley
Reviewer 2: Nasim Reviewer 3: Kittipong Reviewer 4: Jing

Archive: TBD JOSS DOI: TBD Version accepted: TBD Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

Description

Scope

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an existing community please check below:

[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication Options

JOSS Checks - [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

Confirm each of the following by checking the box.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

yhan178 commented 5 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 1.5


Review Comments

Nice work. Your documentation (README, vignettes) in general is clear and easy to follow. The package can be install and run smoothly.

I only have some minor revision suggestions on the documentation.

should be embedded in this section rather than the "Preprocessing" section.

Jing-19 commented 5 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing:


Review Comments

The User Guide is well-written and easy to follow for someone unfamiliar with the KNN classifier. I was able to follow the instructions to download the package and use it locally. The coverage is good and all the tests are passed. I like the icon made for the package, which is very creative! Some minor changes I suggest below to make your user guide flow more smoothly:

  1. There is rich detail in the README which is good for the user to get a full understanding of the package. For an added layer of clarity, consider wrapping each function name within backticks (`) to differentiate them more clearly in the text.
  2. Still in the README, scaling down the logo could refine the visual balance of the guide.
  3. Upon inspecting the root directory, I noticed the scaling.py file is currently located there. It might be more in keeping with conventional project structure to move this file into the src/ directory, to maintain a clean separation of concerns.
  4. The initial cell of the example usage file contains a version print statement (print(pkg_pyknnclassifier.__version__)). It could be beneficial to relocate this to a separate cell to maintain the import cell's focus, or it might be streamlined away if deemed non-essential.
  5. In the example file, referencing the iris dataset towards the end of the example file is a wise choice.
  6. Also in the example file, displaying the head of the dataset in a table format would give readers a quick snapshot of the data structure at a glance.
  7. Additionally, pruning any commented-out code could tidy up the examples, ensuring that each line serves a purpose.
nassimgha commented 5 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing:


Review Comments

Hi, overall it looks great and works without any issues. A few suggestions for improvements though:

  1. On readme file, the installation part, it is better to mention this here as a part of the installation guide that the user needs to clone the repository to their local machine by running: "git clone ... " before running poetry install. Testing section on readme file: Line coverage can also be added here. The example provided on readme file could also include a short demonstration of the results.
  2. When running the usage example, I got an error as follows: "NameError: name 'evaluate' is not defined". Also, we needed to read the toy data from data folder inside the repository, which seems needed, but for the short example, maybe a toy dataset in the code could help avoid any errors regarding reading data from another file.
  3. In vignettes, there are codes that are commented out, specifically the line excluding the id column. I think it is better if you delete the lines that are not needed for better readability.
jokittipong commented 5 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing:


Review Comments

I've got some suggestions on how to improve your package:

  1. There is some mistypo(s) on README.(@Functions: This pacakge...)

  2. The installation instruction doesn't correctly instruct an user to install the package. You skip the step to install poetry inside the conda environment pyknnclassifier, you should add pip install poetry before give us instruction to run poetry install or give instruction like If you dont have poetry installed in your base environment, you can follow the installation guide(with provided link) for poetry.

  3. It would be great if you add on the instruction on how to clone your repository to user's local machine. such as Clone the github repository using: git clone https://github.com/UBC-MDS/Group16_knnmodel.git

  4. No consistency in the code block, some has the $, some has not. In my opinion, you should cut off $ because when an user copies your command from the README, an user doesn't need to erase the $ before running a command line.

  5. Your instruction to run the tests is not provided good details on your tests. In my opinion, it would be better if you provided the tests via poetry: poetry run pytest tests/ --cov=src/pkg_pyknnclassifier/ instead of pytest tests/

  6. The test coverage report shown that there are some code which haven't been covered by the tests on calculate_distance.py (75%)(branch cover 67%), evaluate.py (97%)(branch cover 97%), and scaling.py(73%)(branch cover 71%).

  7. From the Usage example in README, First I can't run the line:

    accuracy_result = evaluate(target, pred, metric='accuracy') Traceback (most recent call last): File "", line 1, in NameError: name 'evaluate' is not defined

As you forgot to provide the import line on evaluate function in the example, so you should add this line in example: from pkg_pyknnclassifier.evaluate import evaluate

  1. In the vignettes (example.ipynb): For the data-loading: in my opinion, You should cut off the "Preprocessing" head line or using different font size , as it may confused an user because you have tried to explain your package function, not the process on data wrangling For the Scaling explanation: it would be great if you provide some briefly on how you scale from the function. For the Prediction: it would be better if you print out the prediction to show us the form of prediction you give from the package, as you did in evaluate.

Anyways, thanks for your hard work!!! Also, I really love how you put the big logo picture at the top and some small logo on each topic. It makes your README look really nice and fun to follow your instruction.

P.S. I have not checked the box about following PEP 8 guidlines as I'm not able to tell and I think you don't linting set up it and it would be nice if you provide a separated link to your read the docs page, even you have posted the badge that can give access to read the docs page.