UBC-MDS / software-review

MDS Software Peer Review of MDS-created packages
1 stars 0 forks source link

Submission: pypuck (Python) #45

Closed jnederlo closed 4 years ago

jnederlo commented 4 years ago

Submitting Author: Jarvis Nederlof (@jnederlo), Xugang Zhong (@chuusan), Polina Romanchenko (@PolinaRomanchenko), Manish Joshi (@ManishPJoshi ) Package Name: pypuck One-Line Description of Package: This package provides wrapper functions to access the publicly available but undocumented NHL.com API. Repository Link: pypuck Version submitted: v1.1.0 Editor: Varada Kolhatkar (@kvarada) Reviewer 1: Ofer Mansour (@ofer-m) Reviewer 2: Manuel Maldonado (@manu2856) Archive: TBD Version accepted: TBD


Description

This package includes four main functions that access various endpoints on the NHL.com and NHL Records API's. In most cases the functions return data frames of the queried data, except for the attendance function, which plots the data. The four main functions are summarized as follows:

  • player_stats(start_date=None, end_date=None):
    • The player_stats() function makes an API call to the player summary endpoint on the NHL.com API. The function returns the top 100 player stats for a given date range as sorted by total points.
  • team_stats(start_season=None, end_season=None):
    • The team_stats() function makes an API call to the team summary endpoint on the NHL.com API. The function returns team seasonal stats for given seasons sorted by total team points.
  • draft_pick(pick_number=None, round_number=None, year=None):
    • The draft_pick(pick_number=None, round_number=None, year=None) function makes an API call to the drafts summary on the NHL.com API. The function returns information about draft picks for the specified arguments and stores them in a pandas data frame.
  • attendance(regular=True, playoffs=True, start_season=None, end_season=None):
    • The attendance() function makes an query to the Attendance API to get the NHL’s seasonal and playoff attendance numbers. The function displays attendance numbers in an Altair chart.

Scope

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.

Our package retrieves data from the NHL.com and NHL Records API, wrangles the data into an appropriate form (pandas data frame and altair plot objects), and extracts relevant insights with which to return to the user.

Our package targets anyone interested in accessing data on the NHL.com API for personal and research purposes abiding by the NHL.com terms of service. It is not intended for commercial use.

There are a few packages that provide functions to access the NHL.com API including Hockey-scraper, nhlscrapi and nhl-score-api. However, most of them focus on accessing play-by-play data. Our package, meanwhile, focuses on providing the data in a usable form so the end user can extract some insight from direct function calls.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication options

JOSS Checks - [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements): "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

Code of conduct

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

ofer-m commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Functionality

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 3


Review Comments

Feedback Suggestions:

Overall, you did an excellent job on the package, and I enjoyed using and reviewing it. If you have any questions regarding my review, please feel free to contact me.

manu2856 commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Functionality

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing:

2.5 hours

Review Comments

Hi, everyone, my comments below: I enjoyed the simplicity, minimalism of the documentation and usage of the function. I had a couple of issues when installing. Messages such as: Could not find a version that satisfies the requirement..., but after installing the dependencies, the module was working fine.

I just have a couple of suggestions:

  1. Regarding pypuck.player_stats. I believe it could be helpful to inform users what is the oldest season that has data. I noticed that it is possible to input very old dates. Maybe it would good to display a warning message, whenever a user is trying to capture a range outside of available data range.
  2. Regarding pypuck.player_stats: I noticed that when inputing atypical season dates such as: pypuck.team_stats(start_season='18891985', end_season='20102011'), the function is still working. How is the function interpreting this input? And do you agree a validation should exist that guarantees startsean/endseason parameters are inputed as expected?
  3. Regarding pypuck.draft_pick: Everything seems to be working fine. Whenever I enter too ancient dates or future dates I get a coherent exception. Good job!
  4. Regarding pypuck.attendance: Everything seems to be working fine. Nice minimalistic visualization. Just as an idea about this function: it would be a nice feature for the user to have access to the data in form of a dataframe. (For the user to do the plot he pleases). What about having a parameter that makes the user decide if he wants a plot or a dataframe an output, also a parameter deciding if the data should be aggregated or shown by game?

Please let me know what to you think about this.

By looking at how good and clean your function works, I can only imagine the data that could be extracted in other sports as well!

jnederlo commented 4 years ago

Thanks for your feedback and suggestions. We went through and addressed most of your suggestions, in-fact we addressed all of the suggestions that didn't introduce new features as those can/should be addressed in future major releases. To specifically summarize what we addressed:

@ofer-m: Addressed:

Did Not Address:

The suggestions that we did not address are good suggestions that we will put on our radar for future releases.

@manu2856: Addressed:

Did Not Address:

You can view these changes as part of release v1.1.2.