jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License
1.26k stars 333 forks source link

Update lahman for changes to directory structure #251

Closed MichaelGFagan closed 2 years ago

MichaelGFagan commented 2 years ago

This PR updates pybaseball.__init__.py and pybaseball.lahman.py to comply with changes made to the recent release of the Lahman/Baseball-Databank database.

schorrm commented 2 years ago

Thank you so much for the PR! Two quick questions:

  1. What's the difference between team as it was and team_core() and teams_upstream?
  2. Should it map to one with a deprecation warning?
MichaelGFagan commented 2 years ago

Thank you so much for the PR! Two quick questions:

  1. What's the difference between team as it was and team_core() and teams_upstream?
  2. Should it map to one with a deprecation warning?

The commit message on the upstream side is: "Added 2021 manually-managed teams data."

Their readme says this about the new directory structure:

Organisation of the files There are three directories in the repository.

core/contains the databank itself. These files are automatically produced from our larger dataset. contrib/ contains files which are manually maintained by others using the same identifier system as the core. We bundle these for the convenience of the community. upstream/ contains files used to construct the databank.

And there's one other note here:

Files in core/ are all generated by scripts. As such they are not edited manually (and therefore pull requests should not be submitted against these files).

Files in upstream/ are manually-maintained files which contain information specific to constructing the Databank. As they are maintained manually, it is valid to submit pull requests containing corrections or additions to these files.

I don't see a "right" way to answer this. Leaving just the core version would maintain continuity; providing both would give options to end users.

schorrm commented 2 years ago

Then why don't we have team() map (w/o warning, I guess no need to deprecate) to team_core, and include team_upstream as well?

MichaelGFagan commented 2 years ago

@schorrm Sorry for the delay. My wife and I just had our first child! Anyway, made the update you requested!

schorrm commented 2 years ago

Mazel Tov! Thank you