BurntSushi / nfldb

A library to manage and update NFL data in a relational database.
The Unlicense

Bulk Upload 2017-2019 nfldb-update #314

Closed: seanpdwyer7 closed this issue 4 years ago

seanpdwyer7 commented 4 years ago

Hi,

Does anybody know how to start a bulk update to backfill data with nfldb-update? It is only updating the current week.

Thanks!

ochawkeye commented 4 years ago

Try python c:\Python27\Scripts\nfldb-update --update-schedules

C:\Users\Ben>python c:\Python27\Scripts\nfldb-update --help
usage: nfldb-update [-h] [--interval INTERVAL]
                    [--player-interval PLAYER_INTERVAL] [--update-schedules]
                    [--batch-size BATCH_SIZE]
                    [--simulate SIMULATE [SIMULATE ...]]

Updates the nfldb database. It may be run at any frequency, or it may be run
in the background with --background.

optional arguments:
  -h, --help            show this help message and exit
  --interval INTERVAL   When set, nfldb-update will check for active games and
                        update the database every N seconds, where N is the
                        interval given. You should NOT specify an interval
                        smaller than 15 seconds, since NFL.com's JSON feed is
                        updated approximately every 15 seconds. (default:
                        None)
  --player-interval PLAYER_INTERVAL
                        The number of seconds between player meta data
                        updates. A longer interval is needed since meta data
                        does not change frequently and because each update
                        requires a few dozen HTTP requests to NFL.com.
                        (default: 43200)
  --update-schedules    When set, ALL game schedules are refreshed from the
                        data in nflgame. (In normal operation, only the
                        current week's schedule is refreshed.) (default:
                        False)
  --batch-size BATCH_SIZE
                        The number of games to batch before sending data to
                        the database. In normal operation, this setting will
                        not have much effect and so should be kept low (to
                        keep memory requirements low). It is most useful when
                        updating a large amount of data.e.g., A batch size of
                        150 seems to work well when building the database from
                        scratch. (default: 5)
  --simulate SIMULATE [SIMULATE ...]

seanpdwyer7 commented 4 years ago

When I run that, it doesn't return any historical data in the JSON file. Maybe I'm doing it wrong?

[Screenshot: Screen Shot 2019-10-31 at 7 49 32 PM]

ochawkeye commented 4 years ago

Try force-updating your nflgame schedule?

python nflgame-update-schedule --rebuild

c:\Python27\Scripts>python nflgame-update-schedule --help
usage: nflgame-update-schedule [-h] [--json-update-file JSON_UPDATE_FILE]
                               [--rebuild] [--year YEAR]
                               [--phase {PRE,REG,POST}] [--week WEEK]

Updates nflgame's schedule to correspond to the latest information.

optional arguments:
  -h, --help            show this help message and exit
  --json-update-file JSON_UPDATE_FILE
                        When set, the file provided will be updated in place
                        with new schedule data from NFL.com. If this option is
                        not set, then the "schedule.json" file that comes with
                        nflgame will be updated instead. (default: None)
  --rebuild             When set, the entire schedule will be rebuilt.
                        (default: False)
  --year YEAR           Force the update to a specific year. (default: None)
  --phase {PRE,REG,POST}
                        Force the update to a specific phase. (default: None)
  --week WEEK           Force the update to a specific week. (default: None)

c:\Python27\Scripts>
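
If it is unclear whether the rebuild actually picked up older seasons, one quick check is to count the games per year in the schedule file. The sketch below is only that: it assumes schedule.json parses to a dict with a "games" list of [gsis_id, info] pairs where each info has a "year" key, and the helper name `games_per_year` and the sample entries are made up for illustration.

```python
from collections import Counter


def games_per_year(schedule):
    """Count scheduled games per season in a parsed schedule.json dict.

    Assumed layout (not confirmed in this thread):
    {"games": [[gsis_id, {"year": 2017, ...}], ...]}
    """
    return Counter(info["year"] for _gsis_id, info in schedule.get("games", []))


# Illustrative sample only; real data would come from json.load() on
# the schedule.json file that ships with nflgame.
sample = {
    "games": [
        ["2017090700", {"year": 2017, "week": 1}],
        ["2018090600", {"year": 2018, "week": 1}],
    ]
}
print(sorted(games_per_year(sample).items()))
```

A season that is missing from the counts would suggest the schedule itself was never backfilled, in which case nfldb-update has nothing historical to load.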

seanpdwyer7 commented 4 years ago

Figured it out. I ran the script ~/nflgame/scripts/update_sched.py with --year 201X, then ran nfldb-update. The schedule updated and it sent the batch files to Postgres.
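
For anyone landing here later, the sequence could be sketched roughly as below. This is a dry run that only builds and prints the command lines; it is not a tested recipe. The --year, --update-schedules and --batch-size flags come from the help output earlier in the thread, but the helper name `backfill_commands`, the script path, and the choice to pass --update-schedules and --batch-size 150 are assumptions (the fix above only mentions plain nfldb-update).

```python
def backfill_commands(years, sched_script="update_sched.py", batch_size=150):
    """Build command lines for a schedule refresh followed by a database update."""
    commands = []
    for year in years:
        # One schedule refresh per season via the --year flag.
        commands.append(["python", sched_script, "--year", str(year)])
    # A single nfldb-update pass once all schedules are refreshed.
    commands.append(
        ["nfldb-update", "--update-schedules", "--batch-size", str(batch_size)]
    )
    return commands


# Dry run for the 2017-2019 seasons: print the commands instead of running them.
for cmd in backfill_commands(range(2017, 2020)):
    print(" ".join(cmd))
```

To actually execute the commands, each list could be passed to subprocess.check_call, with `sched_script` pointed at wherever update_sched.py lives on your machine.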