Closed by crflynn 3 years ago
The downloads table appears to have stopped updating. However, there is another table called file_downloads which appears to have the same schema and partitioning and has been updating.
Looks like file_downloads is newer and stable: https://twitter.com/sethmlarson/status/1347236470688542721
Currently (slowly) backfilling data since Jan 4.
We're caught up, at least since Jan 4. Based on the lack of weekly cadence in older data it looks like the data producer has been broken/unreliable for a while. :(
According to linehaul maintainers it looks like the newer implementation is better and the data is at least somewhat recoverable if upstream processing fails.
Thanks crflynn!
It looks like the Google BigQuery table for downloads is missing partitions for the following dates
This corresponds to the recent missing data shown on pypistats.org visualizations here: downloads
So this must be an issue upstream related to either
Will do some research upstream...
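For anyone who wants to check for missing partitions themselves, here is a minimal sketch of the gap check. It assumes you have already collected the set of dates that do have data (for example, from a query like the one in the string constant, which is not executed here). The table path, the `timestamp` column name, and the `missing_dates` helper are all assumptions for illustration, not taken from the pypistats code.

```python
from datetime import date, timedelta
from typing import Iterable, List

# Assumed query shape for listing days that have data. The table path is a
# placeholder and the `timestamp` column name is an assumption; adjust both
# to the real table before running it against BigQuery.
PARTITION_DATES_SQL = """
SELECT DISTINCT DATE(timestamp) AS day
FROM `some-project.some_dataset.file_downloads`
WHERE DATE(timestamp) BETWEEN @start AND @end
"""


def missing_dates(present: Iterable[date], start: date, end: date) -> List[date]:
    """Return every date in [start, end] that is not in `present`, in order."""
    seen = set(present)
    n_days = (end - start).days + 1  # zero or negative if end < start
    days = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in days if d not in seen]


if __name__ == "__main__":
    # Hypothetical example: data exists for Jan 4 and Jan 6 only.
    have = [date(2021, 1, 4), date(2021, 1, 6)]
    for d in missing_dates(have, date(2021, 1, 4), date(2021, 1, 7)):
        print(d.isoformat())
```

The same list of gaps could also drive a backfill loop, one day at a time.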