crflynn / pypistats.org

PyPI downloads analytics dashboard
https://pypistats.org/

Missing data since ~2021-03-22 #34

Closed crflynn closed 3 years ago

crflynn commented 3 years ago

It looks like the Google BigQuery downloads table is missing partitions for the following dates

This corresponds to the recent missing data shown on pypistats.org visualizations here: downloads

So this must be an issue upstream related to either

Will do some research upstream...
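
For anyone wanting to check for gaps like this themselves, one approach is to compare the dates that do have partitions against a full calendar range. This is a minimal sketch, assuming the partition dates have already been fetched from BigQuery into a Python list. The function name `missing_dates` and the sample dates are made up for illustration and are not the actual missing partitions.

```python
from datetime import date, timedelta


def missing_dates(present, start, end):
    """Return the dates in [start, end] (inclusive) that are absent from `present`.

    `present` is any iterable of datetime.date values, e.g. the partition
    dates of a table. How those dates are fetched is left out on purpose.
    """
    have = set(present)
    total_days = (end - start).days
    # Walk the calendar one day at a time and keep the days with no partition.
    return [
        start + timedelta(days=offset)
        for offset in range(total_days + 1)
        if start + timedelta(days=offset) not in have
    ]


if __name__ == "__main__":
    # Illustrative values only: pretend partitions exist for these three days.
    partitions = [date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 5)]
    for day in missing_dates(partitions, date(2000, 1, 1), date(2000, 1, 5)):
        print(day.isoformat())
```

Keeping the gap check separate from the query means it can be run against either table's partition list without changes.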

crflynn commented 3 years ago

The downloads table appears to have stopped updating. However, there is another table called file_downloads, which appears to have the same schema and partitioning and has been updating.

crflynn commented 3 years ago

Looks like file_downloads is newer and stable: https://twitter.com/sethmlarson/status/1347236470688542721

crflynn commented 3 years ago

Currently (slowly) backfilling data since Jan 4.

crflynn commented 3 years ago

We're caught up, at least since Jan 4. Based on the lack of weekly cadence in older data, it looks like the data producer has been broken/unreliable for a while. :(

According to the linehaul maintainers, the newer implementation is better, and the data is at least somewhat recoverable if upstream processing fails.

jewettaij commented 3 years ago

Thanks crflynn!