Open mensfeld opened 6 months ago
The task that ensures consistency was disabled due to poor performance in... 2021 🙃
https://github.com/pypi/warehouse/pull/10256
As far as I can tell it was never re-enabled afterwards, because the contributor never returned to address the issue.
For triage, I have manually run this task. Can you confirm whether the data looks consistent now?
@ewdurbin was all the data synced? That is, should all the historical gaps be filled now?
When I query virtualenv I'm still missing versions 20.25.2+ (anything newer).
Is there any other endpoint to get the recent releases data?
@ewdurbin I'm still not seeing the newer releases of virtualenv in the BigQuery dataset :(
Hmmm, unclear what the issue is. @di are you familiar with why the sync wouldn't capture past releases?
That's not the job that inserts new metadata. That job just syncs missing metadata if insertion fails for some reason.
Insertion of new metadata happens on upload: https://github.com/pypi/warehouse/blob/main/warehouse/forklift/legacy.py#L1222-L1223
The timeline here is suspiciously close to when we did some migrations on these schemas. My guess is that update_bigquery_release_files is failing and we're unaware.
So sync_bigquery_release_files is not the bulk equivalent of update_bigquery_release_files?
It is, but it shouldn't be necessary anymore: metadata should be reliably inserted on upload (but it appears it isn't anymore).
Hm, okay. I ran sync_bigquery_release_files in an attempt to triage, and it seems it didn't bulk load the missing info. This needs some more investigation.
It's probably failing for the same reason the individual job is failing, I would venture a guess!
Running this query:
misses several new versions available here: https://pypi.org/project/virtualenv/#history released in April and May. It's similar for some other packages.
Describe the bug
Info for all versions should be available in BigQuery.
Expected behavior
I would expect them to be available in BQ (allowing for eventual consistency, of course).
To Reproduce
Run in BigQuery:
and see versions are missing.
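For anyone who wants to check another package, here is a rough sketch of how the gap could be measured: list the versions PyPI's JSON API reports and diff them against the versions returned from BigQuery. This is not the original query from this report. The table name `bigquery-public-data.pypi.distribution_metadata` and the `name`/`version` columns are my assumptions about the public dataset, and the helper names are made up for illustration.

```python
from __future__ import annotations

import json
import urllib.request

# Assumption: the public dataset table this thread is about.
BQ_TABLE = "bigquery-public-data.pypi.distribution_metadata"

# Query to run in BigQuery; @project is a named query parameter
# (e.g. "virtualenv"). Column names are assumed, not taken from the thread.
VERSIONS_QUERY = f"""
SELECT DISTINCT version
FROM `{BQ_TABLE}`
WHERE name = @project
"""


def pypi_versions(project: str) -> set[str]:
    """Versions PyPI's JSON API lists for a project, skipping releases with no files."""
    url = f"https://pypi.org/pypi/{project}/json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        data = json.load(resp)
    return {version for version, files in data["releases"].items() if files}


def missing_versions(on_pypi: set[str], in_bigquery: set[str]) -> list[str]:
    """Versions present on PyPI but absent from the BigQuery result (plain string sort)."""
    return sorted(on_pypi - in_bigquery)
```

Usage would be roughly: run `VERSIONS_QUERY` with `project` set to `virtualenv`, collect the returned `version` values into a set, and pass it to `missing_versions(pypi_versions("virtualenv"), bq_versions)`. Anything in the result is a release that has files on PyPI but no row in BigQuery. Note that the name stored in BigQuery may not be normalized the same way as the PyPI project name, so the `WHERE` clause may need adjusting for some packages.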