chanzuckerberg / napari-hub

Discover, install, and share napari plugins
MIT License

Transition `releases` data off of non-versioned API #608

Open richaagarwal opened 2 years ago

richaagarwal commented 2 years ago

This is a follow-up to https://github.com/chanzuckerberg/napari-hub/issues/598, in which we implemented a hotfix for a breaking change in the PyPI API we query: the `releases` key was removed from the versioned API endpoint we were using. More information here: https://github.com/pypi/warehouse/pull/11775. For the hotfix, we changed the query to hit the non-versioned API endpoint instead, knowing that the field is considered deprecated there as well, though it has yet to be removed.

As this comment notes, there's no current timeline for removing `releases` from the non-versioned API endpoint, but it would be prudent to start thinking about a transition plan. In #598 I outlined these two options:

Option 1: Re-introduce fields by using the recommended simple API instead. This in turn could be broken out into two parts: reintroducing just those two fields, and then later possibly re-working all of `format_plugin` to rely on the simple API. (Ideally these would both be done at once, but depending on how important it is to get back to populating these fields, we could delay the latter work.)

Option 2: If accessing `upload_time_iso_8601` from the `urls` array is a reliable source for the `release_date` (which it appears to be), we may not need to switch APIs at all, and could instead re-work how we handle `first_released`. Ideally, we would only populate `first_released` the first time we grab data for a plugin, in which case it would be the same as `release_date`, with no need to ever get previous version releases in any given request.
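To make Option 2 concrete, here is a minimal sketch of what it might look like. It assumes the PyPI JSON response has already been fetched and parsed into a dict with a `urls` list whose entries carry `upload_time_iso_8601`. The helper names (`release_date_from_urls`, `merge_first_released`) are hypothetical and not taken from the napari-hub codebase, and using the earliest file upload as the release date is an assumption.

```python
from datetime import datetime
from typing import Optional


def parse_upload_time(value: str) -> datetime:
    # PyPI timestamps end in "Z"; older Python versions of
    # datetime.fromisoformat only accept an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def release_date_from_urls(payload: dict) -> Optional[datetime]:
    """Return the earliest upload time among the files in `urls`.

    Assumption: the earliest file upload is a reasonable stand-in
    for the release date of the version described by `payload`.
    """
    times = [
        parse_upload_time(f["upload_time_iso_8601"])
        for f in payload.get("urls", [])
        if f.get("upload_time_iso_8601")
    ]
    return min(times) if times else None


def merge_first_released(
    stored: Optional[datetime], release_date: Optional[datetime]
) -> Optional[datetime]:
    # Only populate first_released the first time a plugin is seen;
    # afterwards the stored value is kept unchanged.
    return stored if stored is not None else release_date
```

The second helper depends on being able to read back a previously stored `first_released` value for each plugin, which is the part that needs persistent per-plugin state.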

It turns out that option 2 is not very straightforward given our current S3 architecture for storing data, so I'd recommend that we revisit this work when we are ready to prioritize moving to a database.

@neuromusic let's connect on this when you're back!

richaagarwal commented 2 years ago

This work may end up addressing the bug reported in #611 as well, as we have a hypothesis that the lag reported there is due to PyPI's non-versioned endpoint taking a while to catch up to the latest release's data.

neuromusic commented 2 years ago

note: in order to support #712, #702, and #703, we'll need to ingest ALL release dates from PyPI
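For illustration, ingesting every release date could look roughly like the sketch below while the non-versioned endpoint still returns `releases`. It assumes the response is already parsed into a dict whose `releases` key maps version strings to lists of file entries with `upload_time_iso_8601`. The function names are hypothetical, and taking the earliest file upload per version as that version's release date is an assumption.

```python
from datetime import datetime
from typing import Dict, Optional


def parse_upload_time(value: str) -> datetime:
    # Convert the trailing "Z" to an explicit offset so that
    # datetime.fromisoformat accepts it on older Python versions.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def release_dates_by_version(payload: dict) -> Dict[str, datetime]:
    """Map each version to the earliest upload time of its files.

    Versions with no files (an empty list) are skipped, since they
    carry no upload time to read.
    """
    dates: Dict[str, datetime] = {}
    for version, files in payload.get("releases", {}).items():
        times = [
            parse_upload_time(f["upload_time_iso_8601"])
            for f in files
            if f.get("upload_time_iso_8601")
        ]
        if times:
            dates[version] = min(times)
    return dates


def first_released(dates: Dict[str, datetime]) -> Optional[datetime]:
    # The first release is the earliest date across all versions.
    return min(dates.values()) if dates else None
```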

richaagarwal commented 1 year ago

This is currently blocked: the simple API may not be an option, since it doesn't support upload time at the moment (see https://github.com/jwodder/pypi-simple/issues/5).