conda-forge / conda-forge-metadata

Programmatic access to conda-forge's metadata
BSD 3-Clause "New" or "Revised" License

Request: Expose the `import` to `conda package` name mapping table in a more cost-effective way. #55

Open schuylermartin45 opened 1 month ago

schuylermartin45 commented 1 month ago

I have a need in conda-recipe-manager to acquire a mapping of import names to conda package names.

Currently, this project supports querying the API for a single string look-up: https://github.com/conda-forge/conda-forge-metadata/blob/18af7dc39d37ccdc5dc3a13f91ffcddfd7cee36d/conda_forge_metadata/autotick_bot/import_to_pkg.py#L104

For what we plan to do with conda-recipe-manager, this could easily turn into a very large number of API requests a day (at least in the thousands). For data that seems to change infrequently, that seems excessive and costly.

I would much rather keep a local cache of this mapping data and periodically update it through a new API in this project.
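A minimal sketch of the local-cache idea above, assuming the mapping can be represented as a JSON object of import names to lists of package names. The function name `load_import_mapping`, the `fetch` callable, and the one-day default refresh interval are all hypothetical; `fetch` stands in for whatever bulk API this project might eventually expose.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, List

# Assumed shape: import name -> list of conda package names.
Mapping = Dict[str, List[str]]


def load_import_mapping(
    cache_path: Path,
    fetch: Callable[[], Mapping],
    max_age_seconds: float = 24 * 60 * 60,
) -> Mapping:
    """Return the cached mapping, refreshing it when missing or stale.

    `fetch` is a hypothetical placeholder for the requested bulk API;
    it is only called when the cache file is absent or too old.
    """
    if cache_path.exists():
        age = time.time() - cache_path.stat().st_mtime
        if age < max_age_seconds:
            return json.loads(cache_path.read_text(encoding="utf-8"))

    mapping = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a crash cannot leave a
    # half-written cache behind.
    tmp_path = cache_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(mapping), encoding="utf-8")
    tmp_path.replace(cache_path)
    return mapping
```

With this shape, thousands of look-ups a day become dictionary accesses against the local file, and the remote service is contacted at most once per refresh interval.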

The current size of the JSON file containing this mapping in cf-countyfair is ~800 KB. I have gotten that down to about 300 KB by removing some redundant fields. If we expect this list to grow significantly, the new query may need to be built with pagination in mind.
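If pagination does turn out to be necessary, one possible approach is a key-based cursor over the sorted import names. The sketch below is only an illustration: `get_page`, `iter_pages`, the `after` cursor parameter, and the mapping shape are assumptions, not an existing API in this project.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple

# Assumed shape: import name -> list of conda package names.
Mapping = Dict[str, List[str]]


def get_page(
    mapping: Mapping, page_size: int, after: Optional[str] = None
) -> Tuple[Mapping, Optional[str]]:
    """Return one page of entries plus a cursor for the next page.

    Entries are ordered by import name. `after` is the last key the
    client has already seen; the returned cursor is None on the last page.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    keys = sorted(mapping)
    start = bisect.bisect_right(keys, after) if after is not None else 0
    page_keys = keys[start : start + page_size]
    has_more = start + page_size < len(keys)
    next_cursor = page_keys[-1] if has_more else None
    return {key: mapping[key] for key in page_keys}, next_cursor


def iter_pages(mapping: Mapping, page_size: int) -> Iterator[Mapping]:
    """Client-side loop: follow cursors until the server reports no more pages."""
    cursor: Optional[str] = None
    while True:
        page, cursor = get_page(mapping, page_size, after=cursor)
        yield page
        if cursor is None:
            break
```

A key-based cursor stays stable when entries are added between requests, which an offset-based scheme does not guarantee. At the current file size, a single unpaginated download may well be enough.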

See this conversation for additional context: https://github.com/conda-incubator/conda-recipe-manager/pull/218

tl;dr I would like to request a new API endpoint that exposes the import mapping data currently available in cf-countyfair.

maresb commented 1 month ago

Regarding the caching, I just added some logic to conda-lock for caching the lookup table.

https://github.com/conda/conda-lock/blob/main/conda_lock/lookup_cache.py
https://github.com/conda/conda-lock/blob/main/tests/test_lookup_cache.py

In practice, with the download being so tiny, I'm not sure how worthwhile this is, but I'd be happy to add it to conda-recipe-manager.

schuylermartin45 commented 1 month ago

I'm less concerned about the size and more concerned about redundancy and how frequently I will need to hit the API.

The fact that it is so small and doesn't get updated often makes me think a local cache is very reasonable.

beckermr commented 1 month ago

Feel free to use a local cache, but conda-recipe-manager needs to pull its cache updates using an API in this package (which we need to add, of course).