conda-incubator / conda-recipe-manager

A project for libraries and automated tools that manage and manipulate conda recipe files.
BSD 3-Clause "New" or "Revised" License

Develop a CRM script to scan feedstocks for dependency/import name mismatches #181

Open schuylermartin45 opened 1 month ago

schuylermartin45 commented 1 month ago

Credit goes to @cbouss for suggesting this.

We could use CRM to develop a script that scans a large set of feedstock repositories to find cases where the import name does not match the conda package name.

Example: the pillow library uses PIL as the import name.

Such a mapping could greatly increase the accuracy of the newly introduced PythonDependencyScanner class (see #180).
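
A minimal sketch of the comparison step such a script might perform, assuming the scan has already produced a mapping from import names to the conda packages that provide them. The function name `find_name_mismatches`, the normalization rule, and the hand-written sample data are illustrative assumptions; none of this is CRM's actual API.

```python
def _normalize(name: str) -> str:
    """Lower-case a name and treat '_' and '-' as equivalent (an assumed rule)."""
    return name.lower().replace("_", "-")


def find_name_mismatches(import_to_packages: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the imports whose top-level name matches none of their providing packages.

    `import_to_packages` is a hypothetical input shape: import name -> package names.
    """
    mismatches: dict[str, set[str]] = {}
    for import_name, packages in import_to_packages.items():
        # Only the top-level module matters, e.g. "PIL.Image" -> "PIL".
        top_level = _normalize(import_name.split(".")[0])
        if packages and all(_normalize(pkg) != top_level for pkg in packages):
            mismatches[import_name] = set(packages)
    return mismatches


# Hand-written sample data, not the result of scanning real feedstocks.
sample = {
    "PIL": {"pillow"},
    "requests": {"requests"},
    "yaml": {"pyyaml"},
}
mismatched = find_name_mismatches(sample)
```

Here `mismatched` contains the `PIL` and `yaml` entries, while `requests` is dropped because the import and package names agree.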

schuylermartin45 commented 1 month ago

This script has been started, but it is unclear if it is worth pursuing when this database exists: https://github.com/regro/cf-graph-countyfair/tree/master/import_to_pkg_maps

schuylermartin45 commented 1 month ago

~From my conversations in the last few weeks, this is still useful. At the very least, we need some more POC material before we can rule it out.~ Scratch the previous statement. This tool is probably still useful, but I think it is more important that we parse the data from conda-forge's cf-graph-countyfair repo.

schuylermartin45 commented 1 month ago

After looking into conda-pypi, I discovered that the CF import mapping data is actually published to a JSON file that is much easier to parse and use.

So this ticket is pivoting to developing a script that fetches this data and caches it in a format that CRM can easily use.
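
A rough sketch of what a fetch-and-cache helper could look like, using only the standard library. Everything here is an assumption for illustration: the function name `load_import_map`, the one-day cache lifetime, and the idea that the published file is a single JSON document. The actual URL and schema of the CF data are not specified in this ticket, so the URL is left as a caller-supplied parameter.

```python
import json
import time
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def _download(url: str) -> bytes:
    """Fetch the raw bytes at `url` over HTTP(S)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def load_import_map(
    url: str,
    cache_path: str,
    max_age_seconds: float = 24 * 60 * 60,  # assumed cache lifetime
    fetch: Callable[[str], bytes] = _download,
) -> dict:
    """Return the JSON mapping at `url`, reusing a local copy while it is fresh."""
    cache_file = Path(cache_path)
    if cache_file.exists():
        age = time.time() - cache_file.stat().st_mtime
        if age < max_age_seconds:
            return json.loads(cache_file.read_text(encoding="utf-8"))

    # Cache is missing or stale: download, then rewrite the local copy.
    data = json.loads(fetch(url).decode("utf-8"))
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data
```

The `fetch` parameter exists so the network call can be swapped out in tests or replaced later by an API client; a caller would pass the real data URL and a cache location of their choosing.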

schuylermartin45 commented 1 week ago

This is blocked by either figuring out the licensing details from legal OR by waiting for (or building) an API in the conda-forge-metadata package. Caching a file derived from the API and committing it to the repo probably also needs a request from legal.