ecosyste-ms / packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
https://packages.ecosyste.ms
GNU Affero General Public License v3.0

incorrect `repository_url` for `pypi.org/packages/easybuild-easyconfigs` #652

Closed boegel closed 8 months ago

boegel commented 8 months ago

The JSON data for `pypi.org/packages/easybuild-easyconfigs` shows https://github.com/easybuilders/easybuild-easyblocks as the value for `repository_url`, while it should be https://github.com/easybuilders/easybuild-easyconfigs.

It seems like this field is filled in based on a heuristic, because as far as I can tell this info is not directly available from PyPI... It's weird though that the easybuild-easyblocks repo is picked up, since a more obvious guess/match would be https://github.com/easybuilders/easybuild-easyconfigs.

I noticed that `documentation_url` is also wrong (should be https://docs.easybuild.io instead of https://easybuild-easyconfigs.readthedocs.io), but there I understand that it is just an incorrect educated guess, and that we should fix the metadata published through PyPI to resolve this.

andrew commented 8 months ago

I'm falling back to trying to find a repository URL in the project description text, as there isn't a repository URL in the package metadata (it would have used that if one were available).

The heuristic I set up was to find all the GitHub/GitLab/Bitbucket URLs in the description and choose the most frequently used one. In this case, easybuild-easyblocks is linked twice in the description, while easybuild-easyconfigs is only linked once.

This can definitely be improved to take into account exact matches between the package name and the repository slug, thanks for reporting!
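
To make the heuristic and the proposed improvement concrete, here is a minimal Python sketch. It is not the project's actual code: the function name `guess_repository_url`, the regular expression, and the rule "an exact slug match wins over frequency" are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern: owner/repo URLs on the three hosts mentioned above.
REPO_URL = re.compile(
    r"https?://(?:www\.)?(github\.com|gitlab\.com|bitbucket\.org)/([\w.-]+)/([\w.-]+)",
    re.IGNORECASE,
)


def guess_repository_url(package_name, description):
    """Return the most plausible repository URL in `description`, or None."""
    candidates = []
    for host, owner, repo in REPO_URL.findall(description):
        repo = repo.rstrip(".")  # drop sentence punctuation caught by the pattern
        if repo.endswith(".git"):
            repo = repo[:-4]
        candidates.append(f"https://{host.lower()}/{owner}/{repo}")
    if not candidates:
        return None
    # Assumed improvement: a repository slug equal to the package name wins.
    for url in candidates:
        if url.rsplit("/", 1)[-1].lower() == package_name.lower():
            return url
    # Original heuristic: otherwise pick the most frequently linked repository.
    return Counter(candidates).most_common(1)[0][0]
```

With a description that links easybuild-easyblocks twice and easybuild-easyconfigs once, the frequency rule alone would pick easybuild-easyblocks, while the name check returns the easybuild-easyconfigs repository.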

The documentation URL isn't so smart right now: it simply uses "https://#{package.name}.readthedocs.io/" for every Python package. This can also be improved in PyPI's case, as there could well be a Documentation entry in project_urls, for example in django (https://pypi.org/pypi/django/json):

    "project_urls": {
      "Documentation": "https://docs.djangoproject.com/",
      "Funding": "https://www.djangoproject.com/fundraising/",
      "Homepage": "https://www.djangoproject.com/",
      "Release notes": "https://docs.djangoproject.com/en/stable/releases/",
      "Source": "https://github.com/django/django",
      "Tracker": "https://code.djangoproject.com/"
    },
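
A possible shape for that improvement, sketched in Python rather than the project's own code: prefer a Documentation entry from `project_urls` and only fall back to the readthedocs.io guess when none exists. The function name `guess_documentation_url` and the case-insensitive label match are assumptions made for illustration.

```python
def guess_documentation_url(package_name, project_urls):
    """Prefer an explicit Documentation link; otherwise guess a readthedocs.io URL."""
    # project_urls may be missing (None) in the PyPI JSON, so treat it as empty.
    for label, url in (project_urls or {}).items():
        # Assumption: match the label case-insensitively, ignoring surrounding spaces.
        if label.strip().lower() == "documentation" and url:
            return url
    # Fallback equivalent to the existing "https://#{package.name}.readthedocs.io/" guess.
    return f"https://{package_name}.readthedocs.io/"
```

For the django metadata above this returns https://docs.djangoproject.com/, and for a package without `project_urls` it returns the readthedocs.io guess.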

boegel commented 8 months ago

Thanks for clarifying!

I think a "direct guess" based on package name would make more sense, but then you would still need to figure out the GitHub organization of course...

andrew commented 8 months ago

I've improved the description matching in https://github.com/ecosyste-ms/packages/commit/65d60740eb9afc5217d2080c5eda4ff12af97bd5 and resynced easybuild-easyconfigs. It may take a little while for all other PyPI packages to be updated if they are incorrect, though. Do drop a comment if you notice any others and I'll update them too.

andrew commented 8 months ago

I've also improved support for custom PyPI documentation links in https://github.com/ecosyste-ms/packages/commit/1a1e3a05f79fedc5ed7faca0831c233f2f2471cb. Again, it will take some time to update for all PyPI projects.