aboutcode-org / dejacode

Automate open source license compliance and ensure software supply chain integrity
https://dejacode.readthedocs.io
GNU Affero General Public License v3.0

Do not strip VCS tags leading "V" and do not use a "version_prefix" PURL qualifier #153

Closed pombredanne closed 1 month ago

pombredanne commented 1 month ago

Describe the bug

Do not strip a leading "V" or "v" from VCS tags, as doing so creates weird PURLs that are impossible to use.

See also:

To Reproduce

Steps to reproduce the behavior:

  1. Create a package from https://github.com/elliotchance/orderedmap/archive/refs/tags/v1.6.0.tar.gz
  2. The PURL is pkg:github/elliotchance/orderedmap@1.6.0?version_prefix=v

Expected behavior

The PURL should be pkg:github/elliotchance/orderedmap@v1.6.0. We should reuse tags as-is.
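A minimal sketch of the expected behavior, using only the Python standard library. The helper name `github_archive_url_to_purl` and its regular expression are assumptions made for illustration; this is not the packageurl-python or DejaCode implementation.

```python
import re
from urllib.parse import quote

# Hypothetical pattern for GitHub tag archive URLs (an assumption for this sketch).
GITHUB_TAG_ARCHIVE = re.compile(
    r"^https://github\.com/(?P<namespace>[^/]+)/(?P<name>[^/]+)"
    r"/archive/refs/tags/(?P<tag>.+?)\.(?:tar\.gz|zip)$"
)


def github_archive_url_to_purl(url):
    """Return a github PURL string that reuses the tag as-is, or None."""
    match = GITHUB_TAG_ARCHIVE.match(url)
    if not match:
        return None
    # Percent-encode the tag for the PURL string, but never strip a leading "v".
    version = quote(match.group("tag"), safe="")
    return f"pkg:github/{match.group('namespace')}/{match.group('name')}@{version}"


url = "https://github.com/elliotchance/orderedmap/archive/refs/tags/v1.6.0.tar.gz"
print(github_archive_url_to_purl(url))  # pkg:github/elliotchance/orderedmap@v1.6.0
```

The tag goes into the version field unchanged, so no `version_prefix` qualifier is needed to recover it.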

DennisClark commented 1 month ago

The proposed change in this issue sounds like a good idea to me, but I think it might require some major data upgrades to all the AboutCode projects?

DennisClark commented 1 month ago

I am going to guess that the stripping of prefix literals from a package version was a conscious decision at some point in order to facilitate version range sorting and comparisons by "normalizing" the version to the numeric value only. In hindsight, that may not have been a good idea, but changing/fixing that to do something else might also have an impact on any of the AboutCode tools that work with versioning.

mjherzog commented 1 month ago

We need to determine which types of packages are affected, perhaps starting with the set of Packages in the DejaCode reference data. This seems to be characteristic of GitHub (including the AboutCode repos), with a "v" in the Download URL, and uncommon for many package types like maven, npm or pypi. I suggest two follow-up actions:

pombredanne commented 1 month ago

My point is that when the package version comes from a git tag, its prefix should NEVER be stripped. Tags (and versions in general) should NOT be modified, as this helps with nothing and only impairs portability.

The tags themselves should not be stored in modified form to accommodate either of these concerns. Only DejaCode has this issue AFAIK, so this does not impact much else, though it was introduced in https://github.com/package-url/packageurl-python/blob/8fac7180870c9163cb7a3d6446a7a00150990598/CHANGELOG.rst#0104-2022-10-17

@tdruez we need to find a way to revert that change
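To illustrate the point that sorting and comparison do not require storing a modified tag, here is a rough sketch that derives a sort key on demand. The function name `version_sort_key` is hypothetical, and the digit-only key is a deliberate simplification (it ignores pre-release labels such as "M3"); it is not how any AboutCode tool compares versions.

```python
import re


def version_sort_key(tag):
    """Derive a comparison key from a tag without altering the stored tag."""
    # Collect the runs of digits, e.g. "v1.10.0" -> (1, 10, 0).
    return tuple(int(part) for part in re.findall(r"\d+", tag))


# Tags are stored exactly as published, with or without a leading "v".
stored_tags = ["v1.10.0", "v1.6.0", "1.9.2"]
ordered = sorted(stored_tags, key=version_sort_key)
print(ordered)  # ['v1.6.0', '1.9.2', 'v1.10.0']
```

The stored values keep their original form, and the normalization lives only in the key function used at comparison time.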

DennisClark commented 1 month ago

@tdruez Everything looks good on staging. I tried multiple variants, including a package without release tags, and everything was generated without problems. I also did a number of "round trips" between dataspaces using purl values and that all worked great as well. The re-genned packages are all valid, as far as I can tell. I think it's ready to deploy, and thanks for taking care of this.

DennisClark commented 1 month ago

On the other hand (and you undoubtedly know this already) the issue reported regarding pkg:github/apache/nifi@2.0.0-M3 in issue #149 is still a problem.

pombredanne commented 1 month ago

> On the other hand (and you undoubtedly know this already) the issue reported regarding pkg:github/apache/nifi@2.0.0-M3 in issue #149 is still a problem.

@DennisClark the version should be exactly rel/nifi-2.0.0-M3 because this is the tag value per https://github.com/apache/nifi/releases/tag/rel%2Fnifi-2.0.0-M3

The only way out of this would be to track the actual tagging scheme for each and every package, which sounds like a wild goose chase. Or just have the logic for all packages in one place, which is exactly what https://github.com/nexB/purldb/tree/main/purl2vcs does.
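As a small illustration of why keeping the exact tag matters, the sketch below round-trips the nifi tag through a PURL version. It assumes the tag is percent-encoded when written into the PURL string; the helper names are hypothetical and this does not reproduce what purl2vcs does.

```python
from urllib.parse import quote, unquote


def tag_to_purl_version(tag):
    """Percent-encode a tag for use as a PURL version (assumed encoding)."""
    return quote(tag, safe="")


def purl_version_to_tag(version):
    """Decode a PURL version back to the original tag."""
    return unquote(version)


tag = "rel/nifi-2.0.0-M3"
purl = f"pkg:github/apache/nifi@{tag_to_purl_version(tag)}"
print(purl)  # pkg:github/apache/nifi@rel%2Fnifi-2.0.0-M3

# The exact tag is recoverable, with no per-package tagging scheme to track.
recovered = purl_version_to_tag(purl.rsplit("@", 1)[1])
print(recovered == tag)  # True
```

With a stripped version such as 2.0.0-M3, the original tag could only be recovered by knowing that this project prefixes its tags with "rel/nifi-".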

tdruez commented 1 month ago

Fix merged and deployed.