package-url / purl-spec

A minimal specification for purl aka. a package "mostly universal" URL, join the discussion at https://gitter.im/package-url/Lobby
https://github.com/package-url/purl-spec

What do purls do about different artifacts for the same version? #11

Closed tgamblin closed 6 years ago

tgamblin commented 6 years ago

This is somewhat related to #10 and #5.

Often, package maintainers re-release a new tarball for the same version and overwrite the old version at the same URL. Yes this is bad, but it happens in real life.

How should purls handle this? Should this just remain ambiguous? Should a (specific type of?) archive digest be included as a qualifier to disambiguate? e.g., should we standardize ?sha256=blahblah? Example: suppose the first release of foo version 1.0.0 has a vulnerability, but the developers push a new tarball after a few days without bumping the version number?

Seems like there should at least be a section in the spec for how this is handled (or not handled).

Note: there are other ways you could get different artifacts for a single version, e.g. some packages have different release archives for different platforms. I assume qualifiers will be used to disambiguate in that case.

pombredanne commented 6 years ago

@tgamblin you wrote:

Often, package maintainers re-release a new tarball for the same version and overwrite the old version at the same URL. Yes this is bad, but it happens in real life.

ARGH! This is a clear violation of the prime directive of package management and this makes my heart bleed :cry:

Now you are right, these goofs happen. In these cases, as you suggested, the only way out is likely to use some checksum as a qualifier. There is a provision for this for the generic type, as in:

generic:openssl@1.1.10g?download_url=https://openssl.org/source/openssl-1.1.0g.tar.gz&checksum=sha256:de4d501267da3931090

This qualifier can be used by any type, as detailed at https://github.com/package-url/purl-spec/tree/a21a9ecbab171fe06d1ac88843bc3721a42dbed0#known-qualifiers-keyvalue-pairs

... which segues into your #10 ticket, I guess.
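To make the checksum idea concrete, here is a minimal Python sketch of how a tool might check a downloaded artifact against a purl's `checksum` qualifier. It is only an illustration: `parse_qualifiers` and `matches_checksum` are hypothetical helper names, the purl strings follow the draft syntax used in this thread, and the sketch assumes the qualifier value is a comma-separated list of `algorithm:hex` entries.

```python
import hashlib
from urllib.parse import parse_qsl


def parse_qualifiers(purl: str) -> dict:
    """Return the qualifiers of a purl string as a dict (empty if none)."""
    # Everything after the first '?' is the qualifier string; drop any '#subpath'.
    _, _, query = purl.partition("?")
    query = query.partition("#")[0]
    return dict(parse_qsl(query))


def matches_checksum(purl: str, data: bytes) -> bool:
    """Check artifact bytes against the purl's checksum qualifier, if present."""
    checksum = parse_qualifiers(purl).get("checksum")
    if checksum is None:
        # No checksum qualifier: the purl does not pin a particular artifact.
        return True
    # Assumption: the value is a comma-separated list of 'algorithm:hex' entries.
    for entry in checksum.split(","):
        algorithm, _, expected = entry.partition(":")
        actual = hashlib.new(algorithm, data).hexdigest()
        if actual != expected.lower():
            return False
    return True


# Hypothetical package "foo": the first tarball matches, a re-release does not.
first_release = b"foo-1.0.0 tarball"
purl = (
    "generic:foo@1.0.0"
    "?download_url=https://example.org/foo-1.0.0.tar.gz"
    "&checksum=sha256:" + hashlib.sha256(first_release).hexdigest()
)
print(matches_checksum(purl, first_release))
print(matches_checksum(purl, b"re-released tarball"))
```

A purl without the qualifier is treated as matching any artifact here, which reflects the ambiguity discussed above rather than resolving it.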

Seems like there should at least be a section in the spec for how this is handled (or not handled).

Would you care to submit a PR or a FAQ entry to explain this use case clearly? https://github.com/package-url/purl-spec/wiki/FAQ

there are other ways you could get different artifacts for a single version, e.g. some packages have different release archives for different platforms. I assume qualifiers will be used to disambiguate in that case.

Yes, that is the whole purpose of qualifiers, including, god forbid, being able to support the Python environment markers if needed, as in this correct yet ugly lxml file at https://pypi.python.org/pypi/lxml/4.1.1:

lxml-4.1.1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl

I guess in this case using cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64 as a single qualifier might be better than breaking it down into its pieces.

BTW, this is a pending item in the current draft, in the "pypi for Python packages" type section at https://github.com/package-url/purl-spec/tree/a21a9ecbab171fe06d1ac88843bc3721a42dbed0#known-purl-types :

TBD: we could specify a format qualifiers key to specify a package format with values of egg, wheel, sdist, exe or maybe a file extension?

TBD: we could specify a markers qualifiers key to specify PEP 508 environment markers, but this is extra complexity. See https://www.python.org/dev/peps/pep-0508/
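As a rough illustration of the "single qualifier" option, the Python sketch below turns a wheel filename into a purl that keeps the whole tag string in one qualifier. Everything here is an assumption for discussion: `wheel_to_purl` is a hypothetical helper, `tags` is a made-up qualifier key that the spec does not define, `format=wheel` follows the TBD item above, and the output uses the draft purl syntax seen in this thread.

```python
def wheel_to_purl(filename: str) -> str:
    """Build a purl for a wheel, keeping its tag triple as one qualifier."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    # Wheel names look like name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout: " + filename)
    name, version = parts[0], parts[1]
    # The last three fields are the python, abi and platform tags.
    tags = "-".join(parts[-3:])
    # 'tags' is a hypothetical qualifier key; 'format' comes from the TBD item.
    return f"pypi:{name.lower()}@{version}?format=wheel&tags={tags}"


print(wheel_to_purl(
    "lxml-4.1.1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel."
    "macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl"
))
```

Keeping the tags as one opaque value avoids having to define separate qualifier keys for the Python, ABI and platform parts, at the cost of a long and unstructured value.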

jackfirth commented 6 years ago

Often, package maintainers re-release a new tarball for the same version and overwrite the old version at the same URL. Yes this is bad, but it happens in real life.

ARGH! This is a clear violation of the prime directive of package management and this makes my heart bleed 😢 Now you are right, these goofs happen.

There are valid reasons for this to happen intentionally:

And I'm sure there's plenty of other use cases not thought of yet. In general, URLs should avoid promising anything about the permanence of retrieved representations because URLs are forever. The only time a URL should mandate a particular representation be returned is with some sort of digest or hash included in the URL - nothing else is reliable.

pombredanne commented 6 years ago

And I'm sure there's plenty of other use cases not thought of yet. In general, URLs should avoid promising anything about the permanence of retrieved representations because URLs are forever. The only time a URL should mandate a particular representation be returned is with some sort of digest or hash included in the URL - nothing else is reliable.

So I think we are covered alright now: a checksum qualifier can be included if needed, but purl makes no promise beyond this.

We can then deal with repackaging with the same exact version, as may happen in spack for private "personal" builds like those that @tgamblin was coerced into supporting.

Is there anything that would not be covered then?

tgamblin commented 6 years ago

Is there anything that would not be covered then?

Seems reasonable to me.

pombredanne commented 6 years ago

@tgamblin do you want to close this then?

tgamblin commented 6 years ago

Done!