NuGet / Home

NuGet is missing package revision management #10479

Open gh-andre opened 3 years ago

gh-andre commented 3 years ago

This is a feature request, so no steps, logs, etc.

Most NuGet packages are created not by the software maintainers, but by people who consume those packages or make them available to other projects. This creates a significant problem, because NuGet has no package versioning as distinct from the versioning of the packaged software. NuGet supports only the latter, which makes it impossible to cleanly repackage applications.

If you look at other packaging formats, such as RPM, RPM/RedHat, Debian, APK/Alpine Linux, you will see that there is an upstream version, which is the software version as assigned by the software maintainers, and a package version, which is called release in RPM or debian_revision in Debian and is assigned by package maintainers. This additional revision allows package maintainers to release new packages with the same software version, and it is what NuGet is missing.
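To make the upstream/revision split concrete, here is a minimal Python sketch of how a Debian-style version string of the form `[epoch:]upstream_version[-debian_revision]` separates into its parts. The function and type names are made up for illustration, and the sketch does not validate the allowed character set.

```python
from typing import NamedTuple, Optional


class DebianVersion(NamedTuple):
    epoch: int               # 0 when no "epoch:" prefix is present
    upstream: str            # version assigned by the software maintainers
    revision: Optional[str]  # version assigned by the package maintainer


def split_debian_version(version: str) -> DebianVersion:
    """Split '[epoch:]upstream_version[-debian_revision]' into its parts."""
    epoch = 0
    if ":" in version:
        epoch_text, version = version.split(":", 1)
        epoch = int(epoch_text)
    # The revision follows the LAST hyphen, because the upstream
    # version may contain hyphens of its own.
    upstream, sep, revision = version.rpartition("-")
    if not sep:
        return DebianVersion(epoch, version, None)
    return DebianVersion(epoch, upstream, revision)


print(split_debian_version("1:1.2.11.dfsg-2"))
```

The same upstream version can therefore be shipped repeatedly with revisions 1, 2, 3 and so on, without touching the part that the software maintainers own.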

I've been packaging 3rd-party libraries with NuGet for a while, and more often than not a package needs to be updated throughout the lifetime of the consuming project, but the change is often in how the same software version is integrated into the project, not in the enclosed software version. For example, I may need to change user macros in a package or may move files in packaged directories or even rebuild the same version of the packaged software with new compile flags or even a patch, if the upstream maintainer didn't provide one.

The software version does not change in any of the steps described above; the package does. The only way to handle this today is to use a pre-release version, which produces a nasty warning in the package registry that it's a pre-release version, even though it has nothing to do with a pre-release of the packaged software.

If NuGet supported package versions, this problem would be nicely solved and people could update packages in a predictable way without resorting to pre-release hacks, etc.

For example, using loose Fedora/Debian notation (I'm not suggesting the syntax, only the concept - the actual syntax will take some thinking), where the last dash denotes the package revision, my package file with version 18.1.25 of the enclosed library might look like this:

If the upstream maintainer wanted to make a pre-release, they could continue using the Semantic Versioning, like the following, because the last dash would always mean the package revision:
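As a purely hypothetical illustration of the "last dash" rule (this is not the author's syntax, and the function name and sample strings are invented), a version string could be split in Python like this, with everything before the final `-<digits>` left to Semantic Versioning:

```python
from typing import Optional, Tuple


def split_package_revision(text: str) -> Tuple[str, Optional[int]]:
    """Hypothetical rule: a trailing '-<digits>' is the package revision."""
    version, sep, tail = text.rpartition("-")
    if sep and tail.isascii() and tail.isdigit():
        return version, int(tail)
    # No trailing numeric part: treat the whole string as the software version.
    return text, None


for sample in ("18.1.25-3", "18.1.25-beta.1-3", "18.1.25-beta"):
    print(sample, "->", split_package_revision(sample))
```

One consequence of such a rule is that a purely numeric pre-release identifier at the end (valid in Semantic Versioning) would be read as a package revision, which is part of why the actual syntax would need careful thought.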

You can probably run a quick query against the package repository to see how many owners match authors, and any difference in those will show you how many people are using hacky ways to package software.

I hope this makes sense.

nkolev92 commented 3 years ago

Thanks for filing this issue @gh-andre.

Can you elaborate more on the particular behavior you would expect? Maybe by providing a short practical example? Unfortunately, I'm not sure whether this gets solved by what you are proposing.

> For example, I may need to change user macros in a package or may move files in packaged directories or even rebuild the same version of the packaged software with new compile flags or even a patch, if the upstream maintainer didn't provide one.

Note that, from a package perspective, changing a compiler flag or adding a patch cannot leave it the same version. The code is different, the output assembly would be different, and as such the package must be different.

gh-andre commented 3 years ago

@nkolev92

Thanks for responding.

There are two different points that it seems I didn't manage to convey in the original post. Let me try to rephrase them separately.

Semantic Versioning vs. Package Revision

> Note that, from a package perspective, changing a compiler flag or adding a patch cannot leave it the same version. The code is different, the output assembly would be different, and as such the package must be different.

This actually doesn't align at all with Semantic Versioning, which tracks what the code does functionally, not how it is packaged. You are right, however, that the package must be different, only not in the code version but in the package revision, so that each package is unique and never changes.

For example, if I publish a NuGet package built without the optimization flag, but then decide to release another package with optimized binaries, it will indeed be a new binary, but the code inside is functionally exactly the same. One simply doesn't put in Release Notes that the package was rebuilt with better optimization. Some people bump up the patch level, but they are simply violating the Semantic Versioning guidelines, which define the level of code changes relative to the previous release and have nothing to do with packaging.

This is exactly what package revisions would solve - one can repackage the same code and update the package revision for the same packaged code version. A package revision can be as simple as the numeric sequences used by many Linux package managers, or more complex, like what Conan's package revision hashes and pinning appear to achieve (although I haven't used it).

Packaging Own vs. Other People's Code

Things get more complicated when one packages other people's code.

Some package managers are rarely used for packaging other people's code. For example, I have never seen other people's code repackaged in mainstream NPM usage, so it's probably not as relevant to that package manager.

Other package managers, however, do this all the time. Type zlib in nuget.org and you will see 100+ packages created by people other than those who maintain the zlib library code. How can package maintainers repackage the same upstream code?

Here's a more practical example. I maintain a couple of Open Source projects. One of them uses Berkeley DB, Freetype, MaxMindDB, zlib and libGD. None of these libraries is available in NuGet, so I can either hunt for other people's packages and hope they know how to build them, or build my own, which is what I do. The problem is that, as my own project evolves, I need to repackage other libraries, which all have the same version as before, but I may need to either compile them differently or include some of their extra files in my package. The only way to do this today with NuGet is to use pre-release versions, which violates the spirit of Semantic Versioning and has nothing to do with pre-releases. The only alternative is to produce new package IDs, which is even worse, as there is no package history anymore.

Linux package maintainers routinely package other people's code and they solved this problem long before many modern package managers came along. They maintain a package revision in addition to the upstream code version, which is exactly what I'm describing here - it's a revision of the package, not the code inside.

I hope this makes it a bit more clear.

nkolev92 commented 3 years ago

Thanks for the details. That definitely adds more context to your motivation.

What I'm curious about is how you see your specific proposal working in the context of NuGet. In particular, what would you expect to happen when directly referencing a package, what would you expect to happen when transitively referencing a package, etc.

Is the proposal: "get latest revision for that specific version each time you go online" ?

gh-andre commented 3 years ago

I use Nuget for native packages and cannot speak for how it is used for .Net development. For my usage pattern, which spans commercial and Open Source projects, a simple numeric package revision would work well, including applications with multiple supported released versions (as opposed to website-style development, where only one active version is supported).

Here's how I would see it.

Package name nomenclature will need some thinking to find a good pattern that wouldn't interfere with Semantic Version delimiters. For example, a pre-release version would be separated by a hyphen, so it cannot be used for package revisions. For backward compatibility I would make package revision optional, so a package ending in a version sequence would be assumed to have package revision 1. When specified, it would follow the version and would allow you to keep all package revisions in the same version folder, as you seem to do now online.
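A minimal Python sketch of how an optional numeric revision defaulting to 1 could be ordered and resolved. The `PackageVersion` type and `latest_revision` helper are invented for illustration, and picking the highest revision for a pinned version is only one possible reading of the "latest revision" question raised earlier in the thread, not something the proposal specifies.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class PackageVersion(NamedTuple):
    version: Tuple[int, ...]  # upstream software version, e.g. (1, 2, 11)
    revision: int = 1         # package revision; assumed to be 1 when omitted


def latest_revision(
    available: Iterable[PackageVersion], version: Tuple[int, ...]
) -> Optional[PackageVersion]:
    """Return the highest package revision published for one software version."""
    candidates = [p for p in available if p.version == version]
    # Tuples compare field by field, so with equal versions the revision decides.
    return max(candidates, default=None)


feed = [
    PackageVersion((1, 2, 11)),     # legacy package without an explicit revision
    PackageVersion((1, 2, 11), 3),
    PackageVersion((1, 2, 12), 2),
]
print(latest_revision(feed, (1, 2, 11)))
print(sorted(feed))
```

Because the software version is compared first, a newer upstream release always sorts above any revision of an older one, and a package without an explicit revision sorts as revision 1.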

One last point I want to make is that some people may perceive package revisions as a way to prepare packages for a release and use them as build numbers. That would be a particularly terrible idea. Build numbers are only relevant in preparing packaged code for a release and have nothing to do with package revisions. If this discussion goes anywhere, it might be worth calling it out in documentation.

nkolev92 commented 3 years ago

Thanks for the added detail. Note that the NuGet versioning does support a 4th number in the version due to legacy reasons :)

gh-andre commented 2 years ago

@nkolev92

Sorry, it's one of those posts you call "loaded". Safe to ignore - just reminiscing about your last 4-part version comment.

I was recently automating building some of the Nuget packages I maintain and re-explored various ways to fake package revisions. I found -Version helpful in automating the build process, but wanted to point out that it works only as a surrogate for a package revision and will most likely be abused as described below.

Having -Version was quite useful for keeping the manifest file clean in terms of the actual upstream software version used. For example, for zlib, the version in the manifest would remain 1.2.11, while the package would have the version 1.2.11.5, where 5 would be the package revision we discussed above. This is good.
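A rough Python sketch of that workaround in a build script: the manifest keeps the 3-part upstream version, and the script appends the package revision as the 4th component when it builds the `nuget pack` command line. The helper name and the `zlib.nuspec` file name are assumptions; the sketch only constructs the argument list and does not run anything.

```python
from typing import List


def pack_command(nuspec_path: str, upstream_version: str, revision: int) -> List[str]:
    """Build a 'nuget pack' command that overrides the manifest version."""
    parts = upstream_version.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("expected a 3-part numeric upstream version")
    if revision < 1:
        raise ValueError("package revision must be 1 or greater")
    # The 4th component stands in for the package revision.
    package_version = f"{upstream_version}.{revision}"
    return ["nuget", "pack", nuspec_path, "-Version", package_version]


print(" ".join(pack_command("zlib.nuspec", "1.2.11", 5)))
```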

However, I want to point out that if I take an RPM package for zlib, it will be named something like zlib-devel-1.2.11-19.fc30.i686.rpm, and it's easy to see that the upstream version is 1.2.11 and the package revision is 19. A NuGet package using the 4th version component as a package revision will look like org.zlib.devel.1.2.11.5.nupkg, and there's no way to tell whether 1.2.11.5 is a 4-part software version or a 3-part version combined with a package revision.
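The difference can be seen in a small Python sketch that takes apart the RPM file name above using the usual `name-version-release.arch.rpm` layout. The function name is invented, and the sketch assumes a well-formed file name.

```python
from typing import Tuple


def parse_rpm_filename(filename: str) -> Tuple[str, str, str, str]:
    """Split 'name-version-release.arch.rpm' into its four fields."""
    if not filename.endswith(".rpm"):
        raise ValueError("not an .rpm file name")
    stem = filename[: -len(".rpm")]
    stem, arch = stem.rsplit(".", 1)
    # Version and release are the last two hyphen-separated fields;
    # the package name itself may contain hyphens.
    name, version, release = stem.rsplit("-", 2)
    return name, version, release, arch


print(parse_rpm_filename("zlib-devel-1.2.11-19.fc30.i686.rpm"))
```

Here the release field comes out as `19.fc30`, with the package revision 19 followed by the distribution tag, and it is delimited unambiguously from the upstream version. No comparable delimiter exists inside `1.2.11.5`, so a parser cannot decide whether the trailing 5 belongs to the software or to the package.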

I would also bet that -Version will be abused to put a build number in the 4th component, which defeats the purpose of build numbers as a sequence of builds leading to the final release build. There's a reason Semantic Versioning requires build metadata to be ignored when determining version precedence, and using -Version for build numbers will bypass this restriction.

Specifically, build numbers are not supposed to be individually addressed - they only exist to evaluate whether the latest build is good enough for a release. Being able to install a specific build, which is allowed by 4-part versions, will make testing much harder because of the number of upgrade paths. With proper build numbers, testing NuGet packages requires somebody to completely uninstall the failed build or roll it back to the previously released version (not a previous build).
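A simplified Python sketch of the distinction: under Semantic Versioning, anything after `+` is build metadata and plays no part in precedence, whereas a 4th numeric component is ordered like any other. Both helper names are invented, and the sketch handles only `MAJOR.MINOR.PATCH[+build]` strings, not pre-release identifiers.

```python
from typing import Tuple


def semver_precedence_key(version: str) -> Tuple[int, int, int]:
    """Precedence key for 'MAJOR.MINOR.PATCH[+build]'; build metadata is dropped."""
    core, _, _build = version.partition("+")
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def four_part_key(version: str) -> Tuple[int, ...]:
    """Ordering key for a dotted numeric version; every component counts."""
    return tuple(int(part) for part in version.split("."))


# Two builds of the same release have equal precedence under SemVer...
print(semver_precedence_key("1.2.11+5") == semver_precedence_key("1.2.11+6"))
# ...but become distinct, ordered, installable versions as 4-part numbers.
print(four_part_key("1.2.11.5") < four_part_key("1.2.11.6"))
```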

Anyway, still hope to see dedicated package revisions in future Nuget releases.