openSUSE / cpanspec

Create openSUSE spec files from cpan tar files

Transform / Normalize perl module versions to rpm-like versions #47

Open perlpunk opened 1 year ago

perlpunk commented 1 year ago

Motivation

Perl modules usually have decimal versions. We just take the version from the module literally and put it into the rpm spec file. That can lead to wrong behaviour. New releases suddenly have lower versions (if you ask rpm); Requires: lines can lead to the wrong result.

For details see below.

I now decided to give that problem another shot.

Migration plan

My idea was to collect all module versions in d:l:p at a given time, and save that in a file in the cpanspec distribution. Let's call it "dlp.tsv". That file would never change.

Everyone generating a spec file for a perl module in d:l:p must then use this file when calling cpanspec.

cpanspec will use an algorithm which you can see in the following gist in next-version.pl:

https://gist.github.com/perlpunk/9a40bfde89e55685f6358dfb76fc3961

e.g.

perl next-version.pl perl-YAML-PP 0.037
new version: 0.37.0
perl next-version.pl perl-YAML-PP 0.035
new version: 0.035

So for all versions that are lower than or equal to the saved version, we use the old format which we get directly from CPAN. For versions newer than the saved one, we normalize to an rpm version. It only gets a bit more complicated for versions that currently have more than 3 decimals, e.g. 1.2023. For those we keep the old format until the major version eventually increases. Otherwise a next version such as 1.2024 would become 1.202.400, which rpm would wrongly consider lower than 1.2023.
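The rule described above can be sketched roughly as follows. This is a Python illustration, not the code from the gist: the function names are made up, it only handles plain decimal versions, and the saved version 0.036 for perl-YAML-PP is a hypothetical snapshot value chosen so that the two example calls above come out the same.

```python
from decimal import Decimal


def normalize(cpan_version: str) -> str:
    """Turn a plain decimal version like '0.037' into dotted form '0.37.0'."""
    major, _, fraction = cpan_version.partition(".")
    # Pad the fraction to whole groups of three digits, at least two groups.
    width = max(6, 3 * ((len(fraction) + 2) // 3))
    fraction = fraction.ljust(width, "0")
    groups = [fraction[i:i + 3] for i in range(0, width, 3)]
    return ".".join([str(int(major))] + [str(int(g)) for g in groups])


def next_version(cpan_version: str, saved_version: str) -> str:
    """Pick the version string for the spec file (hypothetical helper).

    saved_version is the version recorded in the snapshot file.
    """
    # Versions up to and including the saved one keep the literal CPAN format.
    if Decimal(cpan_version) <= Decimal(saved_version):
        return cpan_version
    saved_major, _, saved_fraction = saved_version.partition(".")
    new_major = cpan_version.partition(".")[0]
    # More than 3 decimals in the snapshot: keep the old format until the
    # major version increases.
    if len(saved_fraction) > 3 and int(new_major) == int(saved_major):
        return cpan_version
    return normalize(cpan_version)


# Assumed snapshot entry for perl-YAML-PP: 0.036 (hypothetical value).
print(next_version("0.037", "0.036"))   # 0.37.0
print(next_version("0.035", "0.036"))   # 0.035
print(next_version("1.2024", "1.2023"))  # 1.2024
```

The only state needed per module is the saved version, which is why a single frozen file is enough.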

I will prepare cpanspec for this, clean up devel:languages:perl:autoupdate at some point, generate the dlp.tsv, and release a new cpanspec version.

Now it would be cool if some of you have time to take a look at the code, maybe play with it and see if it makes sense. We would have to keep that tsv file practically forever, but hopefully never have to deal with manually fiddling with modules that changed the number of decimals.

Background

Perl versions work differently than rpm versions.

In Perl, a version is simply a decimal number. Examples of versions and how they would (and should) be consistently translated as rpm versions: (You can do that with version->parse($cpan_version)->normal)

3.14    -> 3.140.0
3.140   -> 3.140.0
3.014   -> 3.14.0
3.001   -> 3.1.0
3.14159 -> 3.141.590
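
To make the mapping above concrete, here is a small Python sketch of the same translation. It is only meant to mimic what version->parse(...)->normal produces for plain decimal versions (without the leading "v"); the function name is made up, and underscores, alpha versions, and already-dotted input are not handled.

```python
def normalize_decimal(cpan_version: str) -> str:
    """Convert a decimal CPAN version such as '3.014' to dotted form '3.14.0'."""
    major, _, fraction = cpan_version.partition(".")
    # Pad the fraction with zeros to whole groups of three digits,
    # and to at least six digits so there are always three components.
    width = max(6, 3 * ((len(fraction) + 2) // 3))
    fraction = fraction.ljust(width, "0")
    groups = [fraction[i:i + 3] for i in range(0, width, 3)]
    # int() drops the leading zeros of each group: '014' -> 14.
    return ".".join([str(int(major))] + [str(int(g)) for g in groups])


for v in ["3.14", "3.140", "3.014", "3.001", "3.14159"]:
    print(f"{v:8}-> {normalize_decimal(v)}")
```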

What we currently do when generating a spec file with cpanspec: We just take the CPAN version literally. So this is what we put into the spec file, and on the right side you see how rpm understands the version:

3.14    -> 3.14.0
3.140   -> 3.140.0
3.014   -> 3.14.0
3.001   -> 3.1.0
3.14159 -> 3.14159.0

A typical situation: the current version of a module is 3.19, and the next version is then 3.2, which mathematically is perfectly correct (3.2 > 3.19).

rpm reads that as 3.19.0 and 3.2.0, so the new version would be lower.

Another module having Requires: Foo > 3.19 would now be unresolvable.
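
The ordering problem can be reproduced with a very simplified model of how rpm compares versions. The Python sketch below assumes versions consisting only of digits and dots and compares the dot-separated segments as integers. Real rpm also handles letters, tildes, carets, epochs and releases, which are ignored here, and the function names are made up.

```python
from typing import List


def rpm_key(version: str) -> List[int]:
    """Split on dots and read each segment as an integer (leading zeros vanish)."""
    return [int(part) for part in version.split(".")]


def rpm_compare(a: str, b: str) -> int:
    """Return -1, 0 or 1, like a classic cmp function."""
    ka, kb = rpm_key(a), rpm_key(b)
    return (ka > kb) - (ka < kb)


# Literal CPAN versions: the new release 3.2 sorts below 3.19.
print(rpm_compare("3.2", "3.19"))         # -1
# Normalized versions: the order is what the author intended.
print(rpm_compare("3.200.0", "3.190.0"))  # 1
# Leading zeros make no difference to this comparison.
print(rpm_compare("3.014", "3.14"))       # 0
```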

Until now, we just manually added a zero, e.g. 3.2 -> 3.20, to make it work, and in the next version the module would use two decimals again anyway. But that's not always the case. For some modules, we added hardcoded (!) exceptions into the cpanspec script.

Now, if we correctly translate the versions, 3.19 would become 3.190.0, and 3.2 would become 3.200.0.

In my plan, only new versions starting from a specific date would be translated into the correct format, so existing requirements would still work. For example, an existing Requires: Foo > 3.19 would stay as it is, so that the requirement can still be satisfied in repositories where the newer version 3.200.0 has not yet landed.

This process should work for all versions having up to 3 decimals. This is the majority in d:l:p, currently 2702 modules.

For modules having more than 3 decimals, like 3.14159, the translation would make newer versions lower, so we have to keep the old decimal format until the module eventually increases its major version. We currently have 316 modules in this category.

Then we have 20 modules that use integers. We can think about just leaving them as they are, because they can't be misinterpreted. A typical example: Perl::Tidy: 20230309

And then we have 125 modules that are already using versions with more than one dot, so we can just normalize them (e.g. 3.014.015 -> 3.14.15). They might have been using decimal versions in the past, so for older versions we will do the same as for all other modules - use the given CPAN version format.
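
The four groups described in this section could be told apart with a check like the following Python sketch. The category labels and function names are made up for illustration, and anything that is not made of digits and dots simply falls into "other".

```python
import re


def classify(version: str) -> str:
    """Sort a CPAN version string into one of the groups described above."""
    if re.fullmatch(r"\d+", version):
        return "integer"            # e.g. 20230309, left as it is
    if re.fullmatch(r"\d+(\.\d+){2,}", version):
        return "dotted"             # more than one dot, e.g. 3.014.015
    match = re.fullmatch(r"\d+\.(\d+)", version)
    if match:
        # Up to 3 decimals can be translated; more than 3 keep the old format.
        return "decimal-short" if len(match.group(1)) <= 3 else "decimal-long"
    return "other"


def normalize_dotted(version: str) -> str:
    """Strip leading zeros from each component: 3.014.015 -> 3.14.15."""
    return ".".join(str(int(part)) for part in version.split("."))


for v in ["20230309", "3.014.015", "3.14", "3.14159"]:
    print(v, classify(v))
print(normalize_dotted("3.014.015"))  # 3.14.15
```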

It is a bit of work, but if I'm correct we don't have to intervene manually when a module changes its number of decimals.

perlpunk commented 1 year ago

I'm currently preparing a new cpanspec version that takes a new command-line parameter --dlp, which will then load the shared file version-snapshot/opensuse-dlp.tsv (either from the git checkout or from /usr/share/cpanspec). I'm currently doing my data munging in this branch: https://github.com/perlpunk/cpanspec/tree/version-fun This includes exports of 02packages.details.txt from CPAN, which are used to determine all current module versions in dlp.

When this is active, every submission/update for devel:languages:perl should use the --dlp switch. I'm also thinking about adding a general option --versionformat (rpm|cpan) for people using cpanspec outside of dlp.

Here's the updated cpanspec using the new --dlp switch: https://build.opensuse.org/package/show/home:tinita:branches:devel:languages:perl/cpanspec

perlpunk commented 1 year ago

I made some submit requests from https://build.opensuse.org/project/show/devel:languages:perl:autoupdate today, but there are a few hopefully simple problems to be solved.