python-distro / distro

A much more elaborate replacement for removed Python's `platform.linux_distribution()` method
https://distro.readthedocs.io/
Apache License 2.0
265 stars 66 forks source link

Add Debian Testing to the tests #356

Closed harshula closed 1 year ago

harshula commented 1 year ago

Debian Testing's version metadata is a different format to the Debian Release version metadata. I hope having this in your test suite will be helpful.

I found that function version() in v1.6.0 of distro was returning:

(Pdb) p versions
['', 'n/a', '', '', '', '']

in https://github.com/spack/spack/ v0.19.1. Spack was then taking the first element of "n/a" (i.e. "n") and using it at as the version!

HorlogeSkynet commented 1 year ago

Hi @harshula ! Thanks for your PR.

Could the lsb_release script you add be simpler (as the ones already present among other distros tests, i.e. Debian 8) ? 🙂

hartwork commented 1 year ago

@harshula I would like to second what @HorlogeSkynet is asking. We need a test dummy here, not a real lsb_release implementation, unfortunately.

harshula commented 1 year ago

Hi @HorlogeSkynet & @hartwork Apologies, I've pushed the commit again with the correction.

HorlogeSkynet commented 1 year ago

Thanks for the update @harshula !

I wonder whether we should test/fix the n/a case (see OP) here too. WDYT @hartwork ? 🙂


EDIT : maybe something like (pretty naive) :

         if best:
             # This algorithm uses the last version in priority order that has
             # the best precision. If the versions are not in conflict, that
             # does not matter; otherwise, using the last one instead of the
             # first one might be considered a surprise.
             for v in versions:
-             if v.count(".") > version.count(".") or version == "":
+             if v.count(".") > version.count(".") or version.casefold() in ("", "n/a"):
                     version = v
         else:
             for v in versions:
-                 if v != "":
+                 if v.casefold() not in ("", "n/a"):
                     version = v
                     break
harshula commented 1 year ago

Hi @HorlogeSkynet, Your patch results in:

(Pdb) p distname, version
('debian', 'bookworm/sid')

Perhaps, consider having a filter/mapping function that stops characters like "/", ";", etc being returned?

Without the patch:

(Pdb) p distname, version
('debian', 'n/a')

Thanks!

HorlogeSkynet commented 1 year ago

Perhaps, consider having a filter/mapping function that stops characters like "/", ";", etc being returned?

I get your point but I fear this would end up in a "cat and mouse" issue... N/A literal is pretty explicit though. Let's wait for @hartwork inputs on this 😉

hartwork commented 1 year ago

@HorlogeSkynet @harshula given that for Debian bullseye version is "11"â€Ķ

# lsb_release -a 2>/dev/null | grep Release
Release:        11

# distro -j | grep -w version
    "version": "11",

â€ĶI believe that "n/a" is better here than "bookworm" or "bookworm/sid" because that is the codename of the (major) version, not a version. If master is serving n/a for a version then maybe that's good and users can check for that value to know they have sid if Debian. What do you think?

harshula commented 1 year ago

Hi @hartwork , Debian Testing does not have an integer version. The most accurate way to describe the current Debian Testing is that it is a mixture of the next release ("bookworm") and unstable ("sid"). Therefore, "bookworm/sid" is the most accurate option.

Once Debian 12 (bookworm) is released, Debian Testing is most accurately described as "trixie/sid".

hartwork commented 1 year ago

@harshula I didn't think of testing earlier but I'm getting identical cat /etc/os-release output from Debian containers for testing and unstable now. I guess that means that testing and unstable cannot be told apart without looking into /etc/apt/sources.list*?

harshula commented 1 year ago

Hi @hartwork , Debian's behaviour, w.r.t. Testing and Unstable, apparently changed around September 2022. Prior to that, my understanding is that distro.py was returning "testing" on Debian Testing. I'm now trying to understand what happened.

harshula commented 1 year ago

Hi @hartwork, The previous lsb_release (https://sources.debian.org/src/lsb/11.1.0/lsb_release/) written in Python outputs:

$ ./lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux bookworm/sid
Release:    testing
Codename:   bookworm
harshula commented 1 year ago

Background: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1008735

harshula commented 1 year ago

I guess that means that testing and unstable cannot be told apart without looking into /etc/apt/sources.list*?

There's a guess_release_from_apt() function in the old version.

hartwork commented 1 year ago

@harshula I think it's worth pointing out that both the old and the new version have two rather similar Python scripts:

Old

New

The call to guess_release_from_apt seems to be in both versions but only on the .py file side of things.

However that's not the code used by the lsb_release command on Debian, that would be package lsb-release rather than lsb. The lsb version says Release: testing but lsb_release says Release: n/a. To reproduce inside docker run --rm -it debian:testing bash -i:

# apt-get update && apt-get install --yes wget ca-certificates python3 distro-info-data
# wget https://sources.debian.org/data/main/l/lsb/11.1.0/lsb_release
# wget https://sources.debian.org/data/main/l/lsb/11.1.0/lsb_release.py
# python3 ./lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux bookworm/sid
Release:        testing
Codename:       bookworm
# python3 ./lsb_release.py
{'ID': 'Debian', 'OS': 'GNU/Linux', 'DESCRIPTION': 'Debian GNU/Linux bookworm/sid', 'RELEASE': 'testing', 'CODENAME': 'bookworm'}
[]
# rm lsb_release lsb_release.py
# wget https://sources.debian.org/data/main/l/lsb/11.6/lsb_release.py
# wget https://sources.debian.org/data/main/l/lsb/11.6/lsb_release
# python3 ./lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux bookworm/sid
Release:        testing
Codename:       bookworm
# python3 ./lsb_release.py
{'ID': 'Debian', 'OS': 'GNU/Linux', 'DESCRIPTION': 'Debian GNU/Linux bookworm/sid', 'RELEASE': 'testing', 'CODENAME': 'bookworm'}
[]
# rm lsb_release lsb_release.py
# wget https://salsa.debian.org/gioele/lsb-release-minimal/-/raw/master/lsb_release
# bash ./lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux bookworm/sid
Release:        n/a
Codename:       bookworm

What do you think?

harshula commented 1 year ago

I think it's worth pointing out that both the old and the new version have two rather similar Python scripts:

Hi @hartwork , No, the new version is a shell script: https://sources.debian.org/src/lsb-release-minimal/12.0-1/lsb_release/ . The shell script does not attempt to infer from apt.

hartwork commented 1 year ago

@harshula that matches what I said above. We're on the same page then :smiley:

harshula commented 1 year ago

What do you think?

What's your specific question?

hartwork commented 1 year ago

@harshula I was trying to ask for suggestions, for your view, for things I maybe was misunderstanding or missing. My own current view is that this may need a fix in Debian not distro. What is your view?

harshula commented 1 year ago

Hi @hartwork,

  1. This PR was opened to add Debian Testing metadata to the test suite.

  2. Function version() in distro.py v1.8.0 infers the following:

    (Pdb) p versions
    ['', 'n/a', '', '', '', '', 'bookworm/sid']

        Then it chooses "n/a" over "bookworm/sid". That is a bug in distro.py, in my opinion.

  3. It appears Debian has introduced regressions when it moved from the old Python-based lsb_release script to the new Shell-based lsb_release script.

hartwork commented 1 year ago

Then it chooses "n/a" over "bookworm/sid". That is a bug in distro.py, in my opinion.

@harshula I think that's where I'm not in agreement so far. Why is it a bug in distro if lsb_release -a also says "n/a"?

harshula commented 1 year ago

Why is it a bug in distro if lsb_release -a also says "n/a"?

@hartwork, distro.py has heuristics to collect a set of possible version values:

        versions = [
            self.os_release_attr("version_id"),
            self.lsb_release_attr("release"),
            self.distro_release_attr("version_id"),
            self._parse_distro_release_content(self.os_release_attr("pretty_name")).get(
                "version_id", ""
            ),
            self._parse_distro_release_content(
                self.lsb_release_attr("description")
            ).get("version_id", ""),
            self.uname_attr("release"),
        ]
...
        elif self.id() == "debian" or "debian" in self.like().split():
            # On Debian-like, add debian_version file content to candidates list.
            versions.append(self._debian_version)

The reason to use this technique is to NOT rely on a single method.

In this discussion, you appear to be giving undue weight to one single method, lsb_release, and ignoring another method that provides a more accurate result. Why?

harshula commented 1 year ago

Hi @hartwork , More generally, libraries should be clear about expected return values. Ideally, a library should protect users by preventing unexpected return values. I suspect a user of this library would not be expecting "n/a" has a valid return value.

hartwork commented 1 year ago

@harshula my view is this:

What do you think?

harshula commented 1 year ago

Hi @hartwork,

  • If lsb_release -a doesn't contain the word "testing" and nothing better than "n/a" then distro should not say "testing" and it's the job of lsb_release to do better for Debian in general.

'Strawman' argument. No one requested this.

What do you think?

I should have opened this PR and not spent time engaging in the tangential discussion, particularly when the relevant points are simply ignored: https://github.com/python-distro/distro/pull/356#issuecomment-1467286630 https://github.com/python-distro/distro/pull/356#issuecomment-1467288895

Let me know if you need anything for the actual PR.

HorlogeSkynet commented 1 year ago

👍 for merging this PR as-is. The idea was not to start a debate about an upstream behavior change. Maybe we could :

1) copy the different pointers included here in a separate issue dedicated to the subject 2) join an upstream discussion on the Debian's bug tracker

Bye 👋

hartwork commented 1 year ago

@harshula alright, this is not going well. I'm not in with fixing the wrong thing just because it's easy. @HorlogeSkynet discussing upstream behavior is exactly what this needs. And discussing @harshula's recent tone. I'm out, @HorlogeSkynet go ahead as you please.

HorlogeSkynet commented 1 year ago

To continue (close ?) the discussion about n/a, I read the Debian bug (until now I hadn't, sorry), and it's definitely an upstream regression. Actually, I even experienced it some months ago but quickly worked-around that as, eventually, testing and sid are unstable.

For the following reasons, I am with @hartwork on this issue :

My two cents.

Thanks both of you for your time, see you 👋

harshula commented 1 year ago

Hi @HorlogeSkynet ,

Thanks for merging the test case! I appreciate that you were trying to be helpful when you started the tangential discussion. There are reasons why I didn't open an Issue to work-around the Debian Regression.

  • Despite n/a is explicit (as I said earlier), I don't think this would be a good idea to maintain a list of "known tokens" (what if other rolling distributions end up using unset, unknown, TBD, or whatever else). I hope this does not seem a "strawman argument", but you will get my point ;

Unfortunately, that's what libraries sometimes have to do to protect their users. No, that is not a "strawman argument". The other participant in the PR created a fictitious proposition, i.e. returning "testing", and then argued against it. That is considered very poor etiquette.

  • "power to the user" ?

That explains the philosophical chasm! My approach would be to avoid the user shooting themselves in the foot. :wink: If I have time, I'll open an issue on the tangential discussion.

Thanks! :wave:

hartwork commented 1 year ago

Hi @HorlogeSkynet ,

Thanks for merging the test case! I appreciate that you were trying to be helpful when you started the tangential discussion. There are reasons why I didn't open an Issue to work-around the Debian Regression.

  • Despite n/a is explicit (as I said earlier), I don't think this would be a good idea to maintain a list of "known tokens" (what if other rolling distributions end up using unset, unknown, TBD, or whatever else). I hope this does not seem a "strawman argument", but you will get my point ;

Unfortunately, that's what libraries sometimes have to do to protect their users. No, that is not a "strawman argument". The other participant in the PR created a fictitious proposition, i.e. returning "testing", and then argued against it. That is considered very poor etiquette.

  • "power to the user" ?

That explains the philosophical chasm! My approach would be to avoid the user shooting themselves in the foot. wink If I have time, I'll open an issue on the tangential discussion.

Thanks! wave

Just to have this on record if it gets edited.