ocaml / opam

opam is a source-based package manager. It supports multiple simultaneous compiler installations, flexible package constraints, and a Git-friendly development workflow.
https://opam.ocaml.org

depexts: Find the optimal scheme for conf-* packages for maximum coverage and non-overlapping #4440

Open kit-ty-kate opened 4 years ago

kit-ty-kate commented 4 years ago

Current state of things in opam-repository

Currently depexts in opam-repository (mostly inside conf-* packages) typically look like this:

depexts: [
  ["libzstd-dev"] {os-family = "debian"}
  ["libzstd-devel"] {os-distribution = "centos"}
  ["libzstd-devel"] {os-distribution = "rhel"}
  ["libzstd-devel"] {os-distribution = "fedora"}
  ["libzstd-devel"] {os-family = "suse"}
  ["zstd-dev"] {os-distribution = "alpine"}
  ["zstd"] {os-distribution = "homebrew" & os = "macos"}
  ["zstd"] {os = "freebsd"}
]

This is problematic if we want to support more than just some specific distributions but also their "derived distributions" (e.g. Manjaro for Archlinux, PopOS for Ubuntu, …)

I've tried using os-family instead of os-distribution for all of them, since that is what opam uses internally anyway to detect the system and launch the commands. For selecting packages, however, some issues arise from that approach, mainly:

1) Some of the main distributions are derived from each other, for instance debian and ubuntu:

  ["libmysqlclient-dev"] {os-distribution = "ubuntu"}
  ["default-libmysqlclient-dev"] {os-distribution = "debian"}

In this example the requested packages have different names. If we were to use os-family here, it would give the right behaviour on ubuntu-derived distributions, but not on ubuntu itself, where the os-family variable would pick up debian. The same thing happens with centos being derived from rhel, yet the set of packages can vary wildly (especially given the sub-repository structure in CentOS, e.g. EPEL, PowerTools, …).

2) To complicate things even further, some distributions go by several names. From https://github.com/ocaml/opam/blob/3ee922d3e621bd0ee7427f33196bd977009131ef/src/state/opamSysInteract.ml for instance, Archlinux can be called arch or archlinux, Oraclelinux can be called ol or oraclelinux, and OpenSUSE can be called suse or opensuse (I remember the last one changing in OpenSUSE 15.x or thereabouts).

These things can be encoded using disjunctions, such as:

  ["libmysqlclient-dev"] {os-distribution = "ubuntu" | os-family = "ubuntu"}
  ["default-libmysqlclient-dev"] {os-distribution != "ubuntu" & os-family = "debian"}
  ["libzstd-devel"] {os-family = "suse" | os-family = "opensuse"}

However, with all these cases this quickly becomes hard to read and maintain.
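
To make the behaviour of those three filters concrete, here is a minimal OCaml sketch that models them as predicates and checks which package names they select on a few systems. The `system` record, `depexts` list and `select` function are hypothetical names made up for this illustration; this is not how opam evaluates filters internally.

```ocaml
(* Hypothetical model of the two variables opam detects on a system. *)
type system = { os_distribution : string; os_family : string }

(* Each entry pairs a system package name with a predicate mirroring one of
   the three filters above. *)
let depexts =
  [ ("libmysqlclient-dev",
     fun s -> s.os_distribution = "ubuntu" || s.os_family = "ubuntu");
    ("default-libmysqlclient-dev",
     fun s -> s.os_distribution <> "ubuntu" && s.os_family = "debian");
    ("libzstd-devel",
     fun s -> s.os_family = "suse" || s.os_family = "opensuse") ]

(* Names of the packages whose filter holds on the given system. *)
let select s =
  List.filter_map
    (fun (pkg, holds) -> if holds s then Some pkg else None)
    depexts

let () =
  List.iter
    (fun s ->
      Printf.printf "%s/%s: %s\n" s.os_distribution s.os_family
        (String.concat ", " (select s)))
    [ { os_distribution = "ubuntu"; os_family = "debian" };
      { os_distribution = "pop"; os_family = "ubuntu" };
      { os_distribution = "debian"; os_family = "debian" } ]
```

Under this model ubuntu and pop both get libmysqlclient-dev and debian gets default-libmysqlclient-dev, but every new special case needs another hand-written conjunct, which is the maintenance problem described here.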

Does anyone have any idea how to keep the clean and easy scheme we have now, but make it work for derived distributions and stay maintainable? Maybe by adding a new variable unique to a distribution and its derivatives? Thoughts?

cc @rjbou

dra27 commented 4 years ago

Current related PR: https://github.com/ocaml/opam/pull/4441/files and some prior issues related to this: https://github.com/ocaml/opam-repository/issues/15044, https://github.com/ocaml/opam-repository/issues/14906

dra27 commented 4 years ago

The problem seems to be that os-family and os-distribution describe a tree, but the values are treated as though there's only one level of derivation. Looking at Mint 20, Ubuntu 20.04 and Debian 10 (I couldn't be bothered to install Debian 11...), we have:

PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian

NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

NAME="Linux Mint"
VERSION="20 (Ulyana)"
ID=linuxmint
ID_LIKE=ubuntu
PRETTY_NAME="Linux Mint 20"
VERSION_ID="20"
VERSION_CODENAME=ulyana
UBUNTU_CODENAME=focal

(see also the documentation for /etc/os-release)
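
As a rough illustration of the tree structure, the following OCaml sketch reads the `ID` and `ID_LIKE` fields out of os-release text and lists a distribution followed by the ones it declares itself similar to. `parse_os_release` and `lineage` are hypothetical helpers written for this example, not opam functions, and the parser is simplified (it only strips surrounding double quotes and skips comment lines).

```ocaml
(* Parse KEY=value lines into an association list. Comment lines and lines
   without '=' are skipped; surrounding double quotes are stripped. *)
let parse_os_release (contents : string) : (string * string) list =
  let unquote v =
    let n = String.length v in
    if n >= 2 && v.[0] = '"' && v.[n - 1] = '"' then String.sub v 1 (n - 2)
    else v
  in
  String.split_on_char '\n' contents
  |> List.filter_map (fun line ->
         let line = String.trim line in
         if line = "" || line.[0] = '#' then None
         else
           match String.index_opt line '=' with
           | None -> None
           | Some i ->
               let key = String.sub line 0 i in
               let value =
                 unquote (String.sub line (i + 1) (String.length line - i - 1))
               in
               Some (key, value))

(* The distribution's own ID, followed by the space-separated ID_LIKE entries. *)
let lineage fields =
  let id = Option.to_list (List.assoc_opt "ID" fields) in
  let like =
    match List.assoc_opt "ID_LIKE" fields with
    | None -> []
    | Some s -> String.split_on_char ' ' s |> List.filter (fun x -> x <> "")
  in
  id @ like

(* Abridged copy of the Linux Mint 20 file shown above. *)
let mint =
  "NAME=\"Linux Mint\"\nID=linuxmint\nID_LIKE=ubuntu\nVERSION_ID=\"20\"\n"

let () =
  print_endline (String.concat " -> " (lineage (parse_os_release mint)))
```

For the Mint file this yields only linuxmint and ubuntu: the file itself does not mention debian, so recovering the full chain would need either Ubuntu's own file or a table kept somewhere else.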

dra27 commented 4 years ago

As a slight tweak on your suggestion of a distribution-specific moniker, I think what's wanted is a variable whose value describes a package management namespace? So you'd have os-packages = "ubuntu" for both Mint (assuming it largely uses Ubuntu package names?) and Ubuntu, os-packages = "debian" for Debian, and os-packages = "fedora" for RHEL/CentOS/Fedora/OpenSUSE?

rjbou commented 4 years ago

related (linux mint & opensuse) #3692

kit-ty-kate commented 4 years ago

As a slight tweak on your suggestion of a distribution-specific moniker, I think what's wanted is a variable whose value describes a package management namespace? So you'd have os-packages = "ubuntu" for both Mint (assuming it largely uses Ubuntu package names?) and Ubuntu, os-packages = "debian" for Debian, and os-packages = "fedora" for RHEL/CentOS/Fedora/OpenSUSE?

Something like that yeah, except RHEL, CentOS, Fedora and OpenSUSE need to be separate as they have different sets of packages (a bit like the distinction between Ubuntu and Debian)

kit-ty-kate commented 4 years ago

If we want to add a new variable, os-packages-set = "ubuntu" or something like that might be a better choice of name.

dra27 commented 4 years ago

Something like that yeah, except RHEL, CentOS, Fedora and OpenSUSE need to be separate as they have different sets of packages (a bit like the distinction between Ubuntu and Debian)

Actually, it'd be interesting (at some point) to get a handle on just how many packages are different across those in opam-repository - the example above has libzstd-devel for them all and I vaguely feel like I've seen that in a lot of conf packages? The only interest from a depext perspective is the names... indeed, it makes me wonder if we'd be better off exposing the package manager (which is inferred internally already) and using the os-distribution, os-family, etc. as exceptions to that. For example:

depexts: [
  ["libzstd-dev"] {os-package-manager = "apt"}
  ["libzstd-devel"] {os-package-manager = "dnf"}
  ["zstd-dev"] {os-package-manager = "pacman"}
  ["zstd"] {os-package-manager = "homebrew"} ## & os = "macos" ??
  ["zstd"] {os-package-manager = "pkg"} ## & os = "freebsd" ??
]

and possibly also:

  ["libmysqlclient-dev"] {os-package-manager = "apt" & os-distribution != "debian"}
  ["default-libmysqlclient-dev"] {os-package-manager = "apt" & os-distribution = "debian"}

But the script to analyse that will take me more time than the margin of the comment permits :wink:

dra27 commented 4 years ago

If we want to add a new variable, os-packages-set = "ubuntu" or something like that might be a better choice of name.

Indeed, although perhaps os-package-set, rather than packages

kit-ty-kate commented 4 years ago

I don't think os-package-manager is any good in this context. Yes, "most" of the packages have similar names when they use the same package manager, but maybe 10-15% of the time that is not the case. Also, some of them are already packaged in the base distribution or something like that, and you don't want to pick the one from the packages. Finally, some package selections depend on the version of the distribution, so you want the name covering a distribution and its derivatives to be matched together with os-version, e.g.:

  ["libmysqlclient-dev"] {os-package-set = "ubuntu" & os-version >= "18.04"}

My point is that for maintenance purposes it is much better to rely on the package set rather than the family (too clumsy), distribution (too many cases) or package manager (too restrictive).

dbuenzli commented 4 years ago

Maybe os-package-repository ? The thing I don't understand is how you are going to derive its value from /etc/os-release, will you hardcode "linuxmint = ubuntu" in opam ?

dbuenzli commented 4 years ago

Also, from a more general perspective, the whole depexts mechanism seems to be a maintenance chore. It may be a good time to ponder again @ygrek's nice idea in https://github.com/ocaml/opam/issues/3140

kit-ty-kate commented 4 years ago

Maybe os-package-repository ?

why not.

The thing I don't understand is how you are going to derive its value from /etc/os-release, will you hardcode "linuxmint = ubuntu" in opam ?

os-package-repository would be a very simple variable defined the following way (or equivalent):

let resolve_alias = function
  | ("ubuntu" | "debian" | "fedora" | "rhel" | "centos" | "amzn" | "ol" | "arch" |
     "mageia" | "suse" | "alpine" | "gentoo" | "homebrew" | "macports" |
     "freebsd" | "openbsd" | "netbsd" | "dragonfly") as x -> Some x
  | "archlinux" -> Some "arch"
  | "opensuse" -> Some "suse"
  | "oraclelinux" -> Some "ol"
  | _ -> None

let os_package_repository =
  match resolve_alias os_distribution with
  | Some x -> x
  | None ->
      match resolve_alias os_family with
      | Some x -> x
      | None ->
          Printf.ksprintf failwith
            "External dependency handling not supported for OS family '%s'."
            os_family

smondet commented 3 years ago

In case it helps, more data-points / distros where depext does not find the packages to install (because they are referenced under os-family=debian or os-distribution=arch):

Manjaro:

 $ opam depext --flags
# Depexts vars detected on this system: arch=x86_64, os=linux, os-distribution=manjaro, os-family=arch

Pop'OS (System76's distro):

 $ opam depext --flags
# Depexts vars detected on this system: arch=x86_64, os=linux, os-distribution=pop, os-family=ubuntu
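
For what it's worth, these two data points can be run through the os-package-repository resolution sketched earlier in the thread. The version below is a self-contained adaptation: passing the two variables as labelled arguments and returning a `result` instead of raising are assumptions made here so the example can run on its own; the alias table is copied from that sketch unchanged.

```ocaml
(* Canonical names and aliases, as in the sketch earlier in the thread. *)
let resolve_alias = function
  | ("ubuntu" | "debian" | "fedora" | "rhel" | "centos" | "amzn" | "ol" | "arch" |
     "mageia" | "suse" | "alpine" | "gentoo" | "homebrew" | "macports" |
     "freebsd" | "openbsd" | "netbsd" | "dragonfly") as x -> Some x
  | "archlinux" -> Some "arch"
  | "opensuse" -> Some "suse"
  | "oraclelinux" -> Some "ol"
  | _ -> None

(* Hypothetical wrapper: try the distribution first, then fall back to the
   family, and report an error if neither is known. *)
let os_package_repository ~os_distribution ~os_family =
  match resolve_alias os_distribution with
  | Some x -> Ok x
  | None -> (
      match resolve_alias os_family with
      | Some x -> Ok x
      | None ->
          Error
            (Printf.sprintf
               "External dependency handling not supported for OS family '%s'."
               os_family))

let () =
  List.iter
    (fun (os_distribution, os_family) ->
      match os_package_repository ~os_distribution ~os_family with
      | Ok repo -> Printf.printf "%s/%s -> %s\n" os_distribution os_family repo
      | Error msg -> prerr_endline msg)
    [ ("manjaro", "arch"); ("pop", "ubuntu"); ("ubuntu", "debian") ]
```

With that table, manjaro falls back to arch and pop falls back to ubuntu, while plain ubuntu stays ubuntu rather than being folded into debian.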