ToolChains has hardcoded linux distribution codename #10321

Open Quuxplusone opened 12 years ago

Quuxplusone commented 12 years ago
Bugzilla Link PR11440
Status NEW
Importance P enhancement
Reported by Leo Iannacone (l3on@ubuntu.com)
Reported on 2011-11-26 18:51:58 -0800
Last modified on 2019-05-20 10:22:32 -0700
Version unspecified
Hardware PC Linux
CC andersk@mit.edu, chandlerc@gmail.com, geek4civic@gmail.com, jryans@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk, sylvestre@debian.org
Dear developers,

I would like to discuss possible ways to remove the hardcoded distribution
codenames from ToolChains.cpp.

For example, Debian and (in particular) Ubuntu have to patch the code by hand
to add new distro names for every release.

Ubuntu needs an updated patch every six months.

My question is: is there another way ToolChains.cpp could be written or
designed in order to simplify this process?

In this discussion Anders Kaseorg explains the problem very well:

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-June/015545.html

Best regards,

Leo.
Quuxplusone commented 12 years ago
The ToolChains.cpp system has been completely rewritten since the thread you
cite was written. The specific complaints mentioned there have been fixed (if
not in the exact way proposed).

What is actually causing problems? Do you actually need to patch Clang today?

For reference, I have tested the current 3.0 release candidate on both Debian
and Ubuntu without *any* patches applied and I believe it found the correct GCC
installation and headers.
Quuxplusone commented 12 years ago
Looking at current trunk, it looks like something was done about the hardcoded
list of gcc versions:
http://llvm.org/viewvc/llvm-project?view=rev&revision=143874

However, nothing has been done about the hardcoded list of distros and distro
release files.  There is still

enum LinuxDistro {
  ArchLinux,
  DebianLenny,
  DebianSqueeze,
  DebianWheezy,
  Exherbo,
  RHEL4,
  RHEL5,
  RHEL6,
  Fedora13,
  Fedora14,
  Fedora15,
  FedoraRawhide,
  OpenSuse11_3,
  OpenSuse11_4,
  OpenSuse12_1,
  UbuntuHardy,
  UbuntuIntrepid,
  UbuntuJaunty,
  UbuntuKarmic,
  UbuntuLucid,
  UbuntuMaverick,
  UbuntuNatty,
  UbuntuOneiric,
  UnknownDistro
};

along with code to parse /etc/lsb-release, /etc/redhat-release,
/etc/debian_version, /etc/SuSE-release, /etc/exherbo-release, and
/etc/arch-release to detect which of these distros is running.  This code
automatically breaks every six months when Ubuntu and Fedora both release
another version.
(For example, Ubuntu has a new Precise Pangolin development release, and Fedora
released Fedora 16 as stable.  Even Debian squeeze has been released, so its
/etc/debian_version now looks like ‘6.0.3’ instead of ‘squeeze/sid’ and
DetectLinuxDistro() misdetects it.)

Maybe a stopgap solution would be to treat unknown Ubuntu releases as
UbuntuOneiric, unknown Fedora releases as FedoraRawhide, etc. instead of
categorizing all distros from the future as UnknownDistro.  But I think the
entire idea of looking at /etc/*-release files to guess the right linker flags
is a violation of abstraction.
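
To make the stopgap concrete, here is a rough sketch of what "treat unknown
releases as the newest known one" could look like.  It is not the actual
ToolChains.cpp code: the enum is a trimmed-down copy of the one quoted above,
and classifyUbuntu(), classifyDebian(), and startsWith() are hypothetical
helper names invented for this illustration.  It assumes the caller has already
decided which distro family it is looking at and only needs to pick a release.

```cpp
#include <string>

// Trimmed-down copy of the enum quoted above (illustration only).
enum LinuxDistro {
  DebianLenny,
  DebianSqueeze,
  DebianWheezy,
  UbuntuNatty,
  UbuntuOneiric,
  UnknownDistro
};

// Hypothetical helper: true if Text begins with Prefix.
static bool startsWith(const std::string &Text, const std::string &Prefix) {
  return Text.compare(0, Prefix.size(), Prefix) == 0;
}

// Hypothetical: map an Ubuntu codename (e.g. the DISTRIB_CODENAME value from
// /etc/lsb-release) to an enum value.  Codenames we have never heard of are
// assumed to be newer than anything in the list.
LinuxDistro classifyUbuntu(const std::string &Codename) {
  if (Codename == "natty")
    return UbuntuNatty;
  if (Codename == "oneiric")
    return UbuntuOneiric;
  return UbuntuOneiric; // unknown release: fall back to the newest known one
}

// Hypothetical: classify the contents of /etc/debian_version, which may be a
// codename such as "squeeze/sid" or a number such as "6.0.3".
LinuxDistro classifyDebian(const std::string &Version) {
  if (startsWith(Version, "lenny") || startsWith(Version, "5."))
    return DebianLenny;
  if (startsWith(Version, "squeeze") || startsWith(Version, "6."))
    return DebianSqueeze;
  return DebianWheezy; // anything else: fall back to the newest known one
}
```

With this shape, a new release only degrades to "behaves like the previous
release" instead of UnknownDistro, although it still relies on the
/etc/*-release files.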
Quuxplusone commented 12 years ago
(In reply to comment #2)
> Looking at current trunk, it looks like something was done about the hardcoded
> list of gcc versions:
> http://llvm.org/viewvc/llvm-project?view=rev&revision=143874
>
> However, nothing has been done about the hardcoded list of distros and distro
> release files.  There is still
>
> enum LinuxDistro {
>   ArchLinux,
>   DebianLenny,
>   DebianSqueeze,
>   DebianWheezy,
>   Exherbo,
>   RHEL4,
>   RHEL5,
>   RHEL6,
>   Fedora13,
>   Fedora14,
>   Fedora15,
>   FedoraRawhide,
>   OpenSuse11_3,
>   OpenSuse11_4,
>   OpenSuse12_1,
>   UbuntuHardy,
>   UbuntuIntrepid,
>   UbuntuJaunty,
>   UbuntuKarmic,
>   UbuntuLucid,
>   UbuntuMaverick,
>   UbuntuNatty,
>   UbuntuOneiric,
>   UnknownDistro
> };
>
> along with code to parse /etc/lsb-release, /etc/redhat-release,
> /etc/debian_version, /etc/SuSE-release, /etc/exherbo-release, and
> /etc/arch-release to detect which of these distros is running.

Much of this code is dead. We don't use the Linux distro detection for much now
other than a few fairly random bits, and I had thought all of those only cared
about distinguishing between suse, ubuntu, debian, and everything-else.

I'm going to continue cleaning this up as time permits, but what would help me
is if you listed specific actions with Clang which break because of this.
Unless we have something specific that breaks because of this, I don't think
keeping a PR open about the design cruft is all that useful...

> Maybe a stopgap solution would be to treat unknown Ubuntu releases as
> UbuntuOneiric, unknown Fedora releases as FedoraRawhide, etc. instead of
> categorizing all distros from the future as UnknownDistro.  But I think the
> entire idea of looking at /etc/*-release files to guess the right linker flags
> is a violation of abstraction.

No one is really arguing that it is a good abstraction, but we need to know the
concrete things it breaks. It does actually fix things. My personal goal is
that for large, common Linux distributions no patch or configuration step is
necessary to build a Clang which can compile and link binaries on that system.
Doing this is not easy given the peculiarities of several distributions'
packaging practices (I'm looking at you Debian/Ubuntu and your multiarch
madness). That said, we should key behavior on the actual detected toolchain of
the system, not on the /etc/...-release files.
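
As a rough illustration of keying on the detected toolchain rather than on
release files: given the directory names found under a candidate GCC library
directory, pick the newest version-like one.  This is only a sketch under my
own assumptions, not how Clang's driver actually does it; GccVersion,
parseGccVersion(), and pickNewestGcc() are hypothetical names, and listing the
directory is left to the caller.

```cpp
#include <cstdio>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical version triple parsed from a directory name such as "4.6.1".
struct GccVersion {
  int Major = 0, Minor = 0, Patch = 0;
};

// Parse "MAJOR.MINOR" or "MAJOR.MINOR.PATCH"; returns false for other names.
bool parseGccVersion(const std::string &Name, GccVersion &Out) {
  GccVersion V;
  int Fields =
      std::sscanf(Name.c_str(), "%d.%d.%d", &V.Major, &V.Minor, &V.Patch);
  if (Fields < 2 || V.Major < 0 || V.Minor < 0 || V.Patch < 0)
    return false;
  Out = V;
  return true;
}

// Return the directory name with the highest version, or "" if none parse.
std::string pickNewestGcc(const std::vector<std::string> &DirNames) {
  std::string Best;
  GccVersion BestVersion;
  for (const std::string &Name : DirNames) {
    GccVersion V;
    if (!parseGccVersion(Name, V))
      continue; // not a version-like directory name
    if (Best.empty() ||
        std::tie(V.Major, V.Minor, V.Patch) >
            std::tie(BestVersion.Major, BestVersion.Minor,
                     BestVersion.Patch)) {
      Best = Name;
      BestVersion = V;
    }
  }
  return Best;
}
```

Comparing the components numerically matters here: a plain string comparison
would rank "4.9.2" above "4.10.0".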
Quuxplusone commented 12 years ago
As the Debian maintainer of clang, I still had to patch it, even in version 3.0,
to make sure clang finds all headers...
However, my patches are Debian/Ubuntu-specific and hardcoded, and cannot easily
be ported into the main distribution.