samtools / htslib

C library for high-throughput sequencing data formats
Other
799 stars 445 forks source link

Install the shared library .so file as executable #1525

Closed StephDC closed 1 year ago

StephDC commented 1 year ago

Executable programs linked to non-executable .so could result in a core dump for attempting to execute codes that are not marked as executable in some systems such as HP_UX.

It is also standard practice of installing such shared objects with +x bit set (755) in RHEL-based distributions - not doing so would get caught by the packaging QA system (non-executable file found in /usr/lib) and rejected. This is currently fixed manually on fedora's building script (https://src.fedoraproject.org/rpms/htslib/blob/rawhide/f/htslib.spec#_79), though I think providing the .so file with +x set in the first place would be better.

One may argue that it offers little benefit to Debian-based distributions as they do allow non-executable .so files, and distributions such as Gentoo would automatically +x all .so installed into /usr/lib64 in the installation step, but making it as permissive as when the .so is produced (by marking codes that were supposed to be executed, although not directly, as executable instead of removing the +x bit in make install) would rarely cause any harm on them.

jkbonfield commented 1 year ago

I don't like the blanket approach of marking it as executable on all environments. A quick check on our ubuntu servers:

$ find /usr/lib -name '*.so' -type f -ls|awk '{a[$3]++} END {for (i in a) {print i,a[i]}}'
-rwxr-xr-x 3
-rw-r--r-- 2567

Executable libraries would be deemed as wrong here as non-executable elsewhere, and likely lead to similar tweaks in the distribution package to chmod them back again.

I wonder if there is a way we can determine the correct policy at configure time. Even if it's just a uname / lsb_release type command to identify the OS.

jmarshall commented 1 year ago

I was going to say you could invent an INSTALL_LIB to make it easier for packagers to override on the make install line, but I see there already is one…

daviesrob commented 1 year ago

This stack exchange question discusses the library permissions difference between Debian and Red Hat based systems. I think I'm more on the Debian side here, given that trying to run libhts.so results in an instant core dump.

Unfortunately I don't think there's an easy way of probing to see if a platform desires executable shared libraries. We could add a configure option to make it easier to choose. Or possibly we could add an entry point so that it does something more useful than dump core, a bit like when you try to run libc.so.6.

jkbonfield commented 1 year ago

Making code changes to permit it to become an executable is tail wagging the dog.

I'd just be in favour of adding a Makefile LIB_PERM definition so it's trivial for distributions to do minimal work to install.

As John points out, right now you could just use make install INSTALL_LIB="install -m 755" so perhaps this comes down to a comment in the INSTALL file detailing instructions for HP-UX. Maybe make install LIB_PERM=755 is a little easier (with it defaulting to 644), but it's a small difference. The main thing is the documentation in INSTALL I think.

Edit, also I wasn't saying we need configure to do things like my find command and try to statistically work out what the standard policy is for this system. Rather something naive like "uname -sr" and look for hpux is probably sufficient. There's no requirement for execute bit on redhat, so whether or not we do it is largely irrelevant there.

StephDC commented 1 year ago

My reasoning behind marking it as +x by default is that it avoids usability issue for those running an OS that requires ALL executable code, including those inside a .so file and never supposed to be executed directly, being marked as +x. Although most Linux do not require it, RHEL family recommends it, and HP-UX (and potentially some other Unix) do have such requirement.

The Debian's reasoning of marking them as non-executable to avoid the user running them in the first place to avoid a core dump when the user tries to execute a file that is not supposed to be executed directly makes sense in the packager's POV as it reduces number of bug reports related to it, but it is eventually the user's fault of executing a shared library instead of the executable. And it is the user that needs more education on "do not execute random files".

However it becomes a usability issue when it comes to those on an OS that separates executable code and non-executable code when mapped to the memory based on the +x permission on the filesystem. When a user make install this package, then figured out it core dumps even following the correct way to use it (e.g. through samtools and other commands), then that likely shifts the whole problem to the side of the author of the code.

For a software packager's point of view, they kind of have an idea of what permission is expected for what file in what directories, so they would apply such tweaks with ease. However for an average user with little knowledge on the permission issue, their expectation would likely be:

  1. Follow the INSTALL file's instruction to autoreconf -i, ./configure, make, make install
    • probably with the little tweak of changing PREFIX to somewhere in their home directory (for those working on a shared system with no root privilege)
  2. And the program runs.

The lack of +x by default for .so file fails the users on OSes that require +x for all executable code for such expectation. On the other hand, if the INSTALL_LIB defaults the permission to 755, on whatever system the user tries to run it, even those who wish to see a .so file without +x, no error would occur and the user runs the program just fine.

And yes, making the .so file come with a __init entry point like libc for some version information is the tail wagging the dog. Don't do that unless there is some compelling issue such as for some reason the majority of user start trying to execute whatever file they found with +x permission set.

In this point, I feel that one of the two following way of solving this problem would work better:

  1. Make .so installed as 755 as default and let the Debian packagers, and those who don't want +x on .so files manually -x, while guarantee those on whatever system can make install and have a working software without knowing anything else
  2. Improve the documentation to tell those users that in case of a core dump, check if their system requires +x on .so file, and if yes, manually +x those .so files.

Any further ideas?

StephDC commented 1 year ago

Unfortunately I don't think there's an easy way of probing to see if a platform desires executable shared libraries.

I think there is an easy way to test if the platform REQUIRES executable shared libraries though. In ./configure, compile a shared library .so with an application, then -x the .so file, then try to run the application. If it segfault core dumps, we are on a system that require +x .so file, therefore we shall set the INSTALL_LIB to INSTALL instead of INSTALL_DATA.

However I am not so familiar with autotools and haven't come up with the code implementing such testing. Does anyone think that such test would be helpful?

jkbonfield commented 1 year ago

We don't wish to change the default behaviour for operating systems that have conventions on other permissions. While we could attempt some automatic detection, we deemed it as unnecessary effort for a problem with an easy work around already.

Instead we simplify the process somewhat with a new LIB_PERM variable that can be overridden at install time.