easybuilders / easybuild-framework

EasyBuild is a software installation framework in Python that allows you to install software in a structured and robust way.
https://easybuild.io
GNU General Public License v2.0

provide a more portable way of specifying os dependencies #174

Closed boegel closed 6 years ago

boegel commented 12 years ago

We need a more portable way to specify OS dependencies, since package names can differ between OSs (e.g., boost-devel on RedHat vs boost-dev on Debian).

Support for running commands to check for specific functionality is already there, but is only used as a fallback (i.e. when a package that matches isn't found).

We should allow specifying that we want a particular header file, binary, version of a particular command, etc. Checking for header files may result in reproducing a tool like configure, which actually tries to compile a small C program to check whether a header file is found.

Noted by @stdweird: we should rely on the package manager to figure out whether something we need is available (rpm/yum, dpkg, ...).
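A minimal sketch of the "ask the package manager first, fall back to a command check" idea, in Python. The function names and the injectable `run`/`which` parameters are hypothetical and only for illustration; this is not EasyBuild's actual implementation. It assumes that `rpm -q NAME` and `dpkg -s NAME` exit with status 0 when the package is installed.

```python
import shutil
import subprocess


def package_installed(name, run=subprocess.run, which=shutil.which):
    """Return True/False as reported by the OS package manager.

    Returns None when neither rpm nor dpkg is available.
    """
    # Both queries exit with status 0 when the package is installed.
    queries = [
        ("rpm", ["rpm", "-q", name]),
        ("dpkg", ["dpkg", "-s", name]),
    ]
    found_manager = False
    for tool, cmd in queries:
        if which(tool) is None:
            continue
        found_manager = True
        result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        if result.returncode == 0:
            return True
    return False if found_manager else None


def os_dependency_satisfied(name, run=subprocess.run, which=shutil.which):
    """Check the package manager first, then fall back to a command lookup."""
    if package_installed(name, run=run, which=which):
        return True
    # Fallback (hypothetical policy): accept a command of that name on $PATH.
    return which(name) is not None
```

On a system without rpm or dpkg, only the fallback applies, which is exactly where the package-name approach stops being portable.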

Another option (suggested by @fgeorgatos) may be meta-modules. These could correspond to package names, or policies (e.g. http://eniac.cyi.ac.cy/display/UserDoc/HPC+Baseline+Configuration or http://www.ccac.hpc.mil/consolidated/bc/policy.php). Example: tcsh can be provided via LS2_05-06 or FY05-06; the names are fictional and can be anything; they could also correspond to rpm/deb packages.

fgeorgatos commented 12 years ago

Replying to @stdweird: "we should rely on the package manager".

That approach may be correct, assuming that there is a package manager. I see no reason to deny EasyBuild functionality to, say, Slackware or FreeBSD or... Mac OS X?!

From my side I'd like to see some logic of the style: If any of "tcsh", "LS2_05-06", "FY05-06", "HPCBIOS_05-06" or "OS_DEPENDENCIES_IGNORE" modules are defined => assume OK. So this would be a way to satisfy the stated dependency checks.
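A rough sketch of that rule in Python. It assumes the modules tool exports the colon-separated `LOADEDMODULES` environment variable; the helper names are hypothetical, and the provider names are the fictional ones from the comment above.

```python
import os

# Fictional provider names, copied from the comment above.
TCSH_PROVIDERS = [
    "tcsh", "LS2_05-06", "FY05-06", "HPCBIOS_05-06", "OS_DEPENDENCIES_IGNORE",
]


def loaded_module_names(environ=os.environ):
    """Return the loaded modules, both as 'name/version' and as bare 'name'."""
    entries = [e for e in environ.get("LOADEDMODULES", "").split(":") if e]
    names = set(entries)
    names.update(entry.split("/", 1)[0] for entry in entries)
    return names


def dependency_satisfied(providers, loaded):
    """Return True if any acceptable provider module is loaded."""
    return any(name in loaded for name in providers)
```

For example, `dependency_satisfied(TCSH_PROVIDERS, loaded_module_names())` would be true as soon as any one of the listed modules is loaded in the current shell.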

Interestingly, such an approach has extra advantages and caveats:

This last idea makes me think that the namespace produced by the following oneliner is a good 1st consideration:

rpm -qa|xargs -n1 rpm -q --queryformat '%{NAME}/%{VERSION}/%{ARCH}/%{RELEASE}\n' # 4 RHEL family
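As an illustration of how that namespace could be consumed, here is a small Python sketch that parses the `NAME/VERSION/ARCH/RELEASE` lines produced by the one-liner. The `PackageEntry` type, the function name, and the sample line in the usage note are made up for the example.

```python
from collections import namedtuple

PackageEntry = namedtuple("PackageEntry", "name version arch release")


def parse_rpm_listing(text):
    """Parse 'NAME/VERSION/ARCH/RELEASE' lines into PackageEntry tuples."""
    entries = []
    for line in text.splitlines():
        parts = line.strip().split("/")
        # Skip blank or malformed lines instead of guessing their meaning.
        if len(parts) != 4 or not all(parts):
            continue
        entries.append(PackageEntry(*parts))
    return entries
```

Feeding it a hypothetical line such as `boost-devel/1.41.0/x86_64/11.el6` yields one entry whose `name` is `boost-devel`, which could then be turned into an auto-generated modulefile name.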

These could all be auto-produced modulefiles in a special directory to be used by EB. We can agree on special conventions to satisfy most unix-like environments and more. (though, admittedly, the meaning of eg. boost-devel may not be exactly the same across variations..).

enjoy, Fotis

fgeorgatos commented 12 years ago

Also, the following tool in this area seems to exist to solve this problem: http://en.wikipedia.org/wiki/Pkg-config (outsourcing the issue to another busy community is an attractive idea ;-)

the meaning of eg. boost-devel might not be exactly the same across variations

ie. let's learn from people trying to give consistent names to an inconsistent moving target: http://stackoverflow.com/questions/3971703/how-to-use-c-boost-library-with-pkg-config

fgeorgatos commented 11 years ago

So, the scope of pkg-config is kind of limited to libraries, used for linking etc.
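For the library case that pkg-config does cover, a check could be sketched in Python as below. The function name is hypothetical; the sketch relies only on the `--exists` and `--atleast-version` options of the `pkg-config` command and returns None when the tool itself is missing.

```python
import shutil
import subprocess


def pkgconfig_has(package, min_version=None, pkg_config="pkg-config"):
    """Return True/False per pkg-config, or None if pkg-config is unavailable."""
    if shutil.which(pkg_config) is None:
        return None
    if min_version is None:
        cmd = [pkg_config, "--exists", package]
    else:
        cmd = [pkg_config, "--atleast-version=%s" % min_version, package]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # pkg-config exits with status 0 when the requirement is met.
    return result.returncode == 0
```

Usage would look like `pkgconfig_has("glib-2.0", min_version="2.25.0")`; the answer still depends on the name of the `.pc` file, which is not guaranteed to be the same on every distro.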

I have been looking a bit more into this and it seems there is no obvious way out, yet.

Occasionally, I've come across solutions which can somewhat integrate Linux distros, but not exactly MacOSX: http://labix.org/smart http://en.wikipedia.org/wiki/PackageKit

Generally, MacOSX gets tricky because the userbase was split in the past between fink & macports, while now it seems to be split between macports & homebrew. There are also other complications, e.g. gcc used to arrive via Xcode until v4.2, but now Xcode delivers llvm-gcc, and you can guess the familiar software "horror" stories. In short, trouble: there is no generic package-oriented solution for mac users, AFAIK. Some have gone the way of producing custom packages: http://www.openpkg.org (!)

btw. to grasp the complexity of this problem, you may want to see this:

over the years I have accumulated a bunch of tools which seem to reoccur as dependencies when we build (scientific) software: http://hpcbios.readthedocs.org/en/latest/HPCBIOS_2012-90.html It's a draft: there are items that have to be added in that list and others that must be removed (eg. I'd rather have openmpi come via easybuild/modules instead of OS).

So far, the meta-modules that correspond to (hpcbios?) policies make sense to me.

to be continued, (perhaps by the end of the month in a face to face discussion?)

Fotis

ps. Note that most packages mentioned in #296 are generally already in that list.

pforai commented 11 years ago

We discussed this today in the channel with boegel. From my POV there are three approaches to pursue

  1. Ignore OS dependencies completely and require that everything needed at compile and/or run time for any package comes from within the EB framework, i.e. specify and provide all deps as configs. This has a negative side effect in certain cases where very low-level packages are required: LD_LIBRARY_PATH then replaces these low-level libraries with EB-compiled ones (like libreadline/libncurses(w) et al.) for everything that is executed within this shell instance, which can cause subtle bugs.
  2. Provide a very simple mapping scheme from OS/platform to package name on this OS/platform, and specify a dictionary-like object that simply contains this mapping. Something like {'ubuntu': 'libfoo-dev', 'el6': 'libfoo-devel'} in every easyconfig file.
  3. Steal code from a python based build system like the excellent WAF. Having WAF in there gives EB access to autotools like (but better) functionality for configuration and just check for presence of foo.h, libbar.so in version >= x.y.z etc. Or use a proper pkg-config interface from WAF - the possibilities here are endless.

In WAF you can easily do things like this from a nice pythonic interface

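def configure(conf):
    # load waf's C++ compiler detection
    conf.load('compiler_cxx')
    # optional check: the configuration continues even if the header is missing
    conf.check(header_name='stdio.h', mandatory=False)
    # ask pkg-config for glib-2.0 >= 2.25.0 and store the flags under 'GLIB'
    conf.check_cfg(package='glib-2.0', uselib_store='GLIB', atleast_version='2.25.0',
                   args='--cflags --libs')

boegel commented 11 years ago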

+1 on relying on the WAF functionality, that looks like a very promising and elegant way to resolve this issue

fgeorgatos commented 11 years ago

Not sure yet which direction is correct; we will only know after we try.

fyi. I recently saw this, just highlights the difficulty: http://lists.opensuse.org/opensuse-buildservice/2013-03/msg00117.html


boegel commented 11 years ago

@fgeorgatos: That OpenSUSE post seems to confirm that relying on package names or even filenames of the pkgconfig-supplied .pc files is doomed to fail. So, actually checking for what you need, i.e. a particular library, header file, or whatever, is the correct solution, since that should be the same across distros. WAF seems to provide that functionality, via Python code, which is exactly what we need...

fgeorgatos commented 11 years ago

@boegel: I am under the impression that explicitly checking for files will choke on situations like: http://wiki.debian.org/Multiarch/TheCaseForMultiarch#A32.2F64_Architectures-> Current practices

ie. what fits under /lib or /usr/lib may or may not be 64bit, depending on your distro; this creates confusion, eg. https://bbs.archlinux.org/viewtopic.php?id=103509 Then, you could do a "file" to check on that...
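What such a "file"-style check boils down to for the 32/64-bit question can be sketched in Python by reading the ELF identification bytes directly. The function names are hypothetical; the only facts relied upon are that ELF files start with the magic bytes `\x7fELF` and that the following byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```python
def elf_class_from_header(header):
    """Return 32 or 64 given the first bytes of an ELF file, else None."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # EI_CLASS: 1 means 32-bit objects, 2 means 64-bit objects.
    return {1: 32, 2: 64}.get(header[4])


def elf_class(path):
    """Return 32 or 64 for the ELF file at `path`, or None if it is not ELF."""
    with open(path, "rb") as handle:
        return elf_class_from_header(handle.read(5))
```

This says nothing about whether the compiler or linker will actually pick up that file, so it only addresses the word-size confusion described above.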

baah... if I have to choose between trusting a funky hack or the sysadmin, I would prefer going for the latter. After all, she pays a price when things break, so I'd rather believe in the world of incentives, to get this problem treated...


boegel commented 11 years ago

@fgeorgatos: I'm not saying you should check for files in (a set of) hardcoded places, that's madness.

I'm saying you should check if the thing you need is actually there. If you need a particular library, try linking against it. If you need a particular header file, try including it and getting the code to compile. Basically, the autoconf (and WAF!) way.

Not only do you check whether the library you need is available on your system, you can also make sure that the compiler can find it, that it can be found at runtime if needed (e.g., if it's in $LD_LIBRARY_PATH), that the linker knows how to link to it (important when using a non-default linker), etc.

I'm not saying we won't run into problems, but I'd expect them to be far less common and easier to resolve than the problems caused by depending on package names, for which there clearly is no consensus among distros.
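A minimal Python sketch of this "try to compile and link it" approach, in the spirit of autoconf and WAF but not using either. The helper names are hypothetical, and it assumes a C compiler is reachable under the name `cc` (or whatever is passed as `compiler`).

```python
import os
import subprocess
import tempfile


def try_build(source, compiler="cc", libs=()):
    """Return True if `source` compiles and links against the given libraries."""
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "check.c")
        exe = os.path.join(tmpdir, "check")
        with open(src, "w") as handle:
            handle.write(source)
        cmd = [compiler, src, "-o", exe] + ["-l" + lib for lib in libs]
        try:
            result = subprocess.run(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
        except OSError:
            # The compiler itself is missing or not executable.
            return False
        return result.returncode == 0


def have_header(header, compiler="cc"):
    """Check for a header by compiling a program that includes it."""
    source = "#include <%s>\nint main(void) { return 0; }\n" % header
    return try_build(source, compiler=compiler)


def have_library(lib, compiler="cc"):
    """Check for a library by linking a trivial program against it."""
    return try_build("int main(void) { return 0; }\n", compiler=compiler, libs=[lib])
```

Because the check goes through the same compiler and linker that the build would use, it fails early for the same reasons the real build would, independent of what the distro calls the package.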

fgeorgatos commented 11 years ago

Hi Ken,

On Fri, Apr 5, 2013 at 8:02 AM, Kenneth Hoste notifications@github.com wrote:

Basically, the autoconf (and WAF!) way.

OK, that makes it more clear!

You are basically suggesting a "fail-early" strategy, before the build starts; yeah, that makes sense.

Not only do you check whether the library you need is available on your system, you can also make sure if the compiler can find it, if it can be found at runtime if needed (e.g., if it's in $LD_LIBRARY_PATH), whether the linking knows how to link to it (important when using a non-default linker), etc. I'm not saying we won't run into problems, but I'd expect them to be far less common and hard to resolve than depending on package names, for which there clearly is no consensus among distros.

+1

I do not fully grasp whether another (extra) kind of check would potentially make sense, but it certainly looks like the correct strategy to give an early warning if something is about to fail.

in short, thanks for the clarification!

boegel commented 11 years ago

A nice example of OS dependencies that should be specified is https://github.com/hpcugent/easybuild-easyconfigs/pull/209. Also, having gcc and/or g++ as an OS dependency for GCC...

boegel commented 11 years ago

Great suggestion by @spikebike in https://github.com/hpcugent/easybuild-easyconfigs/issues/391:

Well, for a full environment the ./configure script of each app has significant magic for detecting the presence and version of includes and libraries for a build. Seems silly to replicate that inside easybuild. Granted it's manual, but the number of packages involved is minimal. In the short term I suggest using what was posted on a related ticket:

+# needed for --with-openib
+if OS_NAME in ['redhat', 'fedora', 'RHEL', 'SL', 'centos']:
+    osdependencies = ['libibverbs-devel']
+elif OS_NAME in ['debian', 'ubuntu']:
+    osdependencies = ['libibverbs-dev']

In the longer term I suggest a puppet-like approach where you can just name an easybuild-named package and there's a mapping of easybuild package names -> distro-specific package names.

Currently said table would look something like this for Ubuntu:

zlibdevel -> zlib1g-dev
libibverbs-devel -> libibverbs-dev
qt4-devel -> libqt4-dev
libX11-devel -> libX11-dev
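Such a table could be sketched in Python as a nested dictionary with a lookup helper. The entries are the ones listed above; the names `OS_PACKAGE_MAP` and `distro_package`, and the choice to fall back to the generic name when no mapping exists, are assumptions made for the example.

```python
# Generic (EasyBuild-side) package name -> distro-specific name, per distro.
OS_PACKAGE_MAP = {
    "ubuntu": {
        "zlibdevel": "zlib1g-dev",
        "libibverbs-devel": "libibverbs-dev",
        "qt4-devel": "libqt4-dev",
        "libX11-devel": "libX11-dev",
    },
}


def distro_package(generic_name, distro, table=OS_PACKAGE_MAP):
    """Translate a generic package name for `distro`.

    Falls back to the generic name when the distro or package is unmapped.
    """
    return table.get(distro, {}).get(generic_name, generic_name)
```

With this layout an easyconfig would only name `libibverbs-devel`, and the per-distro translation would live in one place instead of in an `if OS_NAME in [...]` block per easyconfig.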

The nice thing is this is only for OS provided packages, and that really shouldn't be too much more than stuff necessary to build GCC (like a distro provided GCC), X11 (which seems pointless to build in EB), and whatever is specific to the kernel, like the infiniband/ofed packages.

Although, now that I think about it, maybe for 2.0 EB should actually write both the puppet module and the environment module. Then we could push whatever puppet is best at into puppet. Although that would be more ideal for network services like ganeti or hadoop than for HPC-oriented applications and libraries.

akesandgren commented 6 years ago

@boegel Isn't this handled already?

boegel commented 6 years ago

Yes, see https://github.com/easybuilders/easybuild-framework/pull/846