genetics-statistics / GEMMA

Genome-wide Efficient Mixed Model Association
https://github.com/genetics-statistics/GEMMA
GNU General Public License v3.0

0.97 Installation problems (unix modules) #142

Closed by mpw6 6 years ago

mpw6 commented 6 years ago

I have attempted to install both the haswell and generic precompiled binaries as well as building from source, all without luck.

The instructions here for haswell: https://groups.google.com/forum/#!topic/gemma-discussion/2NPWKhh3ixE do not work as written.

The generic version does install, but has library errors:

14:34:47 ~/packages/gemma $ ./install.sh /isg/shared/apps/gemma/0.97

gnu-install-bin 0.2.1 Copyright (C) 2017 Pjotr Prins <pjotr.prins@thebird.nl> and the GNU Guix project.
    See also https://gitlab.com/pjotrp/gnu-install-bin

  Installation to /isg/shared/apps/gemma/0.97 completed! Add the binaries to
  the path with:

    export PATH=/isg/shared/apps/gemma/0.97/bin:$PATH

  (more instructions are available at https://gitlab.com/pjotrp/gnu-install-bin/blob/master/README.md)

  The following binaries are available:

/isg/shared/apps/gemma/0.97/bin/gemma
Done
14:34:53 ~/packages/gemma $ /isg/shared/apps/gemma/0.97/bin/gemma
/isg/shared/apps/gemma/0.97/bin/gemma: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /isg/shared/apps/gemma/0.97/bin/gemma)
/isg/shared/apps/gemma/0.97/bin/gemma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /isg/shared/apps/gemma/0.97/bin/gemma)

pjotrp commented 6 years ago

Can you paste the results of

    cat /proc/version

mpw6 commented 6 years ago

Linux version 3.10.0-693.5.2.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Fri Oct 20 20:32:50 UTC 2017

pcarbo commented 6 years ago

@mpw6 Also please send the output from ldd ./gemma. The glibc library that is listed should be within the same directory that GEMMA 0.97 is installed. If not, it is possible that it is using another version of the library that is included in your LD_LIBRARY_PATH. Do you use Anaconda or Miniconda?
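
For anyone following along, one rough way to confirm the mismatch in the error messages is to look for the required version tags in the system libstdc++ itself. The sketch below is only an illustration: `has_version_tag` is a hypothetical helper name, not part of GEMMA, and it merely greps the file for the tag string, which is a crude stand-in for reading the library's symbol version table.

```shell
# Hypothetical helper (not part of GEMMA): report whether a shared library
# file contains a given version tag such as GLIBCXX_3.4.21.
# grep -a treats the binary as text, -q suppresses output, -F matches literally.
has_version_tag() {
    lib=$1
    tag=$2
    if [ ! -r "$lib" ]; then
        echo "cannot read $lib" >&2
        return 2
    fi
    if grep -aqF -- "$tag" "$lib"; then
        echo "$tag: present in $lib"
    else
        echo "$tag: missing from $lib"
        return 1
    fi
}

# Example calls using the library and tags named in the error messages.
has_version_tag /lib64/libstdc++.so.6 CXXABI_1.3.8 || true
has_version_tag /lib64/libstdc++.so.6 GLIBCXX_3.4.21 || true
```

If both tags are reported missing from the system library while gemma still asks for them, the loader is picking up the system copy rather than the one shipped with the installer.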

pjotrp commented 6 years ago

Sorry for the delay, I was teaching and it scrambled my time and brain. @pcarbo is right: shared library resolution can suffer from environment 'bleeding'. The best way to test this is to run in a completely clean shell, e.g.

       env -i /bin/bash --login --noprofile --norc

and run gemma. @mpw6 does this work?
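
To show what the clean shell changes, here is a small sketch that passes a made-up LD_LIBRARY_PATH to two child shells: one started normally and one started through `env -i`. The path /opt/example/lib is a placeholder, not a real GEMMA location.

```shell
# Command run in each child: print the value it sees, or "<unset>" if absent.
probe='echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"'

# Normal child: inherits the variable (placeholder value standing in for
# whatever a module might have exported).
inherited=$(LD_LIBRARY_PATH=/opt/example/lib /bin/sh -c "$probe")
echo "$inherited"

# Child started via env -i: begins with an empty environment, so the
# variable is gone even though it was set for the env command itself.
clean=$(LD_LIBRARY_PATH=/opt/example/lib env -i /bin/sh -c "$probe")
echo "$clean"
```

The first line printed shows the placeholder path and the second shows `<unset>`, which is why gemma finds its own libraries inside the clean shell.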

I'll see if I can provide a static compiled version too. Should not be too hard now.

mpw6 commented 6 years ago

I could not even get 0.97 to install, so I can't send the ldd output. I am attempting to create an environment module for a shared HPC cluster. I found a stable version of 0.96 and have made that available until we can get 0.97 working. Static libraries are always best when dealing with environment modules.

pcarbo commented 6 years ago

@mpw6 What is the error you get when you try to install GEMMA 0.97?

pjotrp commented 6 years ago

Yes, modules bleed into the ld paths. Do you mind installing using the clean shell above and then running gemma inside that clean shell? The module can be made clean too, but it is better to see it work first.

pjotrp commented 6 years ago

@mpw6 have you tried installing inside

       env -i /bin/bash --login --noprofile --norc

mpw6 commented 6 years ago

It will install and run only inside the clean shell:

15:26:59 ~ $ env -i /bin/bash --login --noprofile --norc
bash-4.2$ /isg/shared/apps/gemma/0.97/bin/gemma
GEMMA 0.97 (2017/12/27) by Xiang Zhou and team (C) 2012-2017

 type ./gemma -h [num] for detailed help
 options:
  1: quick guide
  2: file I/O related
  3: SNP QC
  4: calculate relatedness matrix
  5: perform eigen decomposition
  6: perform variance component estimation
  7: fit a linear model
  8: fit a linear mixed model
  9: fit a multivariate linear mixed model
 10: fit a Bayesian sparse linear mixed model
 11: obtain predicted values
 12: calculate snp variance covariance
 13: note
 14: debug options

The GEMMA software is distributed under the GNU General Public v3
   -license    show license information
   see also http://www.xzlab.org/software.html, https://github.com/genetics-statistics
bash-4.2$ logout
15:29:52 ~ $ /isg/shared/apps/gemma/0.97/bin/gemma
/isg/shared/apps/gemma/0.97/bin/gemma: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /isg/shared/apps/gemma/0.97/bin/gemma)
/isg/shared/apps/gemma/0.97/bin/gemma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /isg/shared/apps/gemma/0.97/bin/gemma)
15:30:03 ~ $

Unfortunately this will be confusing to users. Is there a way around this problem?

pjotrp commented 6 years ago

Cool and thanks for trying. The good news is that it is an environment issue.

The reason it fails with modules is that they rewrite the library search path(s). We ought to find a solution because future versions of gemma will include more tools and a statically linked binary will not do. We'll probably provide a static gemma (again), but it will be limited in its capabilities.

To solve this issue we need to be able to run a module with a clean(er) shell. I suspect LD_LIBRARY_PATH is set. Can you unset it? Maybe also have a look at other environment variables pointing into /lib64.

    set | grep lib64

just unset them one by one and see if the problem persists.
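
The one-by-one check can be scripted. The sketch below makes several assumptions: GNU `env` (for the `-u` option), that only exported variables matter to the loader, and that a failure shows up as "not found" text on stderr, as in the errors earlier in this thread. `vars_mentioning` and `try_without` are hypothetical helper names.

```shell
# List names of exported variables whose name or value mentions a pattern.
# Note: multi-line values can confuse this simple line-based filter.
vars_mentioning() {
    env | grep -F -- "$1" | cut -d= -f1
}

# Run a command with one variable removed from its environment (GNU env -u)
# and report whether the loader's "not found" complaint disappears.
# stderr goes to the pipe; the command's stdout is discarded.
try_without() {
    var=$1
    shift
    if env -u "$var" "$@" 2>&1 >/dev/null | grep -q 'not found'; then
        echo "$var: still failing"
    else
        echo "$var: no 'not found' error with this variable unset"
    fi
}

# Example loop; the gemma path is the one used earlier in this thread.
for v in $(vars_mentioning lib64); do
    try_without "$v" /isg/shared/apps/gemma/0.97/bin/gemma
done
```

Any variable reported as clearing the error is a candidate for the module setup to leave alone or reorder.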

pcarbo commented 6 years ago

This sort of problem happens all the time when using dynamic libraries. This is the motivation for using conda environments, for example.

@mpw6 I would expect that one or more of the libraries in the gemma folder can also be found in your LD_LIBRARY_PATH, creating the conflict.

pjotrp commented 6 years ago

Conda does not escape the shared library problem. See https://conda.io/docs/user-guide/tasks/build-packages/use-shared-libraries.html. If you want real software isolation you should opt for Nix or Guix, like the Canadian HPCs are doing. See https://fosdem.org/2018/schedule/event/computecanada/.

These systems hard code the shared library paths, so they don't have to be looked up. The HPCs I am involved in are opting for GNU Guix. Modules are no longer required. I am not saying you should try it, but it is a peek into the future for complex deployments. Deployment complexity is increasing over time.

mpw6 commented 6 years ago

Ok, so I see that you install libraries in various subdirectories. It really seems to me that I just need to prepend the LD_LIBRARY_PATH with the paths to the libraries that your installer includes.

pcarbo commented 6 years ago

@mpw6 Yes, that is one approach. That isn't the most elegant solution, but it is reasonable to me.

mpw6 commented 6 years ago

That is my understanding of how to best use environment modules. We prepend the LD_LIBRARY_PATH when needed, and then when the module is removed, we return to the default values.
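
A minimal shell sketch of that load/unload behaviour is below. A real modulefile would normally use the module system's own `prepend-path` command instead; the function names and the /opt/example directories here are placeholders, since the installer's actual library layout is not shown in this thread.

```shell
# Prepend a directory to LD_LIBRARY_PATH without leaving an empty entry
# (an empty entry makes the loader search the current directory).
ld_path_prepend() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# Remove every occurrence of a directory, roughly what unloading does.
# The variable is unset entirely if nothing is left.
ld_path_remove() {
    new=
    old_ifs=$IFS
    IFS=:
    for entry in ${LD_LIBRARY_PATH-}; do
        if [ "$entry" != "$1" ]; then
            new="${new:+$new:}$entry"
        fi
    done
    IFS=$old_ifs
    if [ -n "$new" ]; then
        LD_LIBRARY_PATH=$new
        export LD_LIBRARY_PATH
    else
        unset LD_LIBRARY_PATH
    fi
}

# "Load": placeholder directories standing in for the installer's lib dirs.
ld_path_prepend /opt/example/gemma/libA
ld_path_prepend /opt/example/gemma/libB
echo "$LD_LIBRARY_PATH"

# "Unload": take the same directories out again.
ld_path_remove /opt/example/gemma/libB
ld_path_remove /opt/example/gemma/libA
```

Prepending puts the bundled directories ahead of /lib64 in the search order, which is what lets gemma find the newer libstdc++ first.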

mpw6 commented 6 years ago

I got it working, but I don't understand why you separate all the library files into distinct directories instead of making a single lib dir to hold them all. It's much more difficult to manage the way you've implemented it. Please consider standardizing to aid installation.

pjotrp commented 6 years ago

Fair point. I could add a ./lib directory and symlink into the others. I added an issue https://gitlab.com/pjotrp/gnu-install-bin/issues/2
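
As a rough idea of what such a ./lib directory could look like, the sketch below symlinks every shared object found under an install prefix into a single lib directory. The layout under the prefix and the `collect_libs` name are assumptions for illustration; this is not how gnu-install-bin is implemented.

```shell
# Hypothetical helper: symlink all *.so* files under PREFIX into PREFIX/lib.
# PREFIX should be an absolute path without a trailing slash, so that the
# link targets stay valid and the prune test below matches.
collect_libs() {
    prefix=$1
    mkdir -p "$prefix/lib"
    # -path ... -prune skips the target directory itself; both regular
    # files and existing symlinks (e.g. libfoo.so -> libfoo.so.1) are linked.
    find "$prefix" -path "$prefix/lib" -prune -o \
        \( -type f -o -type l \) -name '*.so*' -print |
    while IFS= read -r so; do
        # ln -sf replaces an existing link, so if two directories hold a
        # library with the same name, whichever find lists last wins.
        ln -sf "$so" "$prefix/lib/$(basename "$so")"
    done
}

# Example (path taken from this thread; run only on a real install):
# collect_libs /isg/shared/apps/gemma/0.97
```

With a single directory like this, a module would only need one library path entry instead of one per bundled package.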

Even so, note that the use of LD_LIBRARY_PATH is not without dangers. If you search for it you'll find many reasons why it should not be used to 'fix' paths. Software should resolve its own paths or use the system loader. GEMMA install does exactly that, but when you set LD_LIBRARY_PATH using modules it will find other libs first. That is what breaks the install. Using LD_LIBRARY_PATH to fix it again is pretty much a hack. LD_LIBRARY_PATH is not composable.