cgat-developers / cgat-apps


ImportError: /lib/python3.6/site-packages/cgat/BamTools/bamtools.cpython-36m-x86_64-linux-gnu.so: undefined symbol: bam_read1 #43

Closed sebastian-luna-valero closed 4 years ago

sebastian-luna-valero commented 5 years ago

Hi,

I am trying to make a new release of cgat-apps in bioconda that includes sortedcontainers, following up on https://github.com/cgat-developers/cgat-apps/issues/42

The updated recipe is in a PR here https://github.com/bioconda/bioconda-recipes/pull/15752

I have now added a new test in the recipe to check for cgat bam2bed -h after building the cgat-apps package, and I am getting the error message: ImportError: /lib/python3.6/site-packages/cgat/BamTools/bamtools.cpython-36m-x86_64-linux-gnu.so: undefined symbol: bam_read1

I have tried a few options to make it work but I can't find a solution, and I was wondering whether I could ask @AndreasHeger to have a look.

The issue seems to arise when cythonizing bamtools. The strange thing is that cython does not give any errors at compile time, but the import fails at runtime. Moreover, it works on our Jenkins instance but not in the bioconda container, and I think the only difference between the two environments is the C/C++ compilers (i.e. pysam, cython, etc. are at the same versions).

For our reference, I had a look at:

Any ideas?

Best regards, Sebastian

Acribbs commented 5 years ago

Hi @sebastian-luna-valero, could the last version of cgat-apps be marked as broken? I'm getting issues when installing 0.5.4 with conda: it gives undefined symbol: bam_read1.

sebastian-luna-valero commented 5 years ago

I am now asking for help in https://github.com/bioconda/bioconda-recipes/pull/15752

AndreasHeger commented 5 years ago

Thanks both. Not sure what is happening. I am surprised that cgat gtf2table works as it should also import pysam?

sebastian-luna-valero commented 5 years ago

Thanks, Andreas.

Could you confirm where the "bam_read1" function in bamtools.pyx is coming from?

AndreasHeger commented 5 years ago

This is from htslib. It looks as if the extension arguments in setup.py for GeneModelAnalysis (works) are different from those for BamTools.bamtools (fails). In the circle CI log this translates to two different commands:

11:15:39 BIOCONDA INFO (OUT) x86_64-conda_cos6-linux-gnu-gcc -pthread
-shared -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro
-Wl,-z,now -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -Wl,-O2
-Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now
-Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -Wl,-O2 -Wl,--sort-common
-Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags
-Wl,--gc-sections -Wl,-rpath,$PREFIX/lib -Wl,-rpath-link,$PREFIX/lib
-L$PREFIX/lib -I$PREFIX/include -L$PREFIX/lib -march=nocona
-mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong
-fno-plt -O2 -ffunction-sections -pipe -I$PREFIX/include
-fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/cgat-apps-0.5.4
-fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix
-I$PREFIX/include -L$PREFIX/lib -DNDEBUG -D_FORTIFY_SOURCE=2 -O2
-I$PREFIX -I$PREFIX/include -L$PREFIX/lib
build/temp.linux-x86_64-3.6/cgat/GeneModelAnalysis.o -o
build/lib.linux-x86_64-3.6/cgat/GeneModelAnalysis.cpython-36m-x86_64-linux-gnu.so

vs

11:16:04 BIOCONDA INFO (OUT) x86_64-conda_cos6-linux-gnu-gcc -pthread
-shared -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro
-Wl,-z,now -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -Wl,-O2
-Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now
-Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -Wl,-O2 -Wl,--sort-common
-Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags
-Wl,--gc-sections -Wl,-rpath,$PREFIX/lib -Wl,-rpath-link,$PREFIX/lib
-L$PREFIX/lib -I$PREFIX/include -L$PREFIX/lib -march=nocona
-mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong
-fno-plt -O2 -ffunction-sections -pipe -I$PREFIX/include
-fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/cgat-apps-0.5.4
-fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix
-I$PREFIX/include -L$PREFIX/lib -DNDEBUG -D_FORTIFY_SOURCE=2 -O2
-I$PREFIX -I$PREFIX/include -L$PREFIX/lib
build/temp.linux-x86_64-3.6/cgat/BamTools/bamtools.o
-L$PREFIX/lib/python3.6/site-packages/pysam -L$PREFIX/lib
-lctabixproxies.cpython-36m-x86_64-linux-gnu
-lcfaidx.cpython-36m-x86_64-linux-gnu
-lcsamfile.cpython-36m-x86_64-linux-gnu
-lcvcf.cpython-36m-x86_64-linux-gnu
-lcbcf.cpython-36m-x86_64-linux-gnu
-lctabix.cpython-36m-x86_64-linux-gnu
-lchtslib.cpython-36m-x86_64-linux-gnu -o
build/lib.linux-x86_64-3.6/cgat/BamTools/bamtools.cpython-36m-x86_64-linux-gnu.so
-Wl,-rpath,$PREFIX/lib/

Note the difference in the linker arguments. However, the bamtools.pyx command is the one I thought should be correct.
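
One way to make the difference between the two commands easier to see is to extract only the `-l` and `-L` arguments and compare them. The sketch below does that in Python. The two command strings are heavily abbreviated stand-ins for the log lines above, and the helper names `linker_flags` and `extra_flags` are made up for illustration.

```python
import shlex


def linker_flags(command):
    """Return the set of -l (library) and -L (search path) arguments in a command line."""
    tokens = shlex.split(command)
    return {t for t in tokens if t.startswith(("-l", "-L"))}


def extra_flags(working, failing):
    """Flags that appear in the failing command but not in the working one."""
    return sorted(linker_flags(failing) - linker_flags(working))


# Abbreviated versions of the two commands from the log (assumption: most
# flags are omitted here because they are identical in both commands).
working = "gcc -shared -L$PREFIX/lib build/GeneModelAnalysis.o -o GeneModelAnalysis.so"
failing = (
    "gcc -shared -L$PREFIX/lib build/bamtools.o "
    "-L$PREFIX/lib/python3.6/site-packages/pysam "
    "-lcsamfile.cpython-36m-x86_64-linux-gnu "
    "-lchtslib.cpython-36m-x86_64-linux-gnu -o bamtools.so"
)

for flag in extra_flags(working, failing):
    print(flag)
```

Run on the full log lines, this would list only the pysam library path and the pysam shared libraries, which matches what the comparison above shows by eye.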

AndreasHeger commented 5 years ago

Strange

AndreasHeger commented 5 years ago

I am not set up for testing this; I need to install all the tools again.

sebastian-luna-valero commented 5 years ago

Not sure whether it helps, but this is what I find in setup.py:

    Extension(
        "cgat.GeneModelAnalysis",
        ["cgat/GeneModelAnalysis.pyx"],
        include_dirs=pysam.get_include() + [numpy.get_include()],
        library_dirs=[],
        libraries=[],
        define_macros=pysam.get_defines(),
        language="c",
    ),

vs.

    Extension(
        "cgat.BamTools.bamtools",
        ["cgat/BamTools/bamtools.pyx"],
        include_dirs=pysam.get_include() + [numpy.get_include()],
        library_dirs=pysam_libdirs,
        libraries=pysam_libs,
        define_macros=pysam.get_defines(),
        language="c",
        extra_link_args=extra_link_args_pysam,
    ),

Is that correct? If not, please tell me and I am happy to patch setup.py to check whether it works.

AndreasHeger commented 5 years ago

Actually, just recalled, GeneModelAnalysis does not use htslib functions directly but only the standard pysam python API functions. So it is correct. Trying to set up circle-CI, etc to do local testing.

sebastian-luna-valero commented 5 years ago

Anything else I could try to help? For example, can I add another test to the recipe to get more info from the bioconda container?

sebastian-luna-valero commented 5 years ago

Also, can you tell from the logs which htslib is being picked up, the stand-alone version or the pysam one?

AndreasHeger commented 5 years ago

Thanks. Unfortunately we can't tell. The only way I know of to debug these things is to use ldd and nm in the environment that fails. You could add ldd xyz/xyz/bamtools.so to the tests, but getting the path right might need some experimenting.
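
A rough sketch of what such a check could look like, written in Python so it can be called from a test script. It assumes `ldd` and `nm` (binutils) are on the PATH in the failing environment. The function names are hypothetical, and the path to the built `.so` has to be supplied by the caller.

```python
import subprocess


def undefined_symbols(nm_output):
    """Parse `nm -D` output and return the names of undefined ('U') symbols."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no address column, e.g. "U bam_read1"
        if len(fields) == 2 and fields[0] == "U":
            names.append(fields[1])
    return names


def inspect_extension(path):
    """Run ldd and nm -D on a shared object; return (ldd text, undefined symbols)."""
    ldd = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    nm = subprocess.run(["nm", "-D", path], capture_output=True, text=True, check=True)
    return ldd.stdout, undefined_symbols(nm.stdout)
```

An undefined symbol is normal as long as one of the libraries listed by `ldd` provides it. If `bam_read1` shows up as undefined and none of the listed libraries exports it, the import would be expected to fail with the error reported in this issue.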

AndreasHeger commented 5 years ago

It is probably about the compiler being used as you expect.

sebastian-luna-valero commented 5 years ago

Thanks.

Currently in the bioconda recipe we have the test section like this:

test:
    - cgat --help
    - cgat --help Conversion
    - cgat gtf2table -h
    - cgat bam2bed -h

Could you remind me of a different cgat tool that we can test to double check the issue, please?
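
Independently of which tool is chosen, a test could also import the compiled modules directly, which exercises the dynamic linking without going through the command-line wrapper. A minimal sketch, assuming Python is available in the test environment; `failed_imports` is a made-up helper name, and the module names in the usage note are the ones mentioned in this thread.

```python
import importlib


def failed_imports(module_names):
    """Try to import each module; return {name: error message} for those that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            failures[name] = str(exc)
    return failures
```

For example, `failed_imports(["cgat.GeneModelAnalysis", "cgat.BamTools.bamtools"])` would return an empty dict if both extensions load, and otherwise the ImportError text for each module that does not.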

sebastian-luna-valero commented 5 years ago

Also, I remember I used:

export CFLAGS="-I$PREFIX/include -DHAVE_LIBDEFLATE"
export CPPFLAGS="-I$PREFIX/include -DHAVE_LIBDEFLATE"

export HTSLIB_LIBRARY_DIR=$PREFIX/lib
export HTSLIB_INCLUDE_DIR=$PREFIX/include

in the build.sh of the bioconda recipe. Do you think we need to keep this? or is it no longer needed?

FYI: http://pysam.readthedocs.org/en/latest/installation.html#external

AndreasHeger commented 5 years ago

Hi Sebastian, I am not sure, but I hoped that this could be taken care of by setup.py querying pysam.get_include(), etc.

AndreasHeger commented 5 years ago

I had no luck with the circle ci build, but I can reproduce the problem locally with bioconda-utils.

AndreasHeger commented 5 years ago

bam2stats is another tool that would cause issues.

AndreasHeger commented 5 years ago

I have added [hts] to the libraries to be linked against.
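
The actual change is in #44 and is not reproduced in this thread. As a guess at its general shape only, adding `hts` to the list passed as `libraries=` in setup.py could look like the sketch below. The helper `with_htslib` and the placeholder value of `pysam_libs` are assumptions for illustration, not the code from the repository.

```python
def with_htslib(libraries, name="hts"):
    """Return a copy of `libraries` with `name` appended if it is not already there."""
    libraries = list(libraries)
    if name not in libraries:
        libraries.append(name)
    return libraries


# Placeholder standing in for the pysam_libs variable used in setup.py
pysam_libs = [
    "csamfile.cpython-36m-x86_64-linux-gnu",
    "chtslib.cpython-36m-x86_64-linux-gnu",
]

print(with_htslib(pysam_libs))
```

Passing the result as `libraries=` would add `-lhts` to the link command. Whether the linker then finds libhts depends on the library search path in the build environment.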

AndreasHeger commented 5 years ago

See #44

Charlie-George commented 4 years ago

This still seems to be a problem -

Running conda install -c conda-forge -c bioconda cgat-apps installs cgat-apps 0.5.5:

    (test_cgat_apps) [charlotteg@cgath1 cloneseq]$ python
    Python 3.7.1 (default, Oct 23 2018, 19:19:42)
    [GCC 7.3.0] :: Anaconda, Inc. on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from cgat.BamTools.bamtools import bam2stats_count
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ModuleNotFoundError: No module named 'cgat.BamTools.bamtools'

or

sebastian-luna-valero commented 4 years ago

Hi Charlie,

Strange, I have just installed cgat-apps 0.5.5 and that import works for me.

Could you please copy and paste the output of:

which cgat
which conda
conda list

Best regards, Sebastian

paulbrodersen commented 4 years ago

Hi Sebastian,

I am Paul, a new member of David Sims' group, and it's actually me who is having the issue. Charlie was kindly helping me debug and in the process raised the issue.

To answer your questions:

which cgat
# /ifs/devel/paulb/miniconda3/envs/cloneseq/bin/cgat
which conda
# /ifs/devel/paulb/miniconda3/condabin/conda

The output of conda list is here.

To set up this conda environment, I used the environment yaml file that is being used for teaching the cgat course as a base and added a few dependencies that seemed to be missing, specifically bwa, bamtools, and cgat-apps. The latest version of cgat-apps was installed.

Other things that I have tried since:

  1. Downgrading to cgat-apps 0.5.4: same issue.
  2. Downgrading to cgat-apps 0.5.3: different issue, but maybe less interesting as I doubt that you will want to roll back that far.
  3. Replacing cgat-apps 0.5.5 downloaded from bioconda with the version from github with
    conda remove cgat
    cd /my/github/clone/of/cgat-apps
    python setup.py develop

    This fails with compilation errors.

Thanks for the help, I appreciate it. I just got here and trying to understand the pipeline without having used it is a bit daunting, especially since Charlie and David are teaching a course at the moment and don't have much time to help.

Best, Paul

Acribbs commented 4 years ago

I think the conda environment yaml file may be old. It looks like from the output of conda list you have cgat-apps 0.5.4, which was the version where the issue you report happens.

I will be over at the wimm in a few hours if you would like some help. I will try and install a new conda environment for you.

Adam

paulbrodersen commented 4 years ago

I certainly would appreciate any help I can get. I am the tall, bearded guy sitting approximately opposite Charlie.

Regarding the output from conda list, I posted the results for the original environment that for some reason still downloaded the old version from bioconda. I do have another environment (I have made 4 or 5 by now to find a way around the issue) where cgat-apps is of the latest version. IIRC, that did not solve the issue.

conda list | grep cgat
# cgat                      0.5.5                     dev_0    <develop>
# cgat-apps                 0.5.4            py36h84d0fb7_0    bioconda
# cgatcore                  0.5.15                     py_0    bioconda

Acribbs commented 4 years ago

I will be back over at 2, is that ok?

paulbrodersen commented 4 years ago

Yeah, that works well for me, thanks!

Charlie-George commented 4 years ago

@sebastian-luna-valero

Actually, I think my issue is to do with the channel priorities.

The import works fine if I do:

conda create -n test_env1
conda activate test_env1
conda install cgat-apps

However, if I follow the bioconda instructions, which specify the channel order, the import statement breaks:

conda create -n test_env2
conda activate test_env2
conda install -c conda-forge -c bioconda cgat-apps

By default my channel order is:

channel URLs :
    https://conda.anaconda.org/bioconda/linux-64
    https://conda.anaconda.org/bioconda/noarch
    https://conda.anaconda.org/conda-forge/linux-64
    https://conda.anaconda.org/conda-forge/noarch
    https://repo.anaconda.com/pkgs/main/linux-64
    https://repo.anaconda.com/pkgs/main/noarch
    https://repo.anaconda.com/pkgs/r/linux-64
    https://repo.anaconda.com/pkgs/r/noarch

sebastian-luna-valero commented 4 years ago

Hi @Charlie-George

After the following pull requests:

we changed the conda channel orders from:

to:

However, I did try installing cgat-apps with both combinations and it worked for me. Strange!

Have you recently pulled the master branch into your devel folder?

Acribbs commented 4 years ago

Hi Sebastian,

I have cleanly installed cgat-apps into a new environment using the install.sh script and I get the following errors:

File "/ifs/devel/adamc/cgat-developers/cgat-apps/cgat/tools/bam2stats.py", line 315, in <module>
    from cgat.BamTools.bamtools import bam2stats_count
ModuleNotFoundError: No module named 'cgat.BamTools.bamtools'

Also, tests on the master branch are failing, and I'm not sure whether this is related. Probably not, since output is created.
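
A `ModuleNotFoundError` for a compiled module can simply mean that no extension file matching the running interpreter exists in the package directory. That can happen, for example, when the extension was never built or was built for a different Python version. A small sketch that checks for this, using only the standard library; `importable_extensions` is a hypothetical helper name.

```python
import importlib.machinery
from pathlib import Path


def importable_extensions(package_dir, stem):
    """List files in package_dir that this interpreter could import as extension `stem`."""
    # A file is importable as an extension only if its name is exactly
    # stem + one of the suffixes the running interpreter accepts.
    wanted = {stem + suffix for suffix in importlib.machinery.EXTENSION_SUFFIXES}
    return sorted(p.name for p in Path(package_dir).iterdir() if p.name in wanted)
```

Assuming `cgat.BamTools` itself imports, calling `importable_extensions(Path(cgat.BamTools.__file__).parent, "bamtools")` would return an empty list when the interpreter has nothing it can load, even if a `bamtools.pyx` or a `.so` built for another Python version is present.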

sebastian-luna-valero commented 4 years ago

Hi Adam,

What command produces that error?

I have just tried a new cgat-apps installation with the install.sh script and it works for me if I do:

cgat bam2stats -h
python -c "from cgat.BamTools.bamtools import bam2stats_count"

Could you please try a new install and let me know how it goes?

Best regards, Sebastian

Acribbs commented 4 years ago

Hi Sebastian,

This was running bam2stats tests (I think the paired one)

Since we have all green again (Thanks for your help) I will re-install and try again.

BW, Adam