Closed: mayuefine closed this issue 2 months ago
Ok, this seems like you are missing more than just libbz2 in your conda environment. You would need to install g++-10, gcc-10, build-essential and libbz2-dev if you want to compile and build taxor on your own.
I also uploaded an updated binary of Taxor using the static bzip2 library. Please try it out and tell me if it worked or throws any other error messages.
Hi Jens, I downloaded your new binary of Taxor. When I run it directly, I still get the same errors, as you can see here. I have installed libstdc++ in my conda environment, and all the required versions are present.
But I cannot install libm.so.6; it looks like that would require updating the whole server system, which I cannot do. Would it be possible to bundle these libraries with taxor when compiling it? Or could you provide a Docker container for this software?
By the way, I also tried to compile it myself, but that gives me another error: it says the SDSL library cannot be found, although the library is there.
Right now I am using Singularity to build a newer OS image to run this software, which is OK for the current testing phase.
singularity build --sandbox ubuntu docker://ubuntu:22.04
cp libgomp.so.1 ubuntu/lib/x86_64-linux-gnu/
singularity build ubuntu.sif ubuntu/
singularity run --bind ./:/mnt ubuntu.sif bash
It worked inside the Singularity container, see below:
If you could offer a Docker container or something similar to help me run this software in the normal way, that would be great.
Thanks a lot!
It seems like you are using older GCC and G++ versions on the system. However, I created a docker image, which you can find here. I hope this solves your issues.
Unfortunately, it's not clear to me from the error message which shared library in the Docker image is missing. Did Singularity tell you more about that?
In the meantime, could you try conda install libgcc
before executing the taxor binary on your system?
BTW: I could successfully run the Docker container image even on my Windows laptop, so it seems more like an issue with Singularity.
Hi,
I recently linked some libraries statically into the Taxor binary executable; it now includes zlib, glibc and libstdc++. You may want to give it a try and report whether it works. Unfortunately, the executables run without problems on all my Linux distributions, so I could not reproduce your issues.
Unfortunately, the latest binary still failed to run on our server.
It said
$ taxor
taxor: error while loading shared libraries: libbz2.so.1.0: cannot open shared object file: No such file or directory
So I created a conda environment and installed bzip2
$ mamba create -c conda-forge -n taxor python==3.9 bzip2
$ mamba activate taxor
However, it still reported the same error, so I configured LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/homes/shenwei/ws/app/miniconda3/envs/taxor/lib
Hmm, a new error about the glibc version:
$ taxor
taxor: /lib64/libm.so.6: version `GLIBC_2.29' not found (required by taxor)
$ ldd --version ldd
ldd (GNU libc) 2.28
Copyright (C) 2018 Free Software Foundation, Inc.
Then I installed glibc (v2.33), but it still failed. I can't do anything about the Segmentation fault.
$ mamba install vikky34v::glibc
$ taxor
Segmentation fault (core dumped)
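A minimal sketch of how the glibc requirement reported above could be checked before running a precompiled binary. It assumes GNU grep and GNU sort (for `sort -V`), and that the `GLIBC_x.y` version strings appear as plain text in the file, which is the case for dynamically linked ELF binaries. The function names are hypothetical and not part of taxor.

```shell
# Print the highest GLIBC_x.y symbol version referenced by a binary.
# Prints nothing if no such string is found (e.g. a fully static binary).
max_glibc_required() {
    LC_ALL=C grep -a -o -E 'GLIBC_[0-9]+(\.[0-9]+)+' "$1" \
        | sed 's/^GLIBC_//' | sort -u -V | tail -n 1
}

# Print the glibc version of the running system, e.g. 2.28.
# Only works on glibc-based systems.
system_glibc() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Succeed if the available glibc ($2, default: the system's) is at least
# as new as the highest version the binary ($1) asks for.
glibc_ok() {
    need=$(max_glibc_required "$1")
    have=${2:-$(system_glibc)}
    [ -z "$need" ] && return 0
    newest=$(printf '%s\n%s\n' "$need" "$have" | sort -V | tail -n 1)
    [ "$newest" = "$have" ]
}
```

For example, `glibc_ok ./taxor || echo "system glibc is too old for this binary"` would flag the mismatch without having to run the binary first.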
Yeah, this is weird behavior that occurs on some platforms when using the precompiled binary. I still need to figure out why the math library is causing trouble. In the meantime, as you are using conda anyway, I recommend using the taxor bioconda package. This should fix your issue.
Good to know that! The conda way should be added to the Installation section.
BTW. In the Installation section of README, "executable binaries" links to KMCP.
Oh damn... Thanks for pointing that out. I fixed it right away and added conda as an installation option.
Unfortunately, it failed to run with some sample data.
$ seqkit stats --quiet *.fq.gz
file                      format  type  num_seqs     sum_len  min_len  avg_len  max_len
hifi-zymo-d6331.fq.gz     FASTQ   DNA     10,001  91,388,406    1,765  9,137.9   31,968
ont-q20-zymo-d6300.fq.gz  FASTQ   DNA     10,001  32,834,938      144  3,283.2   33,483
ont-r10-zymo-d6300.fq.gz  FASTQ   DNA     10,001  40,800,157      124  4,079.6   98,819
$ taxor search --index-file hifi-zymo-d6331.fq.gz --query-file hifi-zymo-d6331.fq.gz
use kmer-model
Segmentation fault (core dumped)
The index file should be an indexed database (created with the build command) and not a FASTQ file ;-)
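A small guard like the following could catch this mix-up before calling the search command. It is only a sketch: it assumes GNU gzip (where `-cdf` passes non-gzip input through unchanged) and checks nothing about the index format itself, only that the file does not look like FASTQ. The function name is hypothetical.

```shell
# Succeed if the file (plain or gzip-compressed) looks like FASTQ:
# line 1 starts with '@' and line 3 starts with '+'.
looks_like_fastq() {
    gzip -cdf < "$1" 2>/dev/null | head -n 4 | LC_ALL=C awk '
        NR == 1 && !/^@/   { bad = 1 }
        NR == 3 && !/^[+]/ { bad = 1 }
        END { if (NR == 4 && !bad) exit 0; exit 1 }
    '
}
```

Usage could look like `looks_like_fastq "$index" && echo "error: $index is a FASTQ file, not a built index" >&2`.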
Oh sorry, that was the result of the wrong input. However, it also exits unexpectedly with gtdb-R214-k22-s12.hixf. The file might be corrupted; I'm downloading it again. Maybe you could provide md5sums for the index files.
$ taxor search --index-file gtdb-R214-k22-s12.hixf --query-file hifi-zymo-d6331.fq.gz
use syncmer model
Segmentation fault (core dumped)
The other indexes worked!
$ time taxor search --index-file refseq-abfv-k22-s12.hixf \
--query-file hifi-zymo-d6331.fq.gz --output-file t.txt --threads 32
BTW, if --output-file is not given (the default value is .), nothing is written, neither to stdout nor to any file.
The error remains with the re-downloaded genbank-viral-k22-s12.hixf2.
$ time taxor search --index-file genbank-viral-k22-s12.hixf2 \
--query-file hifi-zymo-d6331.fq.gz --output-file t2.txt
Segmentation fault (core dumped)
gtdb-R214-k22-s12.hixf / genbank-viral-k22-s12.hixf2
file size: 76626998529
md5sum: f2875ac5f1e017d48220bc42954c6541
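A check like the one above can be scripted so that a corrupted download is caught before searching. This is a minimal sketch assuming GNU coreutils `md5sum`; the function name is hypothetical, and the expected value has to come from the checksums published for the index files.

```shell
# Compare a file's MD5 checksum with an expected value.
# Returns 0 on match, 1 on mismatch.
verify_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}
```

For example, `verify_md5 gtdb-R214-k22-s12.hixf "$published_md5" && taxor search ...` would only start the search when the checksum matches, where `published_md5` is a variable you set yourself.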
The md5sum is correct. Can you please check which taxor version you are using?
Installed with conda.
VERSION
Last update:
taxor version: 0.1.2
SeqAn version: 3.4.0-rc.1
I don't know what happened with the files but I rebuilt the indexes, tested, and uploaded them again. At least on our cluster, they work as expected. Please contact me again if they still cause a segmentation fault.
Same error.
$ time taxor search --index-file genbank-viral-k22-s12.hixf3 \
--query-file hifi-zymo-d6331.fq.gz --output-file t3.txt
Segmentation fault (core dumped)
$ ls -l gtdb-R214-k22-s12.hixf*
-rw-r--r-- 1 shenwei iqbal 76626998529 Mar 14 12:54 gtdb-R214-k22-s12.hixf
-rw-r--r-- 1 shenwei iqbal 76626998529 Mar 14 15:59 gtdb-R214-k22-s12.hixf2
-rw-r--r-- 1 shenwei iqbal 76626998531 Mar 19 08:25 gtdb-R214-k22-s12.hixf3
Here's the environment:
$ conda list
# packages in environment at /homes/shenwei/ws/app/miniconda3/envs/taxor2:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 hd590300_5 conda-forge
c-ares 1.27.0 hd590300_0 conda-forge
ca-certificates 2024.2.2 hbcca054_0 conda-forge
coreutils 9.4 hd590300_0 conda-forge
curl 8.5.0 hca28451_0 conda-forge
diffutils 3.10 hf18258e_0 conda-forge
grep 3.11 h3cbd922_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.21.2 h659d440_0 conda-forge
libcurl 8.5.0 hca28451_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 hd590300_2 conda-forge
libgcc-ng 13.2.0 h807b86a_5 conda-forge
libgomp 13.2.0 h807b86a_5 conda-forge
libiconv 1.17 hd590300_2 conda-forge
libnghttp2 1.58.0 h47da74e_1 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 13.2.0 h7e041cc_5 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
ncurses 6.4 h59595ed_2 conda-forge
openssl 3.2.1 hd590300_0 conda-forge
pcre2 10.40 hc3806b6_0 conda-forge
taxor 0.1.2 hc155240_0 bioconda
zstd 1.5.5 hfc55251_0 conda-forge
I created a conda environment with miniconda3 for taxor on a different laptop and had no issues when using the genbank-viral index. Looking at your package versions, the only two things I noticed were that your environment is missing the libgcc package and that coreutils has a different version. These two could be the problem. I also added MD5 hashes for the database indexes to the Taxor README file on GitHub, so you may want to check that your download matches them. I downloaded the genbank-viral database index and it worked for me with the following packages in my environment.
_libgcc_mutex-0.1 | conda_forge 3 KB conda-forge
_openmp_mutex-4.5 | 2_gnu 23 KB conda-forge
bzip2-1.0.8 | hd590300_5 248 KB conda-forge
c-ares-1.27.0 | hd590300_0 160 KB conda-forge
ca-certificates-2024.2.2 | hbcca054_0 152 KB conda-forge
coreutils-8.25 | 1 9.1 MB bioconda
curl-8.6.0 | hca28451_0 91 KB conda-forge
diffutils-3.10 | hf18258e_0 391 KB conda-forge
grep-3.4 | hf43ccf4_4 255 KB bioconda
keyutils-1.6.1 | h166bdaf_0 115 KB conda-forge
krb5-1.21.2 | h659d440_0 1.3 MB conda-forge
libcurl-8.6.0 | hca28451_0 382 KB conda-forge
libedit-3.1.20191231 | he28a2e2_2 121 KB conda-forge
libev-4.33 | hd590300_2 110 KB conda-forge
libgcc-7.2.0 | h69d50b8_2 304 KB conda-forge
libgcc-ng-13.2.0 | h807b86a_5 752 KB conda-forge
libgomp-13.2.0 | h807b86a_5 410 KB conda-forge
libiconv-1.17 | hd590300_2 689 KB conda-forge
libnghttp2-1.58.0 | h47da74e_1 617 KB conda-forge
libssh2-1.11.0 | h0841786_0 265 KB conda-forge
libstdcxx-ng-13.2.0 | h7e041cc_5 3.7 MB conda-forge
libzlib-1.2.13 | hd590300_5 60 KB conda-forge
ncurses-6.4.20240210 | h59595ed_0 875 KB conda-forge
openssl-3.2.1 | hd590300_1 2.7 MB conda-forge
pcre-8.45 | h9c3ff4c_0 253 KB conda-forge
taxor-0.1.2 | hc155240_0 3.3 MB bioconda
zstd-1.5.5 | hfc55251_0 532 KB conda-forge
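Comparing two environments by eye is error-prone, so a comparison like the one above could be automated. The sketch below assumes each environment was dumped with `conda list --export`, which writes one `name=version=build` line per package plus `#` comment lines. The function name is hypothetical.

```shell
# Compare two files produced by `conda list --export`.
# Prints tab-separated lines: package, version in file 1, version in file 2,
# for packages that are missing on one side or differ in version.
compare_conda_exports() {
    awk -F= '
        /^#/ || NF < 2 { next }            # skip comments and blank lines
        FNR == NR { a[$1] = $2; next }     # first file: name -> version
        { b[$1] = $2 }                     # second file: name -> version
        END {
            for (p in a) {
                if (!(p in b))          print p "\t" a[p] "\t(missing)"
                else if (a[p] != b[p])  print p "\t" a[p] "\t" b[p]
            }
            for (p in b) if (!(p in a)) print p "\t(missing)\t" b[p]
        }
    ' "$1" "$2" | sort
}
```

Running `conda list --export > env.txt` in each environment and then `compare_conda_exports working.txt failing.txt` would list differences such as a missing libgcc or a different coreutils version.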
Hi @JensUweUlrich,
I'm having the same issue - Segmentation fault (core dumped) when I run
taxor search --index-file databases/gtdb-R214-k22-s12.hixf --query-file BC07_cdna2_filtered.fastq --output-file test.txt --error-rate 0.99 --threads 4
VERSION
Last update:
taxor version: 0.1.3
SeqAn version: 3.4.0-rc.1
The package lists are the same as above, but I still encounter the segmentation fault - please advise.
Thanks
@bioinfo17
Did you check whether the md5sum of your downloaded index file is correct? I also noticed that some packages have different versions than the ones I had in my conda environment. So you could also try changing to taxor version 0.1.2 or zstd version 1.5.5.
It looks like the downloaded database was the issue; it works well with databases built from scratch. Thanks!
Hello,
I built a database using
taxor build --input-file ../../taxor/taxor_input.tsv --input-sequence-dir . --output-filename refseq-PBFAV-VK --kmer-size 22 --syncmer-size 12 --threads 30
and I get this error:
64 1.00 4.58 1.00 4.58 296.9GiB
128 0.96 4.48 1.01 4.52 299.6GiB
256 1.20 3.14 0.92 2.89 273.5GiB
512 1.71 3.84 0.87 3.34 257.8GiB
Best t_max (regarding expected query runtime): 256
write Layout header
layout created
terminate called after throwing an instance of 'std::invalid_argument'
what(): The size of the shape cannot be greater than the window size.
Aborted (core dumped)
VERSION
Last update:
taxor-build version: 0.1.3
SeqAn version: 3.4.0-rc.1
@bioinfo17 can you share the commands you used to build the database from scratch? I downloaded my genomes as described in the documentation, but I still get the above error.
Dear Jens-Uwe Ulrich,
I noticed you already provide a compiled version, but when I use it, it says "./taxor: error while loading shared libraries: libbz2.so.1.0: cannot open shared object file: No such file or directory".
It looks like the software needs some libraries from the conda environment. I don't have root access on the cluster, and I cannot install those libraries myself.
Best wishes, Yue Ma
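For loader errors like the one quoted above, the unresolved libraries can be listed up front from `ldd` output. The helper below is a sketch with a hypothetical name; it only parses the "=> not found" lines that `ldd` prints and does not install or locate anything.

```shell
# Read `ldd` output on stdin and print the names of the shared
# libraries that the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Typical use would be `ldd ./taxor | missing_libs`. If a reported library exists inside a conda environment, adding that environment's lib directory to LD_LIBRARY_PATH is one way to make it visible to the loader without root access.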