healthyPlant / PhytoPipe


Kaiju segmentation fault in Docker image #2

Open xhu556 opened 1 year ago

xhu556 commented 1 year ago

Kaiju failed when I tested Dataset_8 using the PhytoPipe Docker image. Here is the error message:

/usr/bin/bash: line 5:    10 Segmentation fault      (core dumped) kaiju -z 16 -t /opt/phytopipe/db/ncbi/taxonomy/nodes.dmp -f /opt/phytopipe/db/kaiju_db/kaiju_db_nr_euk.fmi -i /data/trimmed/Dataset_8_R1.trimmed.fastq.gz -j /data/trimmed/Dataset_8_R2.trimmed.fastq.gz -o /data/classification/Dataset_8.kaiju.out &>> /data/logs/kaiju/Dataset_8.log

Can you fix it?

xhu556 commented 1 year ago

First, let's test whether Kaiju works on its own in the Docker container:

workDir=/disks/data1/pp_output
#kaiju database file
kaijuDb_dir=/disks/data01/PhytoPipe_database/kaiju_db #/kaiju_db_nr_euk.fmi
#NCBI taxonomy database
taxDb_dir=/disks/data01/PhytoPipe_database/NCBI_tax

docker run -it --rm \
           -v $workDir:/data \
           -v $kaijuDb_dir:/opt/phytopipe/db/kaiju_db \
           -v $taxDb_dir:/opt/phytopipe/db/ncbi/taxonomy \
           healthyplant/phytopipe \
           kaiju -z 16 -t /opt/phytopipe/db/ncbi/taxonomy/nodes.dmp \
           -f /opt/phytopipe/db/kaiju_db/kaiju_db_nr_euk.fmi \
           -i /data/trimmed/Dataset_8_R1.trimmed.fastq.gz -j /data/trimmed/Dataset_8_R2.trimmed.fastq.gz \
           -o /data/classification/Dataset_8.kaiju.out

If you see "Segmentation fault" again, the most likely cause is that the machine does not have enough memory for the Kaiju database. Let's test with a smaller Kaiju database. You can download the viruses database from the Kaiju web server:
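To check the memory hypothesis before downloading anything, you can compare the size of the index file with the memory available on the host. This is only a rough sketch: it assumes Kaiju needs at least as much RAM as the .fmi file is large, that the host is Linux with GNU `stat` and `/proc/meminfo`, and the helper name `fits_in_memory` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: compare the size of a Kaiju index with the memory currently available.
# Assumption: Kaiju needs at least as much RAM as the .fmi file is large.

# Succeed (return 0) when the index size is not larger than the available memory.
fits_in_memory() {
    local index_bytes=$1 avail_bytes=$2
    [ "$index_bytes" -le "$avail_bytes" ]
}

# Path taken from the commands in this thread; adjust it to your setup.
index=/disks/data01/PhytoPipe_database/kaiju_db/kaiju_db_nr_euk.fmi

if [ -f "$index" ] && [ -r /proc/meminfo ]; then
    # -L follows symlinks so the size of the real file is reported
    index_bytes=$(stat -L -c %s "$index")
    # MemAvailable is reported in kB
    avail_bytes=$(( $(awk '/^MemAvailable:/ {print $2}' /proc/meminfo) * 1024 ))
    if fits_in_memory "$index_bytes" "$avail_bytes"; then
        echo "The index should fit into the available memory"
    else
        echo "The index is larger than the available memory"
    fi
fi
```

If the index is larger than the available memory, the smaller viruses database is the more promising test.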

mkdir /disks/data01/PhytoPipe_database/kaiju_db_viruses
cd /disks/data01/PhytoPipe_database/kaiju_db_viruses
wget https://kaiju.binf.ku.dk/database/kaiju_db_viruses_2022-03-29.tgz
tar -xvzf kaiju_db_viruses_2022-03-29.tgz
ln -s kaiju_db_viruses.fmi kaiju_db_nr_euk.fmi
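The test command further down expects both `kaiju_db_nr_euk.fmi` and `nodes.dmp` inside the mounted database directory, so it may be worth confirming they are there after extraction. A minimal sketch, where `check_kaiju_db` is a hypothetical helper written for this thread and not part of PhytoPipe or Kaiju:

```shell
#!/usr/bin/env bash
# Sketch: verify that a directory holds the files the Kaiju test command expects.

check_kaiju_db() {
    local dir=$1 missing=0 f
    for f in kaiju_db_nr_euk.fmi nodes.dmp; do
        # -e follows symlinks, so a dangling link also counts as missing
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (path taken from this thread):
#   check_kaiju_db /disks/data01/PhytoPipe_database/kaiju_db_viruses
```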

Now we can test Kaiju with the new, smaller database:

workDir=/disks/data1/pp_output
kaijuDb_dir=/disks/data01/PhytoPipe_database/kaiju_db_viruses
docker run -it --rm \
           -v $workDir:/data \
           -v $kaijuDb_dir:/opt/phytopipe/db/kaiju_db \
           healthyplant/phytopipe \
           kaiju -z 16 -t /opt/phytopipe/db/kaiju_db/nodes.dmp \
           -f /opt/phytopipe/db/kaiju_db/kaiju_db_nr_euk.fmi \
           -i /data/trimmed/Dataset_8_R1.trimmed.fastq.gz -j /data/trimmed/Dataset_8_R2.trimmed.fastq.gz \
           -o /data/classification/Dataset_8.kaiju.out

If the above command works, please run PhytoPipe again with the new Kaiju database.
If it doesn't work, the problem is probably in the Kaiju program itself rather than in the database size.
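To tell the outcomes apart, you can look at the exit status of the `docker run` command, which normally passes through the exit status of the command run in the container. By shell convention, a status above 128 means the process was terminated by signal (status minus 128), so 139 corresponds to SIGSEGV and 137 to SIGKILL. A small sketch, with `describe_exit` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: translate the exit status of a test run into a short description.

describe_exit() {
    case "$1" in
        0)   echo "ok" ;;
        137) echo "killed (SIGKILL, possibly the out-of-memory killer)" ;;
        139) echo "segmentation fault (SIGSEGV)" ;;
        *)   echo "failed with exit status $1" ;;
    esac
}

# Usage, right after one of the docker run commands in this thread:
#   docker run ... ; describe_exit $?
```

A status of 137 would point toward a memory limit, while 139 with the small database would suggest a problem in Kaiju itself.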