charlesreid1 closed this issue 6 years ago
Hi, yes, you need to upgrade the Kaiju version. Kaiju can read gzip files directly only from version 1.6.0 onward. Maybe that caused the segfault.
Or it could be too little RAM; I would recommend 60 GB RAM when using the NR+euk kaiju_db_nr_euk.fmi database.
Thanks for the input! It sounds like both things were issues, as I only had 32 GB of RAM on the machine I was using.
The problem was resolved by switching the docker run command to use Kaiju 1.6.1 and running on a node with 64 GB RAM (instead of 32 GB):
docker run -v ${PWD}:/data quay.io/biocontainers/kaiju:1.6.1--pl5.22.0_0 \
kaiju \
-x \
-v \
-t /data/kaijudb/nodes.dmp \
-f /data/kaijudb/kaiju_db_nr_euk.fmi \
-i /data/${base}_1.trim2.fq.gz \
-j /data/${base}_2.trim2.fq.gz \
-o /data/${base}.kaiju_output.trim2.out \
-z 4
Thanks again @pmenzel!
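For anyone hitting the same thing, here is a minimal pre-flight sketch for the memory recommendation above. It assumes a Linux host where /proc/meminfo is available. has_enough_ram is a hypothetical helper written for this thread, not part of Kaiju, and the 60 GB threshold is simply the figure recommended above for the NR+euk database.

```shell
# Hypothetical helper (not part of Kaiju): succeeds when MemTotal in a
# meminfo-style file is at least the requested number of GB.
# Usage: has_enough_ram REQUIRED_GB [MEMINFO_FILE]
has_enough_ram() {
  awk -v need="$1" '
    /^MemTotal:/ { gb = $2 / 1024 / 1024 }   # MemTotal is reported in kB
    END { exit (gb >= need) ? 0 : 1 }
  ' "${2:-/proc/meminfo}"
}

# Warn before launching Kaiju against the NR+euk database.
if ! has_enough_ram 60; then
  echo "warning: less than 60 GB RAM; the NR+euk database may not fit" >&2
fi
```

The check only warns rather than aborting, since the 60 GB figure is a recommendation and not a hard limit stated anywhere in this thread.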
Hi, good that this problem was solved. Could I bother you with another macOS problem? ;)
There is a problem in makeDB.sh, where I use the option -i for xargs, which does not work on macOS. It is also deprecated on Linux, so I want to replace it. It affects two lines in makeDB.sh.
I would like to replace the xargs part of these lines with:
xargs -n 1 -P $parallelConversions -IXX gbk2faa.pl XX XX.faa
It works on Linux, but I need to test it on macOS too. Could you please quickly try it?
Just run the progenomes option:
makeDB.sh -p -v
with the modified lines 238 and 256 in makeDB.sh.
That would be great, Peter
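As a quick way to try the proposed xargs form without downloading anything, here is a minimal sketch. It assumes echo as a stand-in for gbk2faa.pl, three made-up input names, and a fixed value of 2 for parallelConversions. The -I option is specified by POSIX. The sketch omits -n 1 because -I already runs one command per input line.

```shell
#!/bin/sh
# Number of parallel jobs; stands in for $parallelConversions in makeDB.sh.
parallelConversions=2

# Feed three hypothetical file names to xargs. -I XX substitutes each input
# line for every XX in the command. echo stands in for gbk2faa.pl, so this
# only prints the commands that would be run. Output is sorted because
# parallel jobs may finish in any order.
result=$(printf '%s\n' a.gbk b.gbk c.gbk |
  xargs -P "$parallelConversions" -I XX echo gbk2faa.pl XX XX.faa | sort)

printf '%s\n' "$result"
```

If each printed line has the form `gbk2faa.pl NAME NAME.faa`, the substitution is working on that platform.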
Yes, happy to help. Continued in issue 61's thread.
I am currently working through the taxonomic workflow as part of #45 (see this fork of dahak for a better-formatted version of the taxonomic classification workflow, and the scripts folder of the dahak-yeti repository for scripts that run dahak on AWS nodes), and have made it nearly all the way through. However, I am experiencing an issue with the Kaiju container: it is supposed to output a file used by the next step in the workflow, but no file is being output.
Command being run:
where ${base} is something like SR606249.
Expected behavior
The -o flag indicates this should output a file:
Actual behavior
No file is output by the process. I see the following messages printed by the container while it is running:
The input files nodes.dmp and kaiju_db_nr_euk.fmi are both present in the container. There is no output file created.
To debug, I changed the docker run line above to:
This gives an interactive prompt inside the container. From there I verified the input files were mounted correctly, and I ran the Kaiju command:
When I did this, I saw a segmentation fault.
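A crash like this can also be caught without an interactive session. The following is a minimal sketch, assuming a POSIX shell, where an exit status above 128 conventionally means the command was killed by a signal (status minus 128 gives the signal number, 11 for SIGSEGV on Linux). run_and_check is a hypothetical wrapper written for illustration; it is not part of Kaiju or Docker.

```shell
# Hypothetical wrapper: run a command, then report whether it was killed by
# a signal or finished without writing the expected output file.
# Usage: run_and_check OUTPUT_FILE COMMAND [ARGS...]
run_and_check() {
  outfile=$1
  shift
  "$@"
  status=$?
  if [ "$status" -gt 128 ]; then
    # Conventionally 128 + signal number, e.g. 139 for SIGSEGV (11).
    echo "killed by signal $((status - 128))"
  elif [ ! -s "$outfile" ]; then
    echo "exit $status but no output at $outfile"
  else
    echo "ok"
  fi
}
```

In this workflow the wrapped command would be the kaiju invocation, with the path given to -o as the first argument. A program can also choose to exit with a status above 128 on its own, so the message is a hint rather than proof.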
Steps to reproduce the behavior
This is difficult to reproduce, because it requires generating and downloading very large files.
See the workflow steps.
Possible Resolution
I suspect the problem may be with the Kaiju version being used: this points to an old quay.io biocontainer version of Kaiju, quay.io/biocontainers/kaiju:1.5.0--pl5.22.0_0. The most recent version of Kaiju (1.6.1) was recently added to biocontainers via bioconda/bioconda-recipes#7213, so we should probably leverage that somehow.