taylorreiter closed this issue 7 years ago
Hi,
when you run makeDB.sh -e, the resulting database file is called kaiju_db_nr_euk.fmi. When you then call kaiju, you need to give that file name to the -f option:
~/kaiju/bin/kaiju -t ~/kaijudb_e/kaijudb_e/nodes.dmp -f ~/kaijudb_e/kaijudb_e/kaiju_db_nr_euk.fmi ...
The reason is that this lets you keep different databases in the same directory without them overwriting each other. But I should probably remove the fixed kaiju_db.fmi from the help output of kaiju, because the file is only named like that for makeDB.sh -n or -p.
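To avoid hard-coding the index name in scripts, something like the following might help. It is only a sketch: find_fmi is a hypothetical helper (not part of Kaiju), and it assumes the database directory contains exactly one .fmi file.

```shell
# Hypothetical helper, not part of Kaiju: print the path of the single
# .fmi index file found in the given database directory.
find_fmi() {
    _fmi_dir=$1
    # Expand the glob into the function's positional parameters.
    set -- "$_fmi_dir"/*.fmi
    # If nothing matched, "$1" is empty or the unexpanded pattern.
    if [ ! -e "$1" ]; then
        echo "no .fmi file found in $_fmi_dir" >&2
        return 1
    fi
    # Refuse to guess when several databases share the directory.
    if [ "$#" -gt 1 ]; then
        echo "more than one .fmi file in $_fmi_dir, pass one explicitly" >&2
        return 2
    fi
    printf '%s\n' "$1"
}

# Example call (paths are placeholders, adjust to your setup):
#   db=~/kaijudb_e/kaijudb_e
#   ~/kaiju/bin/kaiju -t "$db/nodes.dmp" -f "$(find_fmi "$db")" -i reads.fq -o out.txt
```

If several databases live in the same directory, the helper stops instead of picking one, since the whole point of the distinct names is to keep them apart.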
How silly of me! I can't believe I missed that. Thank you for your prompt reply.
A change in the help file would be helpful!
I have attempted to build and use the makeDB.sh -e kaiju database. It has failed with the following output:

$ ls -l ~/kaijudb_e/kaijudb_e/
total 167947636
-rw-rw-r-- 1 ubuntu ubuntu 33420490846 Mar 17 02:25 kaiju_db_nr_euk.bwt
-rw-rw-r-- 1 ubuntu ubuntu 35484801934 Mar 17 01:00 kaiju_db_nr_euk.faa
-rw-rw-r-- 1 ubuntu ubuntu 48369697476 Mar 17 02:37 kaiju_db_nr_euk.fmi
-rw-rw-r-- 1 ubuntu ubuntu  9380484230 Mar 17 02:25 kaiju_db_nr_euk.sa
-rw-r--r-- 1 ubuntu ubuntu      837101 Mar 16 23:20 merged.dmp
-rw-r--r-- 1 ubuntu ubuntu   138034822 Mar 16 23:20 names.dmp
-rw-r--r-- 1 ubuntu ubuntu   107384758 Mar 16 23:20 nodes.dmp
-rw-rw-r-- 1 ubuntu ubuntu 27940725654 Mar 16 07:45 nr.gz
-rw-rw-r-- 1 ubuntu ubuntu 14285367259 Mar 16 23:57 prot.accession2taxid
-rw-rw-r-- 1 ubuntu ubuntu  2811685571 Mar 12 08:38 prot.accession2taxid.gz
-rw-rw-r-- 1 ubuntu ubuntu    38816191 Mar 16 23:20 taxdump.tar.gz

$ ~/kaiju/bin/kaiju -t ~/kaijudb_e/kaijudb_e/nodes.dmp -f ~/kaijudb_e/kaijudb_e/kaiju_db.fmi -i /mnt/work/hisat/unaligned/unaligned_SRR926282qc.fq -v -o kaiju_e_test
03:20:10 Reading database
Reading taxonomic tree from file /home/ubuntu/kaijudb_e/kaijudb_e/nodes.dmp
Reading index from file /home/ubuntu/kaijudb_e/kaijudb_e/kaiju_db.fmi
Could not open file /home/ubuntu/kaijudb_e/kaijudb_e/kaiju_db.fmi
Kaiju 1.5.0
Copyright 2015,2016 Peter Menzel, Anders Krogh
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html

Usage:
/home/ubuntu/kaiju/bin/kaiju -t nodes.dmp -f kaiju_db.fmi -i reads.fastq [-j reads2.fastq]

Mandatory arguments:
-t FILENAME   Name of nodes.dmp file
-f FILENAME   Name of database (.fmi) file
-i FILENAME   Name of input file containing reads in FASTA or FASTQ format

Optional arguments:
-j FILENAME   Name of second input file for paired-end reads
-o FILENAME   Name of output file. If not specified, output will be printed to STDOUT
-z INT        Number of parallel threads (default: 1)
-a STRING     Run mode, either "mem" or "greedy" (default: mem)
-e INT        Number of mismatches allowed in Greedy mode (default: 0)
-m INT        Minimum match length (default: 11)
-s INT        Minimum match score in Greedy mode (default: 65)
-x            Enable SEG low complexity filter
-p            Input sequences are protein sequences
-v            Enable verbose output
I had ample RAM to build the database, and the hard drive has an extra ~43 GB of space after the makeDB.sh -e command finishes. Additionally, the output of makeDB.sh -e informs me that the building process has finished.

I ran makeDB.sh -p and it worked fine.

Any guidance on this issue would be greatly appreciated.