tseemann / nullarbor

:floppy_disk: :page_with_curl: "Reads to report" for public health and clinical microbiology
GNU General Public License v2.0

conda package dependency issue - kraken & kraken2 #216

Closed pansapiens closed 4 years ago

pansapiens commented 5 years ago

I've hit an issue with the current bioconda package - I'm seeing:

/opt/conda/libexec/classify: invalid option -- 'd'
Usage: classify [options] <fasta/fastq file(s)>

in the logs when nullarbor attempts to run kraken.

I believe the issue could be that both the kraken and kraken2 packages are installed, and the kraken2 package might be clobbering the classify binary from kraken (v1.x).

A solution is to do:

conda install -y -c bioconda -c conda-forge nullarbor
conda install -y -c bioconda kraken --force-reinstall

Force re-installing kraken seems to give a classify binary it's happy with.
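A rough way to check the clobbering hypothesis is to ask conda's own metadata which installed packages claim the classify file. The sketch below assumes the usual conda layout, where each installed package has a JSON record under conda-meta/ listing its files; the function name owners_of is made up for illustration.

```shell
# Sketch: list the conda packages whose metadata claims a given file.
# Assumes <prefix>/conda-meta/*.json records the files each package owns.
# owners_of is a hypothetical helper name, not part of conda.
owners_of() {
    prefix="$1"   # environment root, e.g. "$CONDA_PREFIX" or /opt/conda
    relpath="$2"  # path relative to the prefix, e.g. libexec/classify
    # -l prints only matching file names; -F treats the pattern as a fixed string.
    grep -l -F "\"$relpath\"" "$prefix"/conda-meta/*.json 2>/dev/null |
        while read -r f; do basename "$f" .json; done
}

# Example call (prints nothing if no package record mentions the file):
#   owners_of "${CONDA_PREFIX:-/opt/conda}" libexec/classify
```

If both a kraken and a kraken2 record are printed, whichever package was installed last has most likely overwritten the file, which would fit the behaviour described above.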

tseemann commented 5 years ago

I do not control the bioconda package.

I will endeavour to add this to the manual to help others who encounter this problem!

FYI - I will be deprecating Kraken v1 support soon - Kraken v2 is ~30x more disk/RAM efficient.

Thanks Andrew!

tseemann commented 4 years ago

The conda recipes are now fixed. Run conda update --all in your env!

pomidorku commented 4 years ago

Sir, I installed kraken2 (conda install kraken2), but I do not see the database. I would like to add some taxa to the database. Should I download the kraken2 database separately and update it?

pansapiens commented 4 years ago

You can download one of the pre-generated 'Minikraken' databases for Kraken2 from here: https://ccb.jhu.edu/software/kraken2/index.shtml?t=downloads

Specify the path to the database files using the --db option, e.g. kraken2 --db /path/to/untarred/minikraken2_v2_8GB_201904_UPDATE

You can alternatively set the database path using the environment variable: export KRAKEN2_DEFAULT_DB=/path/to/untarred/minikraken2_v2_8GB_201904_UPDATE (this is the way I do it when running nullarbor). Note there must be no space around the = sign.
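Before pointing kraken2 or nullarbor at a directory, it can help to confirm that it really is an untarred Kraken2 database. A minimal sketch, assuming the standard Kraken2 layout of hash.k2d, opts.k2d and taxo.k2d in the database directory; check_kraken2_db is a hypothetical helper name.

```shell
# Sketch: verify that a directory looks like a Kraken2 database.
# check_kraken2_db is a made-up helper, not part of Kraken2.
check_kraken2_db() {
    db="$1"
    for f in hash.k2d opts.k2d taxo.k2d; do
        if [ ! -f "$db/$f" ]; then
            echo "missing $db/$f" >&2
            return 1
        fi
    done
    echo "ok: $db"
}

# Example usage, with the placeholder path from this thread:
#   export KRAKEN2_DEFAULT_DB=/path/to/untarred/minikraken2_v2_8GB_201904_UPDATE
#   check_kraken2_db "$KRAKEN2_DEFAULT_DB"
```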

To create your own database, consult the kraken2 manual for instructions: https://ccb.jhu.edu/software/kraken2/index.shtml?t=manual#custom-databases

tseemann commented 4 years ago

@pomidorku there are instructions on how to download and set up all 3 databases: https://github.com/tseemann/nullarbor#databases

pomidorku commented 4 years ago

Thank you, I appreciate your suggestions very much. I am new to bioinformatics. I can see now that bioconda makes it easier to install programs such as kraken, but I still have to build/download the databases. Best regards, I. Vilchez

pomidorku commented 4 years ago

Sir,

I am running the roary step-by-step tutorial. I am stuck in the last part of the exercise. After running the following:

"select '>'|| cod || '|' || locus_sequence.locus || '|' || pangenoma.gene || x'0a' || sequence from locus_sequence inner join pangenoma_locus on locus_sequence.locus = pangenoma_locus.locus inner join pangenoma on pangenoma_locus.gene = pangenoma.gene inner join genomas_locus on locus_sequence.locus = genomas_locus.locus where pangenoma.gene = 'tetC';"

I do not see any tetC gene output at the prompt. I checked for FASTA files named either "pangenoma.gene" or "tetC", and I do not see either of them.

After running "roary -a" I can see that the only missing tool is kraken (although I installed kraken2).

Sqlite does not produce any warning. Can you help me with this issue?

Regards,

tseemann commented 4 years ago

@pomidorku you do NOT need to build the databases; they are already built. You just have to download the .tar.gz file, extract it, and set an environment variable. I'm sorry we cannot make it any easier.
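Those steps could be sketched roughly as follows. This is only an illustration: the tarball and destination paths are placeholders, install_kraken2_db is a hypothetical helper name, and the download itself is left out because the exact URL should come from the database instructions linked in this thread.

```shell
# Sketch: unpack an already-downloaded database tarball and point Kraken2 at it.
# install_kraken2_db is a made-up helper; both arguments are placeholders.
install_kraken2_db() {
    tarball="$1"   # the .tar.gz file you downloaded
    dest="$2"      # directory to unpack into
    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    # Use the directory that actually holds hash.k2d, wherever the archive put it.
    hash=$(find "$dest" -name hash.k2d | head -n 1)
    if [ -z "$hash" ]; then
        echo "no hash.k2d found under $dest" >&2
        return 1
    fi
    KRAKEN2_DEFAULT_DB=$(dirname "$hash")
    export KRAKEN2_DEFAULT_DB
    echo "$KRAKEN2_DEFAULT_DB"
}

# Example usage (placeholder paths):
#   install_kraken2_db /path/to/downloaded/minikraken2.tar.gz /path/to/untarred
```

Call the function directly rather than inside $(...), otherwise the exported variable is set only in a subshell. To keep the setting across sessions, the export line would also need to go into your shell startup file.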

tseemann commented 4 years ago

@pomidorku that is a question for Roary; this GitHub repository is for nullarbor. You can ask at https://github.com/sanger-pathogens/Roary/issues

pomidorku commented 4 years ago

Thank you for your advice.

Regarding the database for Kraken, perhaps build was not the right word. Thank you for pointing that out. I will run kraken2 on a linux laptop with 16GB ram, so I will have to get the minikraken database.

Thank you.

PS. I will post my question about roary in the appropriate github.

tseemann commented 4 years ago

Thank you @pomidorku. Keep asking questions and you will succeed in bioinformatics!