Closed pansapiens closed 4 years ago
I do not control the bioconda package.
I will endeavour to add this to the manual to help others who encounter this problem!
FYI - I will be deprecating Kraken v1 support soon - Kraken v2 is ~30x more disk/RAM efficient.
Thanks Andrew!
The conda recipes are now fixed. Run `conda update --all` in your env!
Sir, I installed kraken2 (conda install kraken2), but I do not see the database. I would like to add some taxa to the database. Should I download the kraken2 database separately and update it?
You can download one of the pre-generated 'Minikraken' databases for Kraken2 from here: https://ccb.jhu.edu/software/kraken2/index.shtml?t=downloads
Specify the path to the database files using the `--db` option, e.g. `kraken2 --db /path/to/untarred/minikraken2_v2_8GB_201904_UPDATE`.
You can alternatively set the database path using an environment variable: `export KRAKEN2_DEFAULT_DB=/path/to/untarred/minikraken2_v2_8GB_201904_UPDATE` (this is the way I do it when running `nullarbor`). Note that there must be no space before or after the `=`.
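A minimal sketch of the environment-variable approach, with a quick sanity check that the directory looks like a Kraken2 database. The path is the placeholder used above, and the check relies on a Kraken2 database directory containing `hash.k2d`, `opts.k2d` and `taxo.k2d`; the `missing` counter is just an illustrative name.

```shell
# Placeholder path from the comment above; replace with your untarred directory.
export KRAKEN2_DEFAULT_DB="/path/to/untarred/minikraken2_v2_8GB_201904_UPDATE"

# Count how many of the three core database files are absent.
missing=0
for f in hash.k2d opts.k2d taxo.k2d; do
    if [ ! -f "$KRAKEN2_DEFAULT_DB/$f" ]; then
        echo "missing: $KRAKEN2_DEFAULT_DB/$f" >&2
        missing=$((missing + 1))
    fi
done
echo "KRAKEN2_DEFAULT_DB=$KRAKEN2_DEFAULT_DB ($missing of 3 expected files missing)"
```

If all three files are found, `kraken2` should pick up the database without needing `--db` on the command line.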
To create your own database, consult the kraken2 manual for instructions: https://ccb.jhu.edu/software/kraken2/index.shtml?t=manual#custom-databases
@pomidorku there are instructions on how to download and setup all 3 databases: https://github.com/tseemann/nullarbor#databases
Thank you, I appreciate your suggestions very much. I am new to bioinformatics. I can see now that bioconda makes it easier to install programs such as kraken, but I still have to build/download the databases. Best regards, I. Vilchez
Sir,
I am running the roary step-by-step tutorial. I am stuck in the last part of the exercise. After running the following:
"select '>'|| cod || '|' || locus_sequence.locus || '|' || pangenoma.gene || x'0a' || sequence from locus_sequence inner join pangenoma_locus on locus_sequence.locus = pangenoma_locus.locus inner join pangenoma on pangenoma_locus.gene = pangenoma.gene inner join genomas_locus on locus_sequence.locus = genomas_locus.locus where pangenoma.gene = 'tetC';"
I do not see any tetC gene output at the prompt. I checked for FASTA files named either "pangenoma.gene" or "tetC", and I do not see any of them.
After running "roary -a" I can see that the only missing tool is kraken (although I installed kraken2).
Sqlite does not produce any warning. Can you help me with this issue?
Regards,
@pomidorku you do NOT need to build the databases. they are already built. you just have to download the .tar.gz file, unzip, and set an environment variable. I'm sorry we can not make it any easier.
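The download, unzip and set-variable steps can be sketched as a small shell function. This is only an illustration: the function name `setup_kraken2_db` is hypothetical, the tarball is assumed to be already downloaded, and the archive is assumed to unpack into a single top-level directory.

```shell
# Extract a downloaded database tarball and point Kraken2 at it.
# Usage: setup_kraken2_db TARBALL DEST_DIR   (hypothetical helper)
setup_kraken2_db() {
    tarball="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip, -f archive file, -C target directory
    tar -xzf "$tarball" -C "$dest" || return 1
    # Assumption: the archive contains exactly one top-level directory.
    db_dir=$(find "$dest" -mindepth 1 -maxdepth 1 -type d | head -n 1)
    [ -n "$db_dir" ] || return 1
    export KRAKEN2_DEFAULT_DB="$db_dir"
    echo "KRAKEN2_DEFAULT_DB=$KRAKEN2_DEFAULT_DB"
}

# Example call (both paths are placeholders):
# setup_kraken2_db /path/to/downloaded/database.tar.gz "$HOME/kraken2_db"
```

To make the setting persist across sessions, the resulting `export` line can be added to your shell startup file.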
@pomidorku that is a question for `roary`; this github is for `nullarbor`. You can ask at https://github.com/sanger-pathogens/Roary/issues
Thank you for your advice.
Regarding the database for Kraken, perhaps build was not the right word. Thank you for pointing that out. I will run kraken2 on a linux laptop with 16GB ram, so I will have to get the minikraken database.
Thank you.
PS. I will post my question about roary in the appropriate github.
Thank you @pomidorku and keep asking questions and you will succeed in bioinformatics!
I've hit an issue with the current bioconda package - I'm seeing:
in the logs when nullarbor attempts to run `kraken`.
I believe the issue could be that both the `kraken` and `kraken2` packages are installed, and the `kraken2` package might be clobbering the `classify` binary from `kraken` (v1.x).
A solution is to do:
Force re-installing `kraken` seems to give a `classify` binary it's happy with.
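One way to check for this clash and prepare a forced reinstall is sketched below. It is not necessarily the command used in the post above: it assumes conda's `--force-reinstall` option and the package name `kraken`, and it only prints the reinstall command rather than running it. The variable name `reinstall_cmd` is illustrative.

```shell
# List kraken/kraken2 packages in the active conda environment, if conda is available.
if command -v conda >/dev/null 2>&1; then
    conda list | grep -E '^kraken2?[[:space:]]' || echo "no kraken packages found"
else
    echo "conda not on PATH; skipping package listing"
fi

# Show which classify binary comes first on PATH.
command -v classify || echo "classify not found on PATH"

# Assumed reinstall command; review it before running it yourself.
reinstall_cmd="conda install --force-reinstall kraken"
echo "To force a reinstall, run: $reinstall_cmd"
```

If both packages show up in the listing, running the printed command inside the same environment should restore the `classify` binary shipped with `kraken` v1.x.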