I was trying to be clever and created a directory structure like:
krakendb1
|--library
Inside krakendb1/library I symbolically linked to the downloaded and processed libraries (bacteria, fungi, human, etc.):
krakendb1
|--library
   |--bacteria -> /somepath/bacteria
   |--fungi -> /somepath/fungi
This is nice because you can build several different databases (krakendb1, krakendb2, ...) from the same base RefSeq download; I'm trying different combinations of additional genomes. Unfortunately, the build script can't see the symbolically linked libraries. I think this is because the find command called by build_kraken2_db doesn't traverse links: find library/ '(' -name '*.fna' -o -name '*.faa' ')' -print0
It works OK if you symbolically link to the entire library directory, but that causes problems if you want to add genomes using kraken2-build --add-to-library, since the added files would then go into the shared directory that every database links to.
Adding -L to the find command could fix the issue, but maybe it breaks things elsewhere?
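To illustrate the difference, here is a minimal sketch that reproduces the layout in a scratch directory. The temporary paths and the file name library.fna are made up for the demo, and it only exercises find itself, not the kraken2 build script.

```shell
#!/bin/sh
# Hypothetical scratch layout mirroring the post; all paths are invented for this demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/somepath/bacteria" "$tmp/krakendb1/library"
touch "$tmp/somepath/bacteria/library.fna"
ln -s "$tmp/somepath/bacteria" "$tmp/krakendb1/library/bacteria"

cd "$tmp/krakendb1" || exit 1

# Default behaviour: find does not descend into the symlinked directory,
# so the .fna file behind the link is never seen.
without_L=$(find library/ '(' -name '*.fna' -o -name '*.faa' ')' | wc -l | tr -d ' ')

# With -L, find follows symbolic links and reaches the file.
with_L=$(find -L library/ '(' -name '*.fna' -o -name '*.faa' ')' | wc -l | tr -d ' ')

echo "without -L: $without_L file(s); with -L: $with_L file(s)"
```

One thing to watch with -L is that find will also follow any other links under library/, so a link that points back up the tree creates a loop (GNU find detects this and prints a warning rather than recursing forever).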