hsinnan75 / StrainPro

MIT License

option to specify path to tax dump files #2

Closed nick-youngblut closed 4 years ago

nick-youngblut commented 4 years ago

As far as I can tell, the path to the tax dump files downloaded by download_taxonomy.sh is hardcoded into StrainPro-build. It would be helpful if the user could specify the location of the tax dump files, so that custom taxonomies can be used (e.g., GTDB tax dump files instead of NCBI) or so that NCBI tax dump files the user has already downloaded to another location (e.g., a "databases" directory) can be reused.

hsinnan75 commented 4 years ago

Thanks for the suggestions. I'll make it happen.

hsinnan75 commented 4 years ago

@nick-youngblut I was wondering whether the GTDB tax dump files are organized in the same format as the NCBI files. If so, you can specify their paths in a text file that looks like the one below and run StrainPro with the "-dump" option:

NodesDumpFilePath /path/nodes.dmp
MergedDumpFilePath /path/merged.dmp
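
A minimal sketch of generating that path file from Python, in case it is created as part of a pipeline. Only the two keys (NodesDumpFilePath, MergedDumpFilePath) come from the comment above; the function name, the output file name, the single-space separator, and the example paths are assumptions made for illustration.

```python
from pathlib import Path


def write_dump_config(config_path, nodes_dmp, merged_dmp):
    """Write a two-line path file using the keys shown in the comment above.

    Assumption: each line is "<Key> <path>" separated by a single space.
    """
    lines = [
        f"NodesDumpFilePath {nodes_dmp}",
        f"MergedDumpFilePath {merged_dmp}",
    ]
    Path(config_path).write_text("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Hypothetical locations; replace with the real taxdump paths.
    write_dump_config(
        "taxdump_paths.txt",
        "/path/nodes.dmp",
        "/path/merged.dmp",
    )
```

The resulting file would then be the one referred to when running with "-dump".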

nick-youngblut commented 4 years ago

Yeah, the GTDB tax dump files are in the same format as the standard NCBI taxdump files. Thanks for quickly creating the -dump option! I'll give it a try.

btw, my simple code for creating the taxdump files from the GTDB taxonomy can be found at https://github.com/nick-youngblut/gtdb_to_taxdump
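
For anyone who wants to sanity-check that a custom nodes.dmp follows the NCBI layout before pointing StrainPro at it, here is a rough sketch. It relies on the standard NCBI taxdump convention (fields separated by tab, pipe, tab, with each row ending in tab, pipe) and on the first three columns being tax ID, parent tax ID, and rank. The function names are hypothetical, and this is not how StrainPro itself reads the files.

```python
def parse_nodes_line(line):
    """Split one nodes.dmp row into a list of stripped field strings."""
    row = line.rstrip("\n")
    # Each row ends with a trailing "\t|" terminator; drop it before splitting.
    if row.endswith("\t|"):
        row = row[:-2]
    return [field.strip() for field in row.split("\t|\t")]


def read_parents(lines):
    """Map each tax ID to (parent tax ID, rank), keeping IDs as strings."""
    parents = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = parse_nodes_line(line)
        if len(fields) < 3:
            raise ValueError(f"Unexpected nodes.dmp row: {line!r}")
        tax_id, parent_id, rank = fields[0], fields[1], fields[2]
        parents[tax_id] = (parent_id, rank)
    return parents


if __name__ == "__main__":
    # Two illustrative rows in the NCBI layout (not real file contents).
    sample = [
        "1\t|\t1\t|\tno rank\t|\n",
        "2\t|\t1\t|\tsuperkingdom\t|\n",
    ]
    print(read_parents(sample))
```

If every row of a GTDB-derived nodes.dmp parses this way, it is at least structurally compatible with the NCBI format.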