fbreitwieser / krakenuniq

🐙 KrakenUniq: Metagenomics classifier with unique k-mer counting for more specific results
GNU General Public License v3.0

krakenuniq-build: essential output files #96

Open nick-youngblut opened 2 years ago

nick-youngblut commented 2 years ago

krakenuniq-build creates many output files, and it is unclear 1) whether the set of output files can differ depending on the input (e.g., database_0 versus database_0 ... database_n), and 2) which of those files are actually needed for classification with krakenuniq. This information would be very helpful for integrating krakenuniq into pipeline software such as Nextflow or Snakemake.

nick-youngblut commented 2 years ago

In this regard, krakenuniq-build creates many files including a database.kdb.counts file, and the last line of a krakenuniq-build job states:

You can delete all files but database.{kdb,idx} and taxDB now, if you want

However, if one uses krakenuniq --report-file with a database directory only containing the database.{kdb,idx} files, then krakenuniq creates a new database.kdb.counts file:

Writing kmer counts to database.kdb.counts... [only once for this database, may take a while]

So, should the user actually keep database.{kdb,idx} and database.kdb.counts from a krakenuniq-build job?

alekseyzimin commented 2 years ago

Yes, please keep the counts file. I will update the message.
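
For pipeline integration, the conclusion of this thread could be turned into a pre-flight check before classification. The following is a minimal sketch, not part of KrakenUniq: the helper name `missing_db_files` is hypothetical, and the file list is an assumption taken only from the files named in this thread (database.kdb, database.idx, database.kdb.counts, taxDB), so it may not be exhaustive for every build.

```python
from pathlib import Path

# Assumption: file names collected from this thread, not from an official spec.
REQUIRED_DB_FILES = (
    "database.kdb",
    "database.idx",
    "database.kdb.counts",
    "taxDB",
)


def missing_db_files(db_dir, required=REQUIRED_DB_FILES):
    """Return the names in `required` that do not exist inside `db_dir`.

    Hypothetical helper for illustration; an empty list means every
    expected file is present.
    """
    db_path = Path(db_dir)
    return [name for name in required if not (db_path / name).exists()]


if __name__ == "__main__":
    import sys

    # Usage: python check_db.py /path/to/database_dir
    missing = missing_db_files(sys.argv[1] if len(sys.argv) > 1 else ".")
    if missing:
        sys.exit("Missing database files: " + ", ".join(missing))
    print("All expected database files are present.")
```

A workflow rule could call this check before running krakenuniq, so that a missing database.kdb.counts is reported up front instead of being regenerated inside the database directory during the first --report-file run.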