This PR starts to address #5 for processing QCed reads and assemblies with sourmash to ultimately:
1) Compare reads-v-reads, assembs-v-assembs, and reads-v-assembs
2) Get taxonomy breakdown of reads and assembs
This PR sets up the infrastructure with a sourmash_profiling subworkflow that can take in either reads or assemblies and runs them through sketch and compare. Later PRs will add support for gather, taxonomy, and comparing reads v assembs. I wanted to have this reviewed first to make sure the logic of how I've set this up is sound.
The sourmash modules are in modules/local/nf-core-modified since I modified the version number of sourmash and added a seqtype for distinguishing between reads and assembs input, so that it carries through to the output file names.
This also includes some fixes in the nanopore workflow to make the channel names clearer, which came up in #45.
I would also appreciate feedback on kmer size selection. For now I have it set to k21, but I would like to parameterize this and have a sensible default @taylorreiter
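
For context on the kmer size question, here is a minimal Python sketch of why the choice matters when comparing samples. It is not sourmash's implementation: it uses exact k-mer sets and plain Jaccard similarity, with no FracMinHash subsampling and no canonical (reverse-complement) k-mers. The function names, the toy sequences, and the candidate sizes 21, 31, and 51 are all assumptions for illustration only.

```python
import random


def kmers(seq, k):
    """Return the set of all k-length substrings of seq (exact, no subsampling)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_by_k(seq_a, seq_b, ksizes=(21, 31, 51)):
    """Compare two sequences at several k-mer sizes (hypothetical candidates)."""
    return {k: jaccard(kmers(seq_a, k), kmers(seq_b, k)) for k in ksizes}


def demo_sequences(length=500, seed=0):
    """Build a random toy sequence and a copy with one substituted base."""
    rng = random.Random(seed)
    seq_a = "".join(rng.choice("ACGT") for _ in range(length))
    pos = length // 2
    # Pick a replacement base that differs from the original one.
    new_base = next(b for b in "ACGT" if b != seq_a[pos])
    seq_b = seq_a[:pos] + new_base + seq_a[pos + 1:]
    return seq_a, seq_b


seq_a, seq_b = demo_sequences()
for k, sim in similarity_by_k(seq_a, seq_b).items():
    print(f"k={k}: jaccard={sim:.3f}")
```

In this toy case a single substitution removes up to k shared k-mers, so the similarity drops as k grows. Larger k is more sensitive to small differences, smaller k is more tolerant, which is the trade-off a parameterized default would need to balance.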