NBISweden / Earth-Biogenome-Project-pilot

Assembly and Annotation workflows for analysing data in the Earth Biogenome Project pilot project.
https://www.earthbiogenome.org/
GNU General Public License v3.0

Is it better to k-mer count the combined FASTQ or the parts? #84

Open mahesh-panchal opened 7 months ago

mahesh-panchal commented 7 months ago

Currently the workflow does k-mer counting on the individual FASTQ files from each BAM, but then goes on to combine the FASTQ files. Should k-mer counting be performed on the combined FASTQ or on the parts?
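
As a point of reference, here is a minimal sketch of why the two approaches should give the same counts, assuming an exact canonical k-mer counter. This is a toy pure-Python counter, not the tool the workflow actually runs; `canonical`, `count_kmers`, and the example reads are all hypothetical. Because k-mers never span read boundaries, summing the per-part counts should equal counting on the combined reads, so the choice is mainly about runtime and robustness rather than the resulting counts.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers in an iterable of read sequences (toy implementation)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


# Hypothetical reads standing in for the FASTQ extracted from two BAM files.
part_a = ["ACGTACGT", "TTGCA"]
part_b = ["ACGTTGCA"]
k = 4

# Option 1: count each part separately, then merge the count tables.
merged = count_kmers(part_a, k) + count_kmers(part_b, k)

# Option 2: combine the reads first, then count once.
combined = count_kmers(part_a + part_b, k)

assert merged == combined
```

Real k-mer counters may apply count thresholds or use approximate counting, in which case per-part tables might not merge exactly; that would need checking against the actual tool.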

MartinPippel commented 6 months ago

I can check this out. The last time it was super buggy, and the step of producing symmetric k-mers for the smudge plot always failed. However, if it's working, it should be much faster.