wwood / galah

More scalable dereplication for metagenome assembled genomes
GNU General Public License v3.0

If input genomes are short, they don't cluster #18

Open wwood opened 2 years ago

wwood commented 2 years ago

The issue is that --fragment-length is 3k by default, so genomes that are too short for it (e.g. single-contig phage genomes) silently fail to map and therefore never cluster. One possible fix: scan the input genomes to determine the shortest genome / contig, and automatically lower the fragment length to 500bp or something similar when the input is short.
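
A rough sketch of the scan suggested above, in Python for illustration only. This is not galah's code: the function names are hypothetical, the inputs are assumed to be uncompressed FASTA files, and treating "short" as "any contig shorter than the default fragment length" is an assumption rather than something settled in this issue.

```python
# Hypothetical sketch: pick a fragment length based on the shortest input contig.
# The 3000 and 500 values come from the issue text; everything else is assumed.

DEFAULT_FRAGMENT_LENGTH = 3000  # default mentioned in the issue
SHORT_FRAGMENT_LENGTH = 500     # fallback floated in the issue


def contig_lengths(fasta_path):
    """Yield the length of each contig in an uncompressed FASTA file."""
    length = None  # None until the first header line is seen
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line)
    if length is not None:
        yield length


def shortest_contig(fasta_paths):
    """Return the shortest contig length over all genomes, or None if there are none."""
    return min(
        (n for path in fasta_paths for n in contig_lengths(path)),
        default=None,
    )


def choose_fragment_length(fasta_paths):
    """Fall back to the short fragment length if any contig is below the default."""
    shortest = shortest_contig(fasta_paths)
    if shortest is not None and shortest < DEFAULT_FRAGMENT_LENGTH:
        return SHORT_FRAGMENT_LENGTH
    return DEFAULT_FRAGMENT_LENGTH
```

The chosen value could then be passed on as --fragment-length. A warning when the fallback kicks in would probably be worthwhile too, so the change is not silent either.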