marbl / harvest

Low cluster coverage #37

Open karchern opened 6 years ago

karchern commented 6 years ago

I'm trying to use parsnp to infer core genes (and, importantly, to get rid of signal introduced by recombination in core genes) on around 1000 bins retrieved from metagenomes. The problem is that parsnp throws an error reporting that cluster coverage is too low (below 1%).

I have run roary on the same set of genomes without any problems. Pairwise core gene identities are above 97%. When I reduce the input to a smaller subset (e.g. 100 randomly selected genomes), parsnp works fine.

I am running parsnp like so:

./parsnp -r /path/to/ref_genome.fa -d genomes_folder -p 30 -x -c

Any suggestions would be very much appreciated.

A-BN commented 5 years ago

The culprit is probably your use of the -c option, forcing Parsnp to include all genomes found in -d.
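
As a rough sketch of that suggestion, the same command can be tried with -c dropped, so that Parsnp is free to exclude genomes it considers too divergent from the reference (my understanding is that it does this with a MUMi distance filter, but treat that as an assumption). The helper name build_parsnp_cmd is hypothetical and only prints the command as a dry run; the paths and flags are the ones from the original post.

```shell
# Hypothetical dry-run helper: prints the Parsnp command instead of running it.
# $1 = "yes" keeps -c (force-include every genome in -d); anything else drops it.
build_parsnp_cmd() {
  cmd="./parsnp -r /path/to/ref_genome.fa -d genomes_folder -p 30 -x"
  if [ "$1" = "yes" ]; then
    cmd="$cmd -c"
  fi
  printf '%s\n' "$cmd"
}

# Print the variant without -c, which is the one worth trying first.
build_parsnp_cmd no
```

If the run without -c succeeds, checking which genomes were excluded should show which bins were dragging the cluster coverage down.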