koadman / proxigenomics

Hi-C analysis of heterogeneous samples
GNU General Public License v2.0

Ground truth is only a guess #9

Open cerebis opened 10 years ago

cerebis commented 10 years ago

Our current pipeline requires mapping contigs back to the reference genomes. For hard clustering we choose a winner based on alignment extent, which is ultimately a guess.
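To make the "winner by alignment extent" step concrete, here is a minimal sketch rather than the pipeline's actual code. The `aligned_bp` table, the contig and genome names, and the `hard_assign` helper are all hypothetical; the sketch only assumes that some total of aligned bases is available per contig and reference genome. It also reports how narrowly the winner beat the runner-up, which shows where the choice is closest to a coin toss.

```python
# Hypothetical input: aligned bases per contig, per reference genome.
aligned_bp = {
    "contig_1": {"genome_A": 9500, "genome_B": 9400},
    "contig_2": {"genome_A": 200, "genome_B": 12000},
}


def hard_assign(aligned_bp):
    """Pick, per contig, the genome with the greatest alignment extent.

    Returns {contig: (winning_genome, margin)}, where margin is the relative
    gap between the best and second-best extent (0.0 means a tie).
    """
    truth = {}
    for contig, hits in aligned_bp.items():
        # Rank genomes by aligned bases, largest first.
        ranked = sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
        winner, best = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        margin = (best - runner_up) / best if best else 0.0
        truth[contig] = (winner, margin)
    return truth


hard_truth = hard_assign(aligned_bp)
```

In this made-up data `contig_1` goes to `genome_A` with a margin of roughly 1%, so its ground-truth label is barely better than arbitrary, while `contig_2` is assigned to `genome_B` with a wide margin.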

For communities with low phylogenetic distance, a contig can align almost equally well to several reference genomes, so this guess is poor, and metrics which compare solutions against the ground truth are therefore unreliable.

Soft clustering, where a contig may be assigned to more than one genome, is the approach required.
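One way a soft ground truth could look, as a sketch under stated assumptions: each contig keeps every reference genome that covers at least some fraction of its length, instead of a single winner. The `soft_assign` helper, the `min_cov` threshold of 0.5, and the example data are hypothetical and not taken from the existing pipeline.

```python
# Hypothetical inputs: aligned bases per contig and genome, plus contig lengths.
aligned_bp = {
    "contig_1": {"genome_A": 9500, "genome_B": 9400},
    "contig_2": {"genome_A": 200, "genome_B": 12000},
}
contig_len = {"contig_1": 10000, "contig_2": 12500}


def soft_assign(aligned_bp, contig_len, min_cov=0.5):
    """Return, per contig, the set of genomes covering >= min_cov of its length."""
    truth = {}
    for contig, hits in aligned_bp.items():
        length = contig_len[contig]
        truth[contig] = {
            genome for genome, bp in hits.items() if bp / length >= min_cov
        }
    return truth


soft_truth = soft_assign(aligned_bp, contig_len)
```

Here `contig_1` is retained for both genomes, while `contig_2` belongs only to `genome_B`. Scoring a clustering against sets like these would need a metric that tolerates overlapping membership, which is a separate choice not made here.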