The README says "the proportion shared between each contig with a female reference is computed."
Maybe I am wrong about the rest of this, but it seems like that contradicts what the code does.
In both classify_fm_male_mode() and classify_fm_mode(), what is reported looks like (C - F) / C, where C is the number of kmers in the contig (duplicates counted each time they appear, all-N kmers not counted) and F is the number of those kmers that are also in the female reference.
So a proportion reported as 1.0 would mean none of the contig's kmers were found in the female reference. That would be evidence that the contig comes from something not found in the female, presumably male-specific sequence.
A proportion reported as 0.0 would mean all of the contig's kmers were found in the female reference, which is evidence that the contig is not male specific. If that reading is right, the reported value is the proportion not shared with the female reference, the opposite of what the README describes.
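
To make my reading concrete, here is a minimal Python sketch of the calculation as I understand it. This is not the project's code: the names kmers() and reported_proportion() are made up for illustration, and I am assuming "all-N kmer" means a kmer made up entirely of N characters.

```python
def kmers(seq, k):
    """Yield every k-mer in seq, duplicates included, skipping all-N k-mers.

    Assumption: an "all-N" k-mer is one made up entirely of 'N'.
    """
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) != {"N"}:
            yield kmer


def reported_proportion(contig, female_kmers, k):
    """Return (C - F) / C as described above, or None if C is zero.

    C = k-mers counted in the contig, F = those also in female_kmers.
    """
    total = 0   # C
    shared = 0  # F
    for kmer in kmers(contig, k):
        total += 1
        if kmer in female_kmers:
            shared += 1
    if total == 0:
        return None  # no countable k-mers, so the ratio is undefined
    return (total - shared) / total


# Toy female reference, purely illustrative.
female = set(kmers("ACGTACGT", 4))

print(reported_proportion("ACGTACGT", female, 4))  # every k-mer shared
print(reported_proportion("TTTTTTTT", female, 4))  # no k-mer shared
```

With this toy data the first call gives 0.0 (everything shared with female) and the second gives 1.0 (nothing shared), which is the reverse of what "proportion shared" would suggest.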