How to classify hifi reads into haplotype genomes? minimap2 did not work well

DayTimeMouse commented 5 months ago

Hi,

I used HiFi + ONT + Hi-C reads to assemble haplotype genomes, and now I want to classify the HiFi reads by haplotype, i.e. assign each read to the haplotype genome it belongs to.

I tried minimap2, but because the two haplotype genomes are so similar, it did not work well.

I concatenated hap1 and hap2 into hap1_hap2.fa and ran "minimap2 -c --secondary=no hap1_hap2.fa hifi.fastq.gz > aln.paf" (minimap2 2.27-r1193). I then used the read IDs in the PAF file to calculate the precision, which was only ~0.56. The HiFi reads were simulated, so I know which haplotype each read ID really belongs to.
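For reference, the precision calculation described above could be sketched roughly as follows. This is only an illustration, not the exact script used: it assumes the standard PAF column layout (query name in column 1, target name in column 6, number of matching bases in column 10), and it assumes, hypothetically, that contig names in hap1_hap2.fa start with "hap1_" or "hap2_" and that the true haplotype of each simulated read is available as a dictionary. The function names are made up for this sketch.

```python
def best_hit_per_read(paf_lines):
    """Return {read_id: target_name}, keeping the hit with the most matching bases."""
    best = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or truncated lines
        read_id, target, matches = fields[0], fields[5], int(fields[9])
        if read_id not in best or matches > best[read_id][1]:
            best[read_id] = (target, matches)
    return {read_id: target for read_id, (target, _) in best.items()}


def hap_of_contig(contig_name):
    """Hypothetical naming scheme: 'hap1_ctg7' -> 'hap1'."""
    return contig_name.split("_", 1)[0]


def precision(assigned, truth):
    """Fraction of assigned reads whose contig haplotype matches the known truth."""
    total = correct = 0
    for read_id, target in assigned.items():
        if read_id not in truth:
            continue  # reads without a known origin are not counted
        total += 1
        if hap_of_contig(target) == truth[read_id]:
            correct += 1
    return correct / total if total else float("nan")
```

With a real file, the lines would come from something like `open("aln.paf")`, and `truth` would be built from however the simulator records the haplotype of origin.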

So, can hifiasm do this? Or do you know of another method?

Best wishes.

chhylp123 commented 5 months ago

My understanding is that if your sample has long homozygous regions, minimap2 may not find the best alignment within these regions.
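One way to check how much of the error comes from such regions is to separate reads by mapping quality before computing precision. The sketch below is an assumption-laden illustration, not something hifiasm or minimap2 provides: it relies on column 12 of PAF being the mapping quality, on minimap2 typically giving a low MAPQ (often 0) to reads that align about equally well to both haplotypes, and on a cutoff (`min_mapq`) that is arbitrary and would need tuning. The function name is hypothetical.

```python
def split_by_mapq(paf_lines, min_mapq=1):
    """Split reads into confidently placed ones and ambiguous ones by MAPQ.

    Returns (confident, ambiguous): confident maps read_id -> target_name for
    reads whose best MAPQ is >= min_mapq; ambiguous is the set of other reads.
    """
    best = {}  # read_id -> (mapq, target_name) of the highest-MAPQ alignment
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or truncated lines
        read_id, target, mapq = fields[0], fields[5], int(fields[11])
        if read_id not in best or mapq > best[read_id][0]:
            best[read_id] = (mapq, target)

    confident, ambiguous = {}, set()
    for read_id, (mapq, target) in best.items():
        if mapq >= min_mapq:
            confident[read_id] = target
        else:
            ambiguous.add(read_id)
    return confident, ambiguous
```

If precision on the confident set is high while most misassigned reads fall in the ambiguous set, that would be consistent with homozygous regions being the main problem.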

DayTimeMouse commented 4 months ago

Hi chhylp123,

I noticed that both hap1.gfa and hap2.gfa in the hifiasm output contain HG:A:m and HG:A:p tags. Why does each file contain both kinds of tags? Another problem is that the haplotype tags in the GFA cover only some of the read IDs, so how do I get the read IDs for the complete haplotype partition?

Looking forward to your reply.

chhylp123 commented 4 months ago

@DayTimeMouse Please see here: https://hifiasm.readthedocs.io/en/latest/interpreting-output.html#interpreting-output. Currently, hifiasm cannot output this information for all reads.
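For the reads that do appear in the GFA, the tags can be tallied with a short script. The sketch below assumes the A-line layout described in the linked documentation (tab-separated, read name in column 5, with an HG:A: tag among the trailing fields); the function name is hypothetical, and the tag values are collected as-is, so their meaning should be taken from the documentation. It only covers reads listed in the file, not the full read set.

```python
from collections import defaultdict


def reads_by_hg_tag(gfa_lines):
    """Group read names from GFA 'A' lines by the value of their HG:A: tag."""
    groups = defaultdict(set)
    for line in gfa_lines:
        if not line.startswith("A\t"):
            continue  # only A lines describe reads placed on a unitig/contig
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip malformed lines
        read_name = fields[4]
        # Look for the haplotype tag among the optional trailing fields.
        tag = next((f[len("HG:A:"):] for f in fields[5:] if f.startswith("HG:A:")), None)
        if tag is not None:
            groups[tag].add(read_name)
    return dict(groups)
```

Running this on both hap1.gfa and hap2.gfa and comparing the resulting sets shows which reads carry which tag in each file.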

chhylp123/hifiasm, issue #664