chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License
Which GFA file contains sex chromosome-derived sequences? #687

Open Jung19911124 opened 3 months ago

Jung19911124 commented 3 months ago

Hello.

I have asked a similar question before, but my situation is now somewhat different.

I have assembled the genome of a wild animal species (an XY species) with Hifiasm. The coverage is about 34x. I cannot tell which contigs belong to the X chromosome and which to the Y chromosome, because no genome of a closely related species has been sequenced.
Coverage should be roughly halved for the sex chromosomes, so I am not sure what the output will look like. I have three questions about the Hifiasm output files.
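The half-coverage expectation can be checked directly on a GFA file. The sketch below is only a rough heuristic: it assumes each `S` line carries an `rd:i:` read-depth tag and an `LN:i:` length tag, and it flags contigs whose depth is close to half of the median contig depth. The function names and the 0.35 to 0.65 window are hypothetical choices for illustration, not part of Hifiasm. Whether the signal is visible may depend on the file, since haplotype-resolved contigs may already sit at about half of the total coverage.

```python
import statistics


def parse_segments(gfa_lines):
    """Yield (name, length, depth) per S line; depth is None without an rd tag."""
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        name, seq = fields[1], fields[2]
        # Optional tags have the form TAG:TYPE:VALUE, e.g. LN:i:1000 or rd:i:17
        tags = {}
        for field in fields[3:]:
            parts = field.split(":", 2)
            if len(parts) == 3:
                tags[parts[0]] = parts[2]
        if "LN" in tags:
            length = int(tags["LN"])
        else:
            length = 0 if seq == "*" else len(seq)
        depth = float(tags["rd"]) if "rd" in tags else None
        yield name, length, depth


def flag_half_depth(segments, low=0.35, high=0.65):
    """Return names of contigs whose depth is roughly half the median depth.

    The low/high window is an arbitrary illustrative threshold.
    """
    with_depth = [s for s in segments if s[2] is not None]
    if not with_depth:
        return []
    median = statistics.median(depth for _, _, depth in with_depth)
    if median <= 0:
        return []
    return [name for name, _, depth in with_depth if low <= depth / median <= high]
```

With a real file, the lines would come from something like `open(path)`, where `path` points to one of the GFA outputs. Flagged contigs are only candidates: low-depth contigs can also come from repeats or assembly artifacts.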

I assembled with the default settings, and my questions concern the resulting GFA files.

First, is it possible that hifiasm*.bp.hap1.p_ctg.gfa and hifiasm*.bp.hap2.p_ctg.gfa harbor X-linked and Y-linked sequences, respectively? In my case, these two files have different sizes, so I think it is possible.

My second question is whether it is OK to assume that hifiasm*.bp.p_ctg.gfa is the logical sum of hifiasm*.bp.hap1.p_ctg.gfa and hifiasm*.bp.hap2.p_ctg.gfa.
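A quick way to look at this is to compare the total sequence length in each file. The sketch below assumes `S` lines with either an `LN:i:` tag or an inline sequence; `total_length` and `size_report` are hypothetical helper names. Matching totals would not prove that the primary file is the sum of the two haplotype files, but a large mismatch would argue against it.

```python
def total_length(gfa_lines):
    """Sum segment lengths over all S lines of a GFA file."""
    total = 0
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        # Prefer the LN:i: tag; fall back to the inline sequence length
        ln = next((f[5:] for f in fields[3:] if f.startswith("LN:i:")), None)
        if ln is not None:
            total += int(ln)
        elif fields[2] != "*":
            total += len(fields[2])
    return total


def size_report(primary_lines, hap1_lines, hap2_lines):
    """Return total lengths plus the hap1/hap2 sum and difference."""
    primary = total_length(primary_lines)
    hap1 = total_length(hap1_lines)
    hap2 = total_length(hap2_lines)
    return {
        "primary": primary,
        "hap1": hap1,
        "hap2": hap2,
        "hap1+hap2": hap1 + hap2,
        "hap1-hap2": hap1 - hap2,
    }
```

Each argument is an iterable of lines, for example an open file handle for the corresponding GFA output.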

Finally, is it correct to assume that hifiasm*.bp.p_utg.gfa contains sequences (potentially sex chromosome sequences) that are not included in hifiasm*.bp.p_ctg.gfa, hifiasm*.bp.hap1.p_ctg.gfa, or hifiasm*.bp.hap2.p_ctg.gfa?

Best, Jung