ShunOuchi / GreenHill

De novo chromosome-level scaffolding and phasing tool using Hi-C
GNU General Public License v3.0

About naming of haplotypes. #36

Open Isoris opened 1 month ago

Isoris commented 1 month ago

Dear @ShunOuchi,

I hope this message finds you well. I have a question regarding the GreenHill naming conventions.

GreenHill labels haplotypes "hap0" and "hap1", whereas tools like hifiasm label them "haplotype 1" and "haplotype 2". Would it make sense for GreenHill to follow hifiasm's one-based numbering and use "hap1" and "hap2" in its output? I find the difference a bit confusing when using both tools: hifiasm generates .hic.hap1 and .hic.hap2, but GreenHill ends up with _hap0 and _hap1.

Thank you very much for your time and assistance.

Best regards, Quentin.

ShunOuchi commented 1 month ago

Hello,

Haplotype names vary between assembly and phasing tools. For example, unlike hifiasm, FALCON-Phase names its output haplotypes 0 and 1 (phased.0.fasta and phased.1.fasta).

However, since hifiasm is used more often, it might be easier to understand if we use the same numbering system as hifiasm (i.e. hap1, hap2).
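In the meantime, the labels can be shifted on the user side. The following is only a minimal sketch, not part of GreenHill: the helper `to_one_based` and the example file names (`out_hap0.fa`, `out_hap1.fa`) are hypothetical, and it assumes the label appears in a name as `hap` followed by digits.

```python
import re

# Hypothetical helper, not part of GreenHill or hifiasm.
# Matches "hap<digits>" when it is not preceded by a letter or digit,
# so "_hap0" and ".hap0" match but a word like "chap0" does not.
_HAP_LABEL = re.compile(r"(?<![A-Za-z0-9])hap(\d+)(?![0-9])")


def to_one_based(name: str) -> str:
    """Shift a zero-based haplotype label (hap0, hap1) to one-based (hap1, hap2)."""
    # A single substitution pass increments every label exactly once,
    # so hap0 -> hap1 and hap1 -> hap2 cannot collide within one string.
    return _HAP_LABEL.sub(lambda m: f"hap{int(m.group(1)) + 1}", name)


# Example names are assumptions chosen for illustration only.
print(to_one_based("out_hap0.fa"))  # out_hap1.fa
print(to_one_based("out_hap1.fa"))  # out_hap2.fa
```

If this is used to rename files on disk, rename the highest-numbered haplotype first (hap1 to hap2, then hap0 to hap1) so that no file is overwritten.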

Thanks for your advice.