lemene / PECAT

PECAT, a phased error correct and assembly tool
BSD 2-Clause "Simplified" License

Question: how about generating unitig format? #12

Closed Yutang-ETH closed 1 year ago

Yutang-ETH commented 1 year ago

Hi @lemene,

I hope you are well.

I have just tried PECAT on our ONT data from a plant genome. It worked like a charm, and I got the best primary assembly ever! Thank you very much for your brilliant work.

However, I am wondering whether it is possible to output unitigs as hifiasm does. I like neither the primary + alternate format nor the dual format. What I want is the unitigs in a single GFA/FASTA file, which could later be binned into different haplotypes by other phasing methods, such as ALLHiC. Does this make sense to you?
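
To illustrate the kind of file I mean, here is a minimal sketch that pulls the segment (`S`) records out of a GFA 1 file and writes them as FASTA, which is the form most binning tools expect. It is only an assumption about what a unitig output could look like: PECAT does not currently produce such a file, and the function name and the unitig names in the example are hypothetical.

```python
import io


def gfa_segments_to_fasta(gfa_lines):
    """Yield one FASTA record per GFA 1 segment (S) line that carries a sequence."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # A GFA 1 segment line is: S <name> <sequence> [optional tags...]
        if len(fields) < 3 or fields[0] != "S":
            continue
        name, seq = fields[1], fields[2]
        # "*" means the sequence is stored elsewhere, so there is nothing to write.
        if seq == "*":
            continue
        yield f">{name}\n{seq}\n"


# Hypothetical toy graph: two unitigs joined by one link.
example_gfa = io.StringIO(
    "H\tVN:Z:1.0\n"
    "S\tutg000001l\tACGTACGT\tLN:i:8\n"
    "S\tutg000002l\tGGCC\tLN:i:4\n"
    "L\tutg000001l\t+\tutg000002l\t-\t0M\n"
)

fasta = "".join(gfa_segments_to_fasta(example_gfa))
print(fasta, end="")
```

Header and link lines are skipped, so only the unitig sequences end up in the FASTA, and the graph itself stays available in the GFA for anyone who needs the connectivity.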

I understand that the motivation of PECAT is to generate a phased assembly from noisy long reads alone in order to reduce cost. In practice, however, Hi-C data are produced anyway for scaffolding. Besides, without long-range phase information such as Hi-C, the phase between contigs cannot be determined. This is reflected in Figure 3 of your preprint, where there are blue and red blobs in each haplotype. What do you think?

I look forward to your reply. Have a nice weekend.

Best wishes, Yutang

lemene commented 1 year ago

Hi @Yutang-ETH, I'm very sorry for replying so late. Currently, PECAT can only output two formats. I understand your request, and I will add this feature as soon as possible.

Yutang-ETH commented 1 year ago

Hi @lemene,

Thank you very much for your kind reply. It is not late at all.

I think a unitig output format would be a great feature to add, and I look forward to it.

Best wishes, Yutang