tolkit / gfatk

A plant organellar Graphical Fragment Assembly toolkit
MIT License

gfatk linear usage questions #13

Closed (Orz-CQ closed 2 years ago)

Orz-CQ commented 2 years ago

Hi, thanks for your user-friendly software! I am looking for a way to extract the longest path in a GFA file. But when I run gfatk linear xxx.gfa, I get the error: Error: Edge coverage not found.

Could I achieve my goal without the coverage?

Best wishes! Lan

Euphrasiologist commented 2 years ago

Hi Lan,

For now, you could try to hack the edge coverage by setting it to zero for every link. This would be done by adding:

ec:i:0

To each link in the GFA (with a tab to separate from previous tags/link data on the line).
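As a rough sketch of that workaround (not part of gfatk itself), the tag could be appended with a few lines of Python. The function name add_zero_edge_coverage is hypothetical, and the sketch assumes a GFA 1 file in which link lines start with L and have six mandatory tab-separated fields before any optional tags. Links that already carry an ec tag are left untouched.

```python
def add_zero_edge_coverage(gfa_text):
    """Append ec:i:0 to every link (L) line that has no ec tag yet.

    Hypothetical helper; assumes GFA 1 link lines of the form
    L <from> <orient> <to> <orient> <overlap> [optional tags...]
    """
    out = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        is_link = fields[0] == "L"
        # Optional tags start after the six mandatory link fields.
        has_ec = any(f.startswith("ec:") for f in fields[6:])
        if is_link and not has_ec:
            line = line + "\tec:i:0"
        out.append(line)
    return "\n".join(out) + ("\n" if out else "")


# Small in-memory example; a real file would be read and written with open().
example = "H\tVN:Z:1.0\nS\t1\tACGT\nS\t2\tTTGA\nL\t1\t+\t2\t+\t0M\n"
patched = add_zero_edge_coverage(example)
print(patched, end="")
```

Only the link line changes; header and segment lines pass through as they are.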

I can see if I can add this in as an option by default though. Out of interest, which software generated your GFA?

Best, M

Orz-CQ commented 2 years ago

Hi,

Thanks for your suggestion! After adding ec:i:0 to the GFA file, I could finally get results. But gfatk reports [-] The input GFA has multiple subgraphs (2). [-] Detected 16051 nodes in a subgraph. Skipping. , and the result returns only CCTGAA as the longest path.

This GFA file is a mitochondrial pangenome generated by seqwish. Its graph looks like a circle. Could you give me some advice on how to find the longest path in this GFA?

Best wishes, Lan

Euphrasiologist commented 2 years ago

Hi Lan,

This is probably because, unfortunately, it would be impossible to calculate the longest path through a graph with 16,000+ nodes using the algorithm currently implemented. The underlying algorithm does an all-vs-all node path calculation, so you would need far fewer nodes (< 100) for the calculation to work. If you could somehow reduce the number of nodes, you may be able to calculate a longest path.
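To check in advance whether a graph falls under that rough limit, one option is to count the nodes in each subgraph before running the longest-path step. The sketch below is an illustration only and is not how gfatk detects subgraphs: the function name subgraph_sizes is hypothetical, it assumes GFA 1 segment (S) and link (L) lines, and it ignores orientation, treating links as undirected edges in a union-find structure.

```python
from collections import defaultdict


def subgraph_sizes(gfa_text):
    """Return the node count of each connected subgraph, largest first.

    Hypothetical helper; links are treated as undirected edges.
    """
    parent = {}

    def add(node):
        parent.setdefault(node, node)

    def find(node):
        # Walk to the root, halving the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            add(fields[1])
        elif fields[0] == "L" and len(fields) >= 4:
            a, b = fields[1], fields[3]
            add(a)
            add(b)
            parent[find(a)] = find(b)  # merge the two subgraphs

    sizes = defaultdict(int)
    for node in parent:
        sizes[find(node)] += 1
    return sorted(sizes.values(), reverse=True)


# Four segments: 1-2-3 are linked, 4 is isolated.
example = (
    "S\t1\tACGT\nS\t2\tTTGA\nS\t3\tGGCA\nS\t4\tCCTGAA\n"
    "L\t1\t+\t2\t+\t0M\nL\t2\t+\t3\t-\t0M\n"
)
sizes = subgraph_sizes(example)
print(sizes)  # [3, 1]
```

A subgraph with a count well above 100 would, per the limit mentioned above, be too large for the longest-path calculation.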

The software is mainly intended to work for smallish plant mitochondrial GFAs generated from MBG, but I'm glad you tried it out and I hope some of the other subcommands were useful :)

Sorry to be the bearer of bad news! M

Orz-CQ commented 2 years ago

Thanks for your user-friendly software. Although I cannot use it this time, I will try gfatk in my other project. Again, many thanks for your kind suggestions and help!