Open mmcguffi opened 1 year ago
The simple question first: contigs are never merged during polishing.
It is unlikely that we will ever implement GFA output. I'd have to refresh my memory of the details of GFA, but I believe outputting an updated GFA would require recomputing connections and overlaps between the contigs. This is not a trivial operation when the contigs have changed length (as they do during polishing, and we wish to keep containments). If there is a library out there that implements such transformations (possibly something akin to liftover), then it might be possible to embed it into medaka. Otherwise it would be a task for a standalone tool.
Hmmm, even in the case of simple links, care would need to be taken in implementing this because medaka can arbitrarily extend contig ends (not starts). This could have subtle and weird effects on the interpretation of the GFA.
Any chance you would implement medaka gfa polishing? This remains a hassle for several of our pipelines
Is your feature request related to a problem? Please describe.
`medaka_consensus` accepts `fasta` files instead of `gfa` files (and outputs `fasta` files). Often this connection information can be quite important for various downstream analyses.
Describe the solution you'd like
`gfa` file to `medaka_consensus` which gives a polished `gfa` output file
Describe alternatives you've considered
`gfa` file by using the node/edge information in the pre-polished `gfa` file. However, I have noticed that contigs are sometimes lost during polishing. Are contigs ever merged during polishing or are they only deleted? I think it should be possible to reconstruct a polished `gfa` if they are only deleted, though I'm less sure of the feasibility of this if contigs are merged.
Thanks for the great tool!
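For anyone trying the workaround described above, here is a minimal sketch of the idea in Python. It is not part of medaka: `rebuild_gfa` is a hypothetical helper, and it assumes GFA1 input, that the polished `fasta` keeps the original contig names, and that contigs are only ever dropped. Because contig lengths change during polishing, the old overlap CIGARs may no longer be valid, so the sketch replaces them with `*` (unspecified) and drops containment and path records rather than guessing at new coordinates.

```python
def rebuild_gfa(gfa_lines, polished):
    """Swap polished sequences into the segment lines of a GFA1 graph.

    gfa_lines: iterable of text lines from the pre-polished GFA1 file.
    polished:  dict mapping contig name -> polished sequence
               (assumed to use the same names as the GFA segments).
    Returns a list of output lines without trailing newlines.
    """
    out = []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        kind = fields[0]
        if kind == "S":
            name = fields[1]
            if name not in polished:
                continue  # contig was lost during polishing
            seq = polished[name]
            # Discard any stale length tag and write the new length.
            tags = [t for t in fields[3:] if not t.startswith("LN:i:")]
            fields = ["S", name, seq, f"LN:i:{len(seq)}"] + tags
        elif kind == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap> [tags]
            if fields[1] not in polished or fields[3] not in polished:
                continue  # link touches a contig that no longer exists
            fields[5] = "*"  # old CIGAR may not match the new lengths
        elif kind != "H":
            # Containments, paths, comments and anything else are dropped:
            # their coordinates would need recomputing after polishing.
            continue
        out.append("\t".join(fields))
    return out
```

The result keeps the graph topology for the surviving contigs but carries no overlap lengths, so it is only suitable for downstream tools that accept `*` overlaps. Since contig ends can be extended during polishing, the links should be read as approximate adjacency rather than exact joins.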