marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

reads overlap file #2303

Closed tianjio closed 7 months ago

tianjio commented 7 months ago

After assembly, I want to view the overlap relationships between all reads, not only the reads used for assembly. Which file should I look at? I found a best.edges.gfa file in unitigging/4-unitigger. Does it include all reads or only the reads used for assembly?

skoren commented 7 months ago

The best edges file only includes the edges selected for assembly, which are a subset of all the overlaps computed. Take a look at the ovStoreDump binary; there is some info on it here: https://canu.readthedocs.io/en/stable/commands/ovStoreDump.html. You can also run it without options to get usage. It prints the overlaps in the store, with optional filtering (error, IDs, length, etc.), in various formats (including gaf), and can also annotate them with which were used for assembly and which were not.
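
To check which reads actually show up in the best edges file, something like the following could be used. It is only a minimal sketch, assuming best.edges.gfa follows the usual GFA1 layout of tab-separated S (segment) and L (link) records. The function name `reads_in_gfa` and the sample records are hypothetical and not part of canu.

```python
import io

def reads_in_gfa(handle):
    """Collect names from GFA1 records: all S segments, and segments used in L links."""
    segments, linked = set(), set()
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            # S <name> <sequence> [tags...]
            segments.add(fields[1])
        elif fields[0] == "L" and len(fields) >= 4:
            # L <from> <from_orient> <to> <to_orient> <overlap>
            linked.add(fields[1])
            linked.add(fields[3])
    return segments, linked

# Hypothetical sample data standing in for a real best.edges.gfa file.
sample = io.StringIO(
    "H\tVN:Z:1.0\n"
    "S\tread1\t*\tLN:i:1000\n"
    "S\tread2\t*\tLN:i:1200\n"
    "S\tread3\t*\tLN:i:900\n"
    "L\tread1\t+\tread2\t-\t500M\n"
)

segments, linked = reads_in_gfa(sample)
unlinked = sorted(segments - linked)  # segments that take part in no edge
print(len(segments), len(linked), unlinked)
```

For a real run, replace `sample` with an open handle on the best.edges.gfa file from unitigging/4-unitigger. Reads missing from this file can still have overlaps in the overlap store, which is what ovStoreDump reports.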