Open jmtsuji opened 1 year ago
@LeeBergstrand My apologies, I accidentally made my initial commit for this project to the develop branch at 81bb669! I've now made a separate branch, stitch_module, for further development. The stitch.py code is in a separate file from other code, so this accidental commit should have limited impact on the rest of the repo.
@jmtsuji Feel free to close if this has been addressed.
Still ongoing -- thanks for migrating this over to the new repo.
As mentioned in rotary-genomics/rotary-utils#8, the merge module from circlator (a very helpful package!) is currently used to guide the final step in the end repair workflow. circlator merge uses nucmer (from the mummer package) to align a 'guide contig' (spanning the ends of the FastA entry for the circular contig) with the circular reference contig. Then, circlator merge parses the nucmer hit table to find suitable matches between the two contigs. If the match stats pass required thresholds (e.g., minimum % ID and position relative to the ends of both contigs), then the circlator merge module stitches the guide contig into the circular contig at the sites determined by nucmer. This process ultimately allows circlator merge to repair mis-assembled regions around the ends (in the FastA file) of the circular contig.

Unfortunately, circlator merge is no longer actively supported. It is also part of a larger package that includes many dependencies we don't need and uses approaches that aren't up-to-date with some of the rapid changes that have occurred in the field of long read sequencing. Because of this, I'd like to drop circlator merge from the end repair code and replace it with an in-house alternative.

Desired features of the in-house alternative:
circlator merge). For example, it would be nice to know if there were any gaps or overlaps that were repaired, as well as a separate base-by-base summary of changes
circlator merge)
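
To make the stitching step described above more concrete, here is a rough Python sketch of how an in-house version might filter alignment hits and splice the guide contig into the circular contig. This is only an illustration under stated assumptions, not how circlator merge or stitch.py actually works: the `Hit` class, the `pick_hit` and `stitch` functions, and the `min_identity` and `max_end_distance` defaults are all hypothetical names and values. Hits are assumed to be already parsed into 0-based, half-open coordinates on the forward strand, so real nucmer output would need to be converted first (parsing is not shown).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Hit:
    """One alignment between the circular contig and the guide contig.

    Hypothetical container: coordinates are 0-based, half-open, forward strand.
    """
    contig_start: int
    contig_end: int
    guide_start: int
    guide_end: int
    identity: float  # percent identity, 0-100


def pick_hit(hits: List[Hit], near_end: Callable[[Hit], bool],
             min_identity: float) -> Optional[Hit]:
    """Return the longest hit passing the identity and position filters."""
    candidates = [h for h in hits if h.identity >= min_identity and near_end(h)]
    return max(candidates, key=lambda h: h.contig_end - h.contig_start,
               default=None)


def stitch(contig: str, guide: str, hits: List[Hit],
           min_identity: float = 95.0,
           max_end_distance: int = 100) -> Optional[str]:
    """Splice the guide across the contig ends; return None if no safe match."""
    # Hit anchoring the guide to the 3' end of the linearized contig.
    end_hit = pick_hit(
        hits, lambda h: len(contig) - h.contig_end <= max_end_distance,
        min_identity)
    # Hit anchoring the guide to the 5' end of the linearized contig.
    start_hit = pick_hit(
        hits, lambda h: h.contig_start <= max_end_distance, min_identity)
    if end_hit is None or start_hit is None or end_hit == start_hit:
        return None
    # The guide must run from the contig end into the contig start,
    # and the two anchors must not cross on the contig.
    if end_hit.guide_end > start_hit.guide_start:
        return None
    if start_hit.contig_end > end_hit.contig_start:
        return None
    # Keep the contig between the two anchors, then let the guide supply
    # everything from the end anchor, across the junction, to the start anchor.
    # The result is a rotation of the repaired circle.
    kept = contig[start_hit.contig_end:end_hit.contig_start]
    bridge = guide[end_hit.guide_start:start_hit.guide_end]
    return kept + bridge


# Toy example: the linearized contig is missing "GAGA" at the junction.
start, middle, end, missing = "ATGCCGTA", "GGGGCCCCAAAATTTT", "TACGGCAT", "GAGA"
toy_hits = [Hit(24, 32, 0, 8, 100.0), Hit(0, 8, 12, 20, 100.0)]
repaired = stitch(start + middle + end, end + missing + start, toy_hits,
                  max_end_distance=2)
print(repaired)
```

Returning `None` when the thresholds fail (rather than stitching anyway) is one possible design choice; it would let the calling code log why a contig was left unrepaired, which fits the reporting wish mentioned in the list above.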