Adamtaranto / teloclip

A tool for the recovery of unassembled telomeres from soft-clipped read alignments.

Feature: Align and Extend #16

Open Adamtaranto opened 10 months ago

Adamtaranto commented 10 months ago

Existing modules:

New module:

Tasks:

Adamtaranto commented 10 months ago

Proposed modules names:

- teloclip filter
- teloclip extract
- teloclip extend

Adamtaranto commented 10 months ago

Issue #10 requests automation of the contig extension process. I think it is generally unwise to blindly accept that overhang alignments are "real" without inspecting them first.

Need to balance convenience vs enabling errors.

Could provide a tutorial on manual curation: select the best overhang-read (i.e. confident anchor region, unique alignment, agreement among many reads) and then extend the contig with teloclip extend.

Alternatively, could provide an extend-now-ask-questions-later option whereby we extend contigs using the longest available overhang and then suggest validation checks, e.g. align all overhang-reads back to the extended contig and look for agreement between reads.
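
Rough sketch of the extend-now-ask-questions-later idea, to make the discussion concrete. This is not existing teloclip code: `extend_with_longest` and `agreement` are hypothetical names, and it assumes the overhang (soft-clipped) sequences have already been pulled out as plain strings. The agreement check here is a crude exact-match stand-in for realigning the reads to the extended contig, so noisy long reads would need something more tolerant.

```python
def extend_with_longest(contig, overhangs, end="right"):
    """Extend a contig with the longest overhang at one end.

    Hypothetical helper: `overhangs` is a list of soft-clipped
    sequences (strings) that hang off the given contig end.
    Returns (extended_contig, chosen_overhang).
    """
    if end not in ("left", "right"):
        raise ValueError("end must be 'left' or 'right'")
    if not overhangs:
        return contig, ""
    longest = max(overhangs, key=len)
    if end == "right":
        return contig + longest, longest
    return longest + contig, longest


def agreement(chosen, overhangs, end="right"):
    """Fraction of overhangs that exactly match the chosen one.

    Right-end overhangs all start at the contig end, so they should
    be prefixes of the chosen overhang; left-end overhangs all finish
    at the contig start, so they should be suffixes.
    """
    if not overhangs:
        return 0.0
    if end == "right":
        hits = sum(chosen.startswith(o) for o in overhangs)
    else:
        hits = sum(chosen.endswith(o) for o in overhangs)
    return hits / len(overhangs)
```

A low agreement score would be the cue to go back and inspect that contig end manually before trusting the extension.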

Adamtaranto commented 10 months ago

Add output option for extract to yield MAF or MSA of overhang-reads that can be viewed in terminal or externally.

Adamtaranto commented 10 months ago

Note: Log total bases extended and bases per contig end. Useful for reporting results.
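
Possible shape for the logging, as a sketch only. The function names and the `(contig, end, n_bases)` record layout are assumptions for illustration, not part of the current code base.

```python
import logging
from collections import defaultdict


def summarise_extensions(extensions):
    """Tally extended bases per contig end.

    `extensions` is an iterable of (contig, end, n_bases) tuples,
    where `end` is 'left' or 'right' (hypothetical record layout).
    Returns (total_bases, {(contig, end): bases}).
    """
    per_end = defaultdict(int)
    for contig, end, n_bases in extensions:
        per_end[(contig, end)] += n_bases
    total = sum(per_end.values())
    return total, dict(per_end)


def log_summary(extensions, logger=None):
    """Log one line per contig end, then the overall total."""
    logger = logger or logging.getLogger("extend_sketch")
    total, per_end = summarise_extensions(extensions)
    for (contig, end), n_bases in sorted(per_end.items()):
        logger.info("%s\t%s\t%d bp", contig, end, n_bases)
    logger.info("Total bases extended: %d", total)
    return total
```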

Adamtaranto commented 10 months ago

Option: Output BED file coords for extended sequence segments for review.
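
Sketch of how the BED coordinates could be derived, assuming we know the original contig length and how many bases were added at each end. Function names and the feature labels in column 4 are hypothetical. Coordinates are on the extended contig and follow the BED convention (0-based start, end-exclusive).

```python
def extension_bed_records(contig, original_len, left_ext=0, right_ext=0):
    """Return BED-style tuples for the extended segments of one contig.

    Coordinates refer to the contig after extension. A left extension
    occupies [0, left_ext); a right extension starts after the left
    extension plus the original contig sequence.
    """
    records = []
    if left_ext > 0:
        records.append((contig, 0, left_ext, "left_extension"))
    if right_ext > 0:
        start = left_ext + original_len
        records.append((contig, start, start + right_ext, "right_extension"))
    return records


def format_bed(records):
    """Render records as tab-separated BED lines."""
    return "".join(
        f"{contig}\t{start}\t{end}\t{name}\n"
        for contig, start, end, name in records
    )
```

The resulting file could be loaded alongside the realigned reads in a genome browser to review each extended segment.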