broadinstitute / longbow

Annotation and segmentation of MAS-seq data
https://broadinstitute.github.io/longbow/
BSD 3-Clause "New" or "Revised" License

random segment extraction no longer an option? #190

Open morellr opened 2 years ago

morellr commented 2 years ago

It appears that random segment extraction is no longer an option: the model is now required to have a named coding region, and extract defaults to that coding region even if leading_adapter and trailing_adapter are named. Is it possible to extract everything between a 5p_adapter and a 3p_adapter, including the adapters themselves? The resulting file, with the UMI, CBC and polyA still in each read, could then be fed into PacBio's Isoseq3 pipeline at "lima", or at "tag" with the adapters removed.
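To make the request concrete, here is a minimal sketch of the behaviour being asked for. It is not Longbow's code or API: the `Segment` type, the `extract_between_adapters` function, the `include_adapters` switch, and the toy read are all hypothetical, and the segment names simply reuse the `5p_adapter` / `3p_adapter` labels mentioned above. It assumes each read already carries an ordered list of named, non-overlapping segment annotations with 0-based inclusive coordinates.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """Hypothetical annotation: a named span on a read (0-based, inclusive)."""
    name: str
    start: int
    end: int


def extract_between_adapters(
    read: str,
    segments: List[Segment],
    leading: str = "5p_adapter",   # assumed segment name
    trailing: str = "3p_adapter",  # assumed segment name
    include_adapters: bool = True,
) -> List[str]:
    """Return every span running from a leading adapter to the next trailing adapter."""
    extracted = []
    lead = None
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.name == leading:
            # Remember the most recent leading adapter seen.
            lead = seg
        elif seg.name == trailing and lead is not None:
            if include_adapters:
                start, stop = lead.start, seg.end + 1
            else:
                start, stop = lead.end + 1, seg.start
            extracted.append(read[start:stop])
            lead = None  # wait for the next leading adapter
    return extracted


# Toy read: junk + 5p adapter + CBC + UMI + cDNA + polyA + 3p adapter + junk.
read = "NN" + "ACGT" + "CC" + "GG" + "ATAT" + "AAAA" + "TGCA" + "NN"
segments = [
    Segment("5p_adapter", 2, 5),
    Segment("CBC", 6, 7),
    Segment("UMI", 8, 9),
    Segment("cDNA", 10, 13),
    Segment("polyA", 14, 17),
    Segment("3p_adapter", 18, 21),
]

with_adapters = extract_between_adapters(read, segments)
without_adapters = extract_between_adapters(read, segments, include_adapters=False)
```

With `include_adapters=True` the extracted span keeps both adapters along with the CBC, UMI and polyA, which is the form wanted for "lima"; with `include_adapters=False` only the adapters are trimmed, which is the form wanted for "tag".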

jonn-smith commented 1 year ago

You are correct. That's a good point - I hadn't thought of including a mode to make it compatible with PacBio tools. This is a good feature request.

It will probably be implemented as an additional flag in extract.

I can't promise any timeline, but it's now on my radar.