rrwick / Porechop

adapter trimmer for Oxford Nanopore reads
GNU General Public License v3.0

Chopping out vector backbones #21

Closed zachcp closed 7 years ago

zachcp commented 7 years ago

Hi @rrwick,

I was directed to Porechop by @skoren when I asked him about removing vector sequences during long-read assembly. My problem is that I would like to assemble genomes that have been cloned and sequenced as pools of cosmids/fosmids/BACs, and I want to remove the cloning vector sequence from the reads before assembly (I'm using PacBio). I will try Porechop, but I am wondering whether you have used it in this way or whether there are any potential hiccups you would anticipate with this use case.

Thank you, zahc cp

rrwick commented 7 years ago

Hi zahc cp,

This may work, though you would need to alter the adapters.py file to include your custom sequences.
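
For illustration, a custom entry of the kind mentioned above might look roughly like the sketch below. The `Adapter` class here is a hypothetical stand-in written for this example so that the snippet runs on its own, and the sequences are made-up placeholders rather than real vector sequences. Check the adapters.py in your installed Porechop for the actual class, argument names, and list to extend.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Adapter:
    """Hypothetical stand-in for an adapter record; not Porechop's own class."""
    name: str
    # Each sequence is a (label, sequence) pair.
    start_sequence: Optional[Tuple[str, str]] = None  # expected near read starts
    end_sequence: Optional[Tuple[str, str]] = None    # expected near read ends


# Placeholder sequences only: replace with the vector sequence that
# actually flanks your insert on each side.
VECTOR_LEFT = "ACGTACGTACGTACGTACGTACGTACGT"
VECTOR_RIGHT = "TGCATGCATGCATGCATGCATGCATGCA"

# Assumed shape of a custom adapter list with one cloning-vector entry.
CUSTOM_ADAPTERS = [
    Adapter("cloning_vector",
            start_sequence=("vector_left_junction", VECTOR_LEFT),
            end_sequence=("vector_right_junction", VECTOR_RIGHT)),
]
```

Short junction sequences are probably a better fit than a whole vector backbone here, since the search is aimed at adapter-sized sequences at read ends.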

Where do you expect your cloning vector sequences to be in the reads? Porechop was specifically designed to look for adapters at the ends of reads, and at least initially, it only looks for them there. If your cloning vector sequences are distributed throughout the reads, it may not work as well.

However, if there are at least some sequences at the ends of reads (enough for Porechop to identify them as adapters to be removed), then the "Splitting reads containing middle adapters" step of the pipeline may do what you want. Only one way to find out - give it a try and let me know how you go!

Ryan

zachcp commented 7 years ago

Hi Ryan,

After thinking about the problem a bit more, I ended up pre-filtering my reads by finding vector-homologous sequences with blasr and removing the hit regions. So I never really gave the Porechop method a chance. I'll circle back if I get some time to try it out.
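
The pre-filtering step described above could be sketched roughly as follows. This is a minimal illustration, not the script actually used: it assumes the aligner hits have already been parsed into 0-based, half-open `(start, end)` intervals on each read, and it assumes that a read is split at every hit so the flanking pieces are kept as separate sequences. The function names and the `min_len` parameter are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end) on the read


def merge_intervals(hits: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching hit intervals into sorted, disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def excise_hits(read: str, hits: List[Interval], min_len: int = 1) -> List[str]:
    """Cut the hit regions out of a read and return the remaining pieces."""
    pieces: List[str] = []
    pos = 0  # start of the next vector-free stretch
    for start, end in merge_intervals(hits):
        # Clamp to the read so out-of-range coordinates cannot misbehave.
        start = min(max(start, 0), len(read))
        end = min(max(end, 0), len(read))
        if start > pos:
            pieces.append(read[pos:start])
        pos = max(pos, end)
    if pos < len(read):
        pieces.append(read[pos:])
    # Drop fragments too short to be useful for assembly.
    return [piece for piece in pieces if len(piece) >= min_len]


# Toy read: 'V' marks bases covered by vector hits.
fragments = excise_hits("AAAAVVVVCCCCVVGG", [(4, 8), (12, 14)])
```

In this toy case `fragments` holds the three vector-free pieces. For real reads, `min_len` would presumably be set to whatever minimum read length the assembler expects.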

BTW, Thanks for your awesome software, zach cp

rrwick commented 7 years ago

Sounds good. Something on my future-Porechop-features list has been an easier way to use custom adapters. Currently you have to edit the adapters.py file, but maybe a future version will have a friendlier way of doing it.