PacificBiosciences / FALCON

FALCON: experimental PacBio diploid assembler -- Out-of-date -- Please use a binary release: https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries

Regarding Filtering Synthetic construct PacBio #659

Closed harsh-shukla closed 6 years ago

harsh-shukla commented 6 years ago

Hi all,

I am trying to assemble a mammalian genome, and I found that some of my reads map to the synthetic construct PacBio DNA sequence.

I will be using FALCON for the assembly. My question is: do I have to remove these sequences before giving the reads to FALCON, or does the FALCON pipeline take care of them further downstream? Or does the pipeline use these sequences for internal quality purposes, so that I should not remove them at all?

Any help will be highly appreciated.

Thanking you,

Yours sincerely, Harsh

mseetin commented 6 years ago

Hi Harsh,

If your length_cutoff (whether set manually or computed automatically) is above 2 kb, the length of the synthetic construct, then the construct won't appear in your final assembly. If you followed our usual recommendations, with 50-fold or greater sequencing coverage from a long-insert, size-selected library and a target seed coverage of 30-fold, this should almost always be the case. If you're starved for coverage and/or read length, then it may show up as a primary contig, but you can simply delete it from your FASTA at that point, prior to final polishing.
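The two checks described here can be sketched in a few lines of Python. This is only an illustration, not FALCON's own code: the function names, the contig ID, and the way the cutoff is derived (take the longest reads until they add up to genome size times seed coverage) are assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Set, Tuple

# Approximate length of the synthetic construct in bp (the "2 kb" quoted above).
CONSTRUCT_LENGTH = 2000


def seed_length_cutoff(read_lengths: Iterable[int],
                       genome_size: int,
                       seed_coverage: float) -> int:
    """Return the shortest read length still needed to reach the seed coverage.

    Assumption: reads are taken longest-first until their summed length
    reaches genome_size * seed_coverage.
    """
    target = genome_size * seed_coverage
    total = 0
    for length in sorted(read_lengths, reverse=True):
        total += length
        if total >= target:
            return length
    raise ValueError("not enough bases to reach the requested seed coverage")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_contigs(lines: Iterable[str],
                 unwanted_ids: Set[str]) -> List[Tuple[str, str]]:
    """Keep every record whose ID (first word of the header) is not unwanted."""
    kept = []
    for header, sequence in parse_fasta(lines):
        contig_id = (header.split() or [""])[0]
        if contig_id not in unwanted_ids:
            kept.append((header, sequence))
    return kept
```

If `seed_length_cutoff(...)` returns a value above `CONSTRUCT_LENGTH`, reads of the construct's length fall below the cutoff. Otherwise, once the construct has been identified among the primary contigs (for example by aligning them to the construct sequence), passing its ID to `drop_contigs` removes it before polishing.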

Matthew Seetin, Pacific Biosciences


harsh-shukla commented 6 years ago

Hi Matthew,

Thank you so much for the quick reply. I have a much better idea as to how to proceed further.

Regards, Harsh