biocore / deblur

Deblur is a greedy deconvolution algorithm based on known read error profiles.
BSD 3-Clause "New" or "Revised" License

Is it possible to turn off chimera checking? #181

Open ghost opened 6 years ago

ghost commented 6 years ago

Going in a different direction from #165, is it possible to turn off chimera checking in deblur? I'd like to use the denoised data with alternate chimera checking methods.

wasade commented 6 years ago

It's not possible right now. @amnona, would this be easy to add?

amnona commented 6 years ago

Hi, I think you can perform your own chimera checking using the following:

  1. Run the full deblur pipeline with the --keep-tmp-files flag.
  2. The deblur working directory will then contain several files for each sample ID. One of these files will be XXX..msa.deblur and another XXX..msa.deblur.no_chimeras. These correspond to the output of the deblur step and to the output of the chimera removal that follows it. You can take the XXX..msa.deblur files (which are FASTA files), run your own chimera removal algorithm on them, and write the results to a directory with filenames XXX.XXX..msa.deblur.no_chimeras.
  3. On this directory, you can run the final step of the deblur workflow (joining the FASTA files into a single BIOM table) using: deblur build_biom_table (use deblur build_biom_table --help to see the relevant parameters).
  4. If you also want to remove the non-16S sequences from the resulting table, you can run that step in deblur as well, using: deblur remove_artifacts
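Step 2 above could be scripted roughly as follows. This is only a sketch, not part of deblur: the function names (`read_fasta`, `write_fasta`, `stage_custom_chimera_filter`) and the `chimera_filter` callback are hypothetical, and the `.msa.deblur` and `.no_chimeras` suffixes are taken from the description above, so check them against the files actually present in your working directory.

```python
from pathlib import Path
from typing import Callable, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def read_fasta(path: Path) -> Iterator[Record]:
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with path.open() as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def write_fasta(path: Path, records: Iterable[Record]) -> None:
    """Write (header, sequence) pairs as single-line FASTA records."""
    with path.open("w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n{seq}\n")


def stage_custom_chimera_filter(
    work_dir: str,
    out_dir: str,
    chimera_filter: Callable[[List[Record]], List[Record]],
    in_suffix: str = ".msa.deblur",    # assumed suffix of post-deblur files
    out_suffix: str = ".no_chimeras",  # assumed suffix expected downstream
) -> List[Path]:
    """Apply a user-supplied chimera filter to every post-deblur file.

    chimera_filter is a placeholder for whatever method you prefer; it
    receives all records of one sample and returns the ones to keep.
    """
    src_dir, dst_dir = Path(work_dir), Path(out_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # The glob only matches names ending in in_suffix, so deblur's own
    # *.no_chimeras files are left untouched.
    for src in sorted(src_dir.glob(f"*{in_suffix}")):
        kept = chimera_filter(list(read_fasta(src)))
        dst = dst_dir / (src.name + out_suffix)
        write_fasta(dst, kept)
        written.append(dst)
    return written
```

The output directory can then be passed to deblur build_biom_table as described in step 3. Writing to a separate directory keeps deblur's original no_chimeras files available for comparison.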

BTW: Just out of curiosity, why do you want to use a different chimera removal algorithm? Do you have specific examples where deblur chimera removal does not work well?

Good luck, and let me know if you have any questions or problems.

Amnon

ghost commented 6 years ago

Thanks, I'll use the --keep-tmp-files flag. Why use a different chimera checking algorithm? I prefer tools that don't package a bunch of things together as a black box. I want control over the individual steps. Considering you have "trained" deblur on 1x150nt sequences and I have 2x250 fully overlapping reads (i.e., the V4 region), I can imagine the desired settings for each step will be different.

amnona commented 6 years ago

Cool. Good luck. And please let us know if you encounter cases where the performance of the deblur chimera removal step could be improved :)

Thanks,
Amnon
