google / deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
BSD 3-Clause "New" or "Revised" License

Advice for dealing with barcode multiplexing #32

Closed: peterdfields closed this issue 2 years ago

peterdfields commented 2 years ago

Hi,

I'd like to use DeepConsensus on a set of HiFi datasets that were generated on a Sequel II, but with barcoded multiplexing. My understanding is that, without running the CCS step first, detecting barcodes can be difficult. Do you have a recommended pipeline for going from a barcoded subread BAM file to the subset of inputs needed for CCS/actc/DeepConsensus? Thank you for your time and advice.

kishwarshafin commented 2 years ago

Hi @peterdfields,

The de-multiplexing algorithm is designed to work with CCS reads, and we currently don't have any benchmarks that would tell you whether using DeepConsensus first introduces any contamination. Our recommendation is therefore to first de-multiplex the CCS reads with Lima and then apply DeepConsensus to each set of de-multiplexed reads independently.
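As a rough illustration of that ordering (not a tested or official pipeline), the sketch below only builds the command lines and does not run anything. The `ccs`, `actc`, and `deepconsensus run` invocations follow the shape used in the DeepConsensus quick start. Everything else is an assumption: the helper name `plan_commands` is hypothetical, the file names are placeholders, Lima's preset and output-splitting options are deliberately left out because they depend on the Lima version and barcode design, and it is assumed that `actc` can be pointed at the full subreads BAM together with a single barcode's CCS BAM.

```python
import shlex
from pathlib import Path
from typing import List


def plan_commands(subreads: str, barcodes: str,
                  per_barcode_ccs: List[str],
                  checkpoint: str) -> List[List[str]]:
    """Return command lines, in order, for a demultiplex-first workflow.

    Hypothetical helper: it only plans the steps and runs nothing.
    """
    commands = [
        # 1. Draft CCS reads from the multiplexed subreads.
        ["ccs", "--min-rq=0.88", subreads, "ccs.bam"],
        # 2. De-multiplex the CCS reads. Lima options are omitted on
        #    purpose; choose them from the Lima documentation.
        ["lima", "ccs.bam", barcodes, "demux.bam"],
    ]
    # 3. Per barcode: align subreads to that barcode's CCS reads, then
    #    run DeepConsensus on that barcode alone.
    #    Assumption: the caller knows the per-barcode BAM names that
    #    Lima produced.
    for ccs_bam in per_barcode_ccs:
        stem = Path(ccs_bam).stem
        aligned = f"{stem}.subreads_to_ccs.bam"
        commands.append(["actc", subreads, ccs_bam, aligned])
        commands.append([
            "deepconsensus", "run",
            f"--subreads_to_ccs={aligned}",
            f"--ccs_bam={ccs_bam}",
            f"--checkpoint={checkpoint}",
            f"--output={stem}.deepconsensus.fastq",
        ])
    return commands


if __name__ == "__main__":
    # Placeholder inputs; replace with real paths.
    plan = plan_commands(
        "subreads.bam",
        "barcodes.fasta",
        ["demux.bc1001--bc1001.bam", "demux.bc1002--bc1002.bam"],
        "model/checkpoint",
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

Printing the plan first makes it easy to review each step by hand before running any of it on real data.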

armintoepfer commented 2 years ago

Linking the duplicate: https://github.com/PacificBiosciences/pbbioconda/issues/523

peterdfields commented 2 years ago

Thank you @armintoepfer