google / deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.

Missing majority of ZMWs after running lima to search for adapters #49

Closed: rosspdu closed this issue 1 year ago

rosspdu commented 1 year ago

Hi, I'm not sure if you can help me with this, but I want to raise an issue I encountered after filtering adapters with lima. I'm not sure why lima filtered out the majority of the 'deepconsensus hifi reads'. Below are the commands I used for DeepConsensus and lima:

DC

cmd4="module purge && module load deepconsensus/0.3.1 && deepconsensus run --checkpoint=/cluster/home/dc_model_0.3/checkpoint --ccs_bam=${outDir}/${outFilePrefix}.\${SLURM_ARRAY_TASK_ID}.bam --subreads_to_ccs=${outDir}/${outFilePrefix%.ccs}.subreads_to_ccs.\${SLURM_ARRAY_TASK_ID}.bam --output=${outDir}/${outFilePrefix%.ccs}.deepconsensus.\${SLURM_ARRAY_TASK_ID}.fastq --cpus ${THREAD}"

Lima

lima --num-threads 84 --split-bam-named --same --ccs ${ID}.deepconsensus.fastq /cluster/home/lima_pbmarkdup/pb_pcr_adapter.fa ${ID}.deepconsensus.lima.fastq

Here is the output summary from lima:

ZMWs input                (A) : 1925775
ZMWs above all thresholds (B) : 13042 (0.68%)
ZMWs below any threshold  (C) : 1912733 (99.32%)

ZMW marginals for (C):
Below min length              : 199 (0.01%)
Below min score               : 648496 (33.90%)
Below min end score           : 648496 (33.90%)
Below min passes              : 0 (0.00%)
Below min score lead          : 648496 (33.90%)
Below min ref span            : 1912733 (100.00%)
Without SMRTbell adapter      : 0 (0.00%)

ZMWs for (B):
With same pair                : 13042 (100.00%)
Coefficient of correlation    : 0.00%

ZMWs for (A):
Allow diff pair               : 1925775 (100.00%)
Allow same pair               : 1925775 (100.00%)

Reads for (B):
Above length                  : 13042 (100.00%)
Below length                  : 0 (0.00%)
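As a rough aid for reading the summary above, the following Python sketch parses a few of its lines and recomputes the percentages. The helper name `parse_lima_summary` and the abridged `LIMA_SUMMARY` string are illustrative assumptions and not part of lima. The sketch only redoes the arithmetic from the reported counts; it does not explain why the reads were filtered.

```python
import re

# A few lines copied from the lima summary quoted in this issue (abridged).
LIMA_SUMMARY = """\
ZMWs input                (A) : 1925775
ZMWs above all thresholds (B) : 13042 (0.68%)
ZMWs below any threshold  (C) : 1912733 (99.32%)
ZMW marginals for (C):
Below min score               : 648496 (33.90%)
Below min ref span            : 1912733 (100.00%)
"""


def parse_lima_summary(text):
    """Hypothetical helper: map each 'label : count' line to an int count."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"^(.*?)\s*:\s*(\d+)", line)
        if match:
            # Collapse the column-alignment spaces inside the label.
            label = " ".join(match.group(1).split())
            counts[label] = int(match.group(2))
    return counts


counts = parse_lima_summary(LIMA_SUMMARY)
total = counts["ZMWs input (A)"]
passed = counts["ZMWs above all thresholds (B)"]
failed = counts["ZMWs below any threshold (C)"]

# (B) and (C) are reported as fractions of the input (A).
print(f"passed: {100 * passed / total:.2f}% of input")
# The marginals are reported as fractions of the failing ZMWs (C).
for label in ("Below min score", "Below min ref span"):
    print(f"{label}: {100 * counts[label] / failed:.2f}% of failing ZMWs")
```

Recomputed this way, the marginals are fractions of the failing ZMWs (C), and the "Below min ref span" count equals (C) itself, so every rejected ZMW failed at least that threshold.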

Thank you, I appreciate your help!

AndrewCarroll commented 1 year ago

Hi @rosspdu

Did you run DeepConsensus prior to lima or did you run lima first? If you ran DeepConsensus first, would you be able to reverse the order (run lima and then DeepConsensus) and let us know if this changes the amount of reads you see?

Thank you, Andrew
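One way to compare the number of reads between the two orderings is to count the records in each resulting FASTQ file. The sketch below is a minimal example, assuming plain four-line FASTQ records (no wrapped sequence lines) and optional gzip compression; the function name `count_fastq_records` and the example file names are hypothetical.

```python
import gzip


def count_fastq_records(path):
    """Count reads in a FASTQ file, assuming exactly four lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Example usage with hypothetical file names:
# before = count_fastq_records("sample.deepconsensus.fastq")
# after = count_fastq_records("sample.deepconsensus.lima.fastq")
# print(f"{after} of {before} reads kept ({100 * after / before:.2f}%)")
```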

rosspdu commented 1 year ago

Hello Andrew,

Thank you for your quick response. Yes, I ran DeepConsensus before lima. And yes, I can try reversing the order and will update you.

Cheers, Ross

pichuan commented 1 year ago

Hi @rosspdu , Do you have any updates for us? If not, I'll plan to close this issue in the next few days. Thank you!

rosspdu commented 1 year ago

Dear Pi-Chuan,

Apologies for the late reply and for not providing updates on this. Yes, please close this issue. We figured out that running lima was what caused the missing ZMWs. Thanks again for your help.

Cheers, Ross
