Closed carla-hazelf closed 1 month ago
Operating system
PRETTY_NAME: "Rocky Linux 8.8 (Green Obsidian)" VERSION: "8.8 (Green Obsidian)"
Package name
lima
lima 1.11.0 (commit v1.11.0-1-gec618c9)
Conda environment
What is the result of conda list? (Try to paste that between triple backticks.)
conda list
Describe the bug
lima is removing too many ZMWs. I am using the standard primers:
NEB_5p GCAATGAAGTCGCAGGGTTGGG
Clontech_5p AAGCAGTGGTATCAACGCAGAGTACATGGGG
NEB_Clontech_3p GTACTCTGCGTTGATACCACTGCTT
With these primers, I get:
ZMWs above all thresholds (B) : 452638 (23%)
I'm unsure whether this is due to:
-> primer choice, or
-> an issue with the raw sequence data itself.
I looked at the overrepresented sequences in FastQC.
Does this mean that the primers have not worked?
Error message
`ZMWs input (A) : 1939872
ZMWs above all thresholds (B) : 452638 (23%)
ZMWs below any threshold (C) : 1487234 (77%)

ZMW marginals for (C):
Below min length : 24 (0%)
Below min score : 0 (0%)
Below min end score : 390827 (26%)
Below min passes : 333 (0%)
Below min score lead : 0 (0%)
Below min ref span : 750683 (50%)
Without SMRTbell adapter : 333 (0%)
Undesired hybrids : 333 (0%)
Undesired 5p--5p pairs : 713200 (48%)
Undesired 3p--3p pairs : 667027 (45%)
Undesired no hit : 333 (0%)

ZMWs for (B):
With different pair : 452638 (100%)
Coefficient of correlation : 0%

ZMWs for (A):
Allow diff pair : 1939539 (100%)
Allow same pair : 1939539 (100%)

Reads for (B):
Above length : 452638 (100%)
Below length : 0 (0%)`
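A quick arithmetic check may help when reading this report. The sketch below is not part of lima; it only re-derives the percentages from the counts pasted above. It suggests that the marginal percentages are relative to (C) rather than (A), and that the categories overlap, so one ZMW can be counted under several reasons. The variable and function names are illustrative.

```python
# Counts copied from the lima summary in this issue.
zmws_input = 1939872   # (A)
zmws_pass = 452638     # (B)
zmws_fail = 1487234    # (C)

# The four largest marginals for (C).
marginals = {
    "Below min end score": 390827,
    "Below min ref span": 750683,
    "Undesired 5p--5p pairs": 713200,
    "Undesired 3p--3p pairs": 667027,
}


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# (A) splits exactly into (B) and (C).
consistent = zmws_pass + zmws_fail == zmws_input

# Pass rate relative to all input ZMWs.
pass_pct = percent(zmws_pass, zmws_input)

# Marginals expressed relative to the failing ZMWs (C).
marginal_pct = {name: percent(n, zmws_fail) for name, n in marginals.items()}

# If the marginals sum to more than (C), the categories must overlap.
categories_overlap = sum(marginals.values()) > zmws_fail

print(consistent, pass_pct, marginal_pct, categories_overlap)
```

Under this reading, the same-primer pairs (5p--5p and 3p--3p) account for roughly half of the rejected ZMWs each, which is where a manual check of the reads would start.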
To Reproduce
lima tissue.ccs.bam primers.fa $dir/$name.$tissue1.fl.bam --isoseq --num-threads $threads
Expected behavior
I expected more ZMWs to be above all thresholds.
Please check your data manually to see whether you find those primer combinations. It's HiFi data, so a grep works most of the time.
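One way to do the manual check suggested above is an exact-match count, which is the same idea as a grep. The following is a minimal sketch, assuming the read sequences are already available as plain strings (for example, column 10 of `samtools view` output for a subset of reads, which assumes samtools is installed). The helper names `revcomp` and `count_primer_hits` are hypothetical, and exact matching will miss primers that contain sequencing errors.

```python
# Primer sequences copied from the issue.
PRIMERS = {
    "NEB_5p": "GCAATGAAGTCGCAGGGTTGGG",
    "Clontech_5p": "AAGCAGTGGTATCAACGCAGAGTACATGGGG",
    "NEB_Clontech_3p": "GTACTCTGCGTTGATACCACTGCTT",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_primer_hits(reads):
    """Count reads containing each primer, as written and reverse-complemented.

    `reads` is any iterable of sequence strings. Returns (total_reads, counts),
    where counts[name] holds the number of reads with an exact match in the
    "forward" (as written) and "revcomp" orientation.
    """
    counts = {name: {"forward": 0, "revcomp": 0} for name in PRIMERS}
    total = 0
    for read in reads:
        read = read.upper()
        total += 1
        for name, primer in PRIMERS.items():
            if primer in read:
                counts[name]["forward"] += 1
            if revcomp(primer) in read:
                counts[name]["revcomp"] += 1
    return total, counts
```

When interpreting the counts, note that the reverse complement of NEB_Clontech_3p is identical to the first 25 bases of Clontech_5p, so every read that contains Clontech_5p as written is also counted as a reverse-complement hit for NEB_Clontech_3p. Comparing the NEB_5p counts against the Clontech_5p counts should indicate which 5' primer is actually present in the library.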