CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

Error: UMI lengths are not the same! #515

Closed iammrtza closed 2 years ago

iammrtza commented 2 years ago

Hi dear,

I am getting the same error:

Traceback (most recent call last):
  File "/Users/morteza/opt/anaconda3/bin/umi_tools", line 11, in <module>
    sys.exit(main())
  File "/Users/morteza/opt/anaconda3/lib/python3.8/site-packages/umi_tools/umi_tools.py", line 61, in main
    module.main(sys.argv)
  File "/Users/morteza/opt/anaconda3/lib/python3.8/site-packages/umi_tools/dedup.py", line 329, in main
    reads, umis, umi_counts = processor(
  File "/Users/morteza/opt/anaconda3/lib/python3.8/site-packages/umi_tools/network.py", line 419, in __call__
    clusters = self.UMIClusterer(counts, threshold)
  File "/Users/morteza/opt/anaconda3/lib/python3.8/site-packages/umi_tools/network.py", line 367, in __call__
    assert max(len_umis) == min(len_umis), (
AssertionError: not all umis are the same length(!): 4 - 12

My BAM file looks like this (mapping was done with bowtie1):

seq32063_TCCCCGCCC 0 hsa-let-7a-5p 1 255 22M * 0 0 TGAGGTAGTAGGTTGTATAGTT IIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:22 NM:i:0
seq34741_GGCTTAAGCAC 0 hsa-let-7a-5p 1 255 20M * 0 0 TGAGGTAGTAGGTTGTATAG IIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:20 NM:i:0
seq37476_ACGCCTCCGCCC 0 hsa-let-7a-5p 1 255 5M * 0 0 TGAGG IIIII XA:i:0 MD:Z:5 NM:i:0
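The three read names shown here already carry UMIs of different lengths after the underscore (9, 11 and 12 bases), which is the condition the assertion in the traceback rejects. A minimal sketch for tallying UMI lengths from read names follows. It assumes the UMI is whatever comes after the last "_" in the read name; `umi_length_counts` is a hypothetical helper written for this illustration and is not part of UMI-tools.

```python
from collections import Counter


def umi_length_counts(read_names, sep="_"):
    """Tally UMI lengths across read names.

    Assumption: the UMI is the text after the last `sep` in each name.
    """
    lengths = Counter()
    for name in read_names:
        umi = name.rsplit(sep, 1)[-1]
        lengths[len(umi)] += 1
    return lengths


# Read names copied from the three records in this post.
read_names = [
    "seq32063_TCCCCGCCC",
    "seq34741_GGCTTAAGCAC",
    "seq37476_ACGCCTCCGCCC",
]

counts = umi_length_counts(read_names)
print(dict(sorted(counts.items())))  # {9: 1, 11: 1, 12: 1}
```

Running the same tally over every read name in the full file (for example, on the first column of the SAM text) would show which lengths occur and how often, including the 4-base UMIs the error message mentions.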

Could you please help me with this? Thank you.

TomSmithCGAT commented 2 years ago

Hi @iammrtza. I'm closing this since you've already raised the same issue in #461 and I've started answering there.