nedialkova-lab / mim-tRNAseq

Modification-induced misincorporation tRNA sequencing
GNU General Public License v3.0

KeyError #34

Closed kopardev closed 2 years ago

kopardev commented 2 years ago

I have been getting the following errors in three separate mimseq runs, each on a different human dataset:

+-----------------------------------+
| Final deconvolution and filtering |
+-----------------------------------+
2022-07-10 12:04:32,132 [INFO ] 234 clusters filtered out according to minimum coverage threshold: 0.05% of total tRNA coverage.
2022-07-10 12:05:19,876 [INFO ] Output final tables, counts and new mods...
2022-07-10 12:05:23,815 [INFO ] ** Read counts per anticodon saved to Tet2KOvsWT/counts/Anticodon_counts_raw.txt **
2022-07-10 12:05:23,815 [INFO ] ** Read counts per isodecoder saved to Tet2KOvsWT/counts/Isodecoder_counts_raw.txt **
2022-07-10 12:05:23,815 [INFO ] 11 sequences could not be deconvoluted as >10% parent assigned-reads do not match parent sequence!
2022-07-10 12:05:23,815 [INFO ] Reasons for this include misaligment of reads to an incorrect cluster, or inaccurate aligment by GSNAP (e.g. at indels in reads) prohibiting correct devonvolution.
2022-07-10 12:05:23,815 [INFO ] 79 total unique sequences not deconvoluted due to mismatches at modified sites, insufficient coverage or read mismatches to parent
Traceback (most recent call last):
  File "/opt2/conda/envs/mimseq/bin/mimseq", line 10, in <module>
    sys.exit(main())
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/mimseq.py", line 410, in main
    args.misinc_thresh, args.mito, args.pretrnas, args.local_mod, args.p_adj, args.sampledata)
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/mimseq.py", line 161, in mimseq
    isodecoder_sizes = writeIsodecoderInfo(out, name, isodecoder_sizes,readRef_unsplit_newNames, tRNA_dict)
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/splitClusters.py", line 477, in writeIsodecoderInfo
    del isodecoder_sizes[gene]
KeyError: 'Homo_sapiens_tRNA-Glu-TTC-1-1'
+-----------------------------------+
| Final deconvolution and filtering |
+-----------------------------------+
2022-07-10 12:04:37,098 [INFO ] 234 clusters filtered out according to minimum coverage threshold: 0.05% of total tRNA coverage.
2022-07-10 12:05:24,873 [INFO ] Output final tables, counts and new mods...
2022-07-10 12:05:28,852 [INFO ] ** Read counts per anticodon saved to Tet1n2DKOvsWT/counts/Anticodon_counts_raw.txt **
2022-07-10 12:05:28,852 [INFO ] ** Read counts per isodecoder saved to Tet1n2DKOvsWT/counts/Isodecoder_counts_raw.txt **
2022-07-10 12:05:28,852 [INFO ] 11 sequences could not be deconvoluted as >10% parent assigned-reads do not match parent sequence!
2022-07-10 12:05:28,852 [INFO ] Reasons for this include misaligment of reads to an incorrect cluster, or inaccurate aligment by GSNAP (e.g. at indels in reads) prohibiting correct devonvolution.
2022-07-10 12:05:28,852 [INFO ] 79 total unique sequences not deconvoluted due to mismatches at modified sites, insufficient coverage or read mismatches to parent
Traceback (most recent call last):
  File "/opt2/conda/envs/mimseq/bin/mimseq", line 10, in <module>
    sys.exit(main())
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/mimseq.py", line 410, in main
    args.misinc_thresh, args.mito, args.pretrnas, args.local_mod, args.p_adj, args.sampledata)
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/mimseq.py", line 161, in mimseq
    isodecoder_sizes = writeIsodecoderInfo(out, name, isodecoder_sizes,readRef_unsplit_newNames, tRNA_dict)
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/splitClusters.py", line 477, in writeIsodecoderInfo
    del isodecoder_sizes[gene]
KeyError: 'Homo_sapiens_tRNA-Val-CAC-1-1'
+-----------------------------------+
| Final deconvolution and filtering |
+-----------------------------------+
2022-07-10 12:04:59,814 [INFO ] 234 clusters filtered out according to minimum coverage threshold: 0.05% of total tRNA coverage.
2022-07-10 12:05:47,883 [INFO ] Output final tables, counts and new mods...
2022-07-10 12:05:51,824 [INFO ] ** Read counts per anticodon saved to Tet2n3DKOvsWT/counts/Anticodon_counts_raw.txt **
2022-07-10 12:05:51,824 [INFO ] ** Read counts per isodecoder saved to Tet2n3DKOvsWT/counts/Isodecoder_counts_raw.txt **
2022-07-10 12:05:51,824 [INFO ] 11 sequences could not be deconvoluted as >10% parent assigned-reads do not match parent sequence!
2022-07-10 12:05:51,824 [INFO ] Reasons for this include misaligment of reads to an incorrect cluster, or inaccurate aligment by GSNAP (e.g. at indels in reads) prohibiting correct devonvolution.
2022-07-10 12:05:51,824 [INFO ] 79 total unique sequences not deconvoluted due to mismatches at modified sites, insufficient coverage or read mismatches to parent
Traceback (most recent call last):
  File "/opt2/conda/envs/mimseq/bin/mimseq", line 10, in <module>
    sys.exit(main())
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/mimseq.py", line 410, in main
    args.misinc_thresh, args.mito, args.pretrnas, args.local_mod, args.p_adj, args.sampledata)
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/mimseq.py", line 161, in mimseq
    isodecoder_sizes = writeIsodecoderInfo(out, name, isodecoder_sizes,readRef_unsplit_newNames, tRNA_dict)
  File "/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/splitClusters.py", line 477, in writeIsodecoderInfo
    del isodecoder_sizes[gene]
KeyError: 'Homo_sapiens_tRNA-Val-CAC-1-1'

Any recommendations on how I can get around this error?

drewjbeh commented 2 years ago

Hi

We actually noticed this error recently as well, but for a different genome. It's interesting that this is happening for human, but hopefully the fix we introduced helps you too. I have pushed some new changes to the GitHub repo, specifically to splitClusters.py. Try pulling the repo and running that version, or just replace your installed splitClusters.py file with the one from the repo.
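For context, the traceback ends at `del isodecoder_sizes[gene]`, which raises `KeyError` whenever `gene` is not (or is no longer) a key in the dictionary. The sketch below only illustrates that failure mode and one defensive way around it; it is not the change pushed to the repo. The helper name `drop_genes` and the example sizes are hypothetical.

```python
def drop_genes(isodecoder_sizes, genes):
    """Delete each gene from the dict; return the names that were not present.

    Hypothetical helper for illustration, not part of mimseq.
    """
    missing = []
    for gene in genes:
        if gene in isodecoder_sizes:
            del isodecoder_sizes[gene]
        else:
            # A bare `del isodecoder_sizes[gene]` would raise KeyError here
            missing.append(gene)
    return missing


# Made-up sizes; only the gene names come from the tracebacks above
sizes = {
    "Homo_sapiens_tRNA-Glu-TTC-1-1": 3,
    "Homo_sapiens_tRNA-Val-CAC-1-1": 2,
}
# Listing the same gene twice is one way an unguarded delete can fail
missing = drop_genes(
    sizes,
    ["Homo_sapiens_tRNA-Val-CAC-1-1", "Homo_sapiens_tRNA-Val-CAC-1-1"],
)
print(missing)
```

Collecting the missing names instead of silently ignoring them keeps the skipped genes visible, which matters here because a missing key may point to an upstream naming or deconvolution problem rather than something safe to ignore.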

Hope that helps.

kopardev commented 2 years ago

Thanks for your quick response, @drewjbeh. I will try the fix and report back.
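Before swapping the file, it may help to confirm which copy of splitClusters.py the active environment actually imports (in the tracebacks above it lives under `/opt2/conda/envs/mimseq/lib/python3.7/site-packages/mimseq/`). A minimal sketch using only the standard library; the helper name `module_file` is hypothetical:

```python
import importlib.util
from pathlib import Path


def module_file(package, module):
    """Return the file Python would load for `package.module`, or None.

    Note: looking up a submodule imports the parent package first.
    """
    try:
        spec = importlib.util.find_spec(f"{package}.{module}")
    except ModuleNotFoundError:
        # The parent package is not installed in this environment
        return None
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin)


path = module_file("mimseq", "splitClusters")
print(path if path else "mimseq is not importable from this environment")
```

Run it with the same Python interpreter that runs `mimseq`, so the reported path is the file that needs replacing.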