CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

KeyError: "tag 'NH' not present" #559

Closed (shah0829 closed this issue 1 year ago)

shah0829 commented 2 years ago

I am trying to run velocyto (CLI) on BAM files to generate loom files with spliced and unspliced information. I am using the following command:

velocyto run-smartseq2 -o OUTPUT -e *sorted.bam mm10.gtf

I am running into the problem below. Can anyone help and guide me on what to do?

2022-09-23 18:28:46,737 - DEBUG - Mask available for chromosomes : []
2022-09-23 18:28:46,737 - DEBUG - Summarizing the results of intron validation.
2022-09-23 18:28:46,813 - DEBUG - Validated 0 introns (of which unique intervals 0) out of 382940 total possible introns (considering each possible transcript models).
2022-09-23 18:28:46,813 - DEBUG - Reading /home/.../WTCHG_946836_70015073.bam
2022-09-23 18:28:46,818 - DEBUG - Read first 0 million reads
Traceback (most recent call last):
  File "/home/sali/.local/bin/velocyto", line 8, in <module>
    sys.exit(cli())
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/sali/.local/lib/python3.8/site-packages/velocyto/commands/run_smartseq2.py", line 70, in run_smartseq2
    return _run(bamfile=bamfiles, gtffile=gtffile, bcfile=None, outputfolder=outputfolder,
  File "/home/sali/.local/lib/python3.8/site-packages/velocyto/commands/_run.py", line 229, in _run
    results = exincounter.count(bamfile_cellsorted, multimap=multimap)  # NOTE: we would avoid some millions of if statements evaluations if we write two function count and count_with output
  File "/home/sali/.local/lib/python3.8/site-packages/velocyto/counter.py", line 756, in count
    for r in self.iter_alignments(bamfile, unique=not multimap):
  File "/home/sali/.local/lib/python3.8/site-packages/velocyto/counter.py", line 258, in iter_alignments
    if unique and read.get_tag("NH") != 1:
  File "pysam/libcalignedsegment.pyx", line 2503, in pysam.libcalignedsegment.AlignedSegment.get_tag
  File "pysam/libcalignedsegment.pyx", line 2542, in pysam.libcalignedsegment.AlignedSegment.get_tag
KeyError: "tag 'NH' not present"

Thanks in advance!

IanSudbery commented 2 years ago

Hi. I'm pretty sure this is not anything to do with UMI-tools?

shah0829 commented 2 years ago

Thanks, @IanSudbery, for replying.

I am not sure either (and I also don't think this has anything to do with UMI-tools), because I received these BAM files from our collaborator, as produced by whatever sequencing company they used, and the BAM files are all I have. So I have no idea, and I would really appreciate any suggestions.
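
For context, the traceback shows velocyto calling `read.get_tag("NH")` while iterating over alignments with `unique=not multimap`, so the error means at least one read in this BAM carries no `NH` tag. `NH` is an optional SAM field (the number of reported alignments for the query), and not every aligner writes it. Below is a minimal sketch for counting how many records lack the tag, assuming the alignments have been dumped to SAM text first (for example with `samtools view`). The helper names `get_tag` and `count_missing_tag` and the example records are hypothetical, made up for illustration; this is a diagnostic only, not a fix.

```python
from typing import Iterable, Optional, Tuple


def get_tag(sam_line: str, tag: str) -> Optional[str]:
    """Return the value of an optional SAM tag as a string, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns of a SAM record are mandatory;
    # optional fields follow in TAG:TYPE:VALUE form.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] == tag:
            return parts[2]
    return None


def count_missing_tag(sam_lines: Iterable[str], tag: str = "NH") -> Tuple[int, int]:
    """Return (total records, records lacking `tag`), skipping '@' header lines."""
    total = 0
    missing = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        if get_tag(line, tag) is None:
            missing += 1
    return total, missing


# Made-up records for illustration only; real input would come from the BAM file.
EXAMPLE = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1\tNM:i:0",
    "read2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
]

if __name__ == "__main__":
    total, missing = count_missing_tag(EXAMPLE)
    print(f"{missing} of {total} records lack the NH tag")
```

If every record lacks `NH`, the aligner that produced the files most likely never wrote it, and the question of how the reads were aligned is one for the collaborator or the sequencing provider.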