NUStatBioinfo / DegNorm

Normalizing RNA degradation in RNA-seq data
https://nustatbioinfo.github.io/DegNorm/

IndexError: list index out of range (DegNorm) #36

Closed miyakokodama closed 4 years ago

miyakokodama commented 4 years ago

Hi

I am currently running DegNorm (version 0.1.4) on .bam files that have been aligned to a genome using STAR (I have 414 samples).

I am getting the following error message:

DegNorm (12/03/2019 12:53:35) ---- creating index file for /home/projects/S_373_Aligned_REAL.sortedByCoord.out.bam -- 414 / 414
DegNorm (12/03/2019 01:10:14) ---- SAMPLE S_373_Aligned_REAL.sortedByCoord.out -- sample contains single-end reads
DegNorm (12/03/2019 01:10:15) ---- Begin genome annotation file processing...
DegNorm (12/03/2019 01:10:15) ---- Loading genome annotation file /home/projects/GCF_000233375.1_ICSASG_v2_genomic.gtf...
Traceback (most recent call last):
  File "/services/tools/anaconda3/4.0.0/bin/degnorm", line 8, in <module>
    sys.exit(main())
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/degnorm/main.py", line 84, in main
    exon_df = gap.run()
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/degnorm/gene_processing.py", line 107, in run
    exon_df = self.load()
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/degnorm/gene_processing.py", line 35, in load
    exon_df = self.loader.get_data()
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/degnorm/loaders.py", line 147, in get_data
    df['gene'] = df.attribute.apply(lambda x: self._attribute_to_gene(x, exprs=find_me))
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/pandas/core/series.py", line 4045, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/lib.pyx", line 2228, in pandas._libs.lib.map_infer
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/degnorm/loaders.py", line 147, in <lambda>
    df['gene'] = df.attribute.apply(lambda x: self._attribute_to_gene(x, exprs=find_me))
  File "/services/tools/anaconda3/4.0.0/lib/python3.5/site-packages/degnorm/loaders.py", line 106, in _attribute_to_gene
    gene_matches = list(filter(exprs[expr_idx].match, splt))
IndexError: list index out of range

Is this an issue with my .gtf file, or .bam file containing single-end reads?

Any help you could provide would be greatly appreciated.

Thanks! Miyako

ffineis commented 4 years ago

Hi Miyako, apologies for the delay. I'll get to this over the weekend.

ffineis commented 4 years ago

Hey Miyako,

Thanks for using DegNorm. This looks like an issue with your .gtf file. Please make sure that the attribute field of every row in the .gtf file contains either a gene_id or a gene_name attribute (see example). If you run into further issues, please post the first few lines of your .gtf.
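A quick way to check for this is sketched below. It is only a rough illustration, not part of DegNorm: it assumes a standard 9-column, tab-separated GTF with the attributes in the last column, and the function name find_rows_missing_gene is made up for this example.

```python
import re
import sys

# Matches an attribute of the form gene_id "X" or gene_name "X".
GENE_ATTR = re.compile(r'\b(gene_id|gene_name)\s+"[^"]*"')


def find_rows_missing_gene(lines):
    """Return (line_number, line) pairs for GTF rows that have
    neither a gene_id nor a gene_name attribute.

    Hypothetical helper for illustration; assumes 9 tab-separated
    columns with the attribute string in column 9.
    """
    missing = []
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Rows with fewer than 9 columns are reported as well.
        if len(fields) < 9 or not GENE_ATTR.search(fields[8]):
            missing.append((n, line))
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_gtf.py annotation.gtf  (script name is arbitrary)
    with open(sys.argv[1]) as handle:
        for number, row in find_rows_missing_gene(handle):
            print("line {}: {}".format(number, row))
```

Any row it prints is one where neither attribute was found, so those are the rows worth inspecting first.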

miyakokodama commented 4 years ago

Hi, thanks for your reply! I found out that a few rows had only transcript_id, with neither gene_id nor gene_name in the attribute field. I supplied these as unknown, and DegNorm seems to be running fine now. Thanks so much for your help!
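For anyone hitting the same error, the workaround can be sketched roughly as follows. This is one possible reading of "supplied these as unknown", not the exact commands used here: it assumes a 9-column tab-separated GTF, the helper name fill_missing_gene and the file names are hypothetical, and every patched row ends up sharing the same placeholder gene label, which may or may not be appropriate for downstream analysis.

```python
import re

# Matches an attribute of the form gene_id "X" or gene_name "X".
HAS_GENE = re.compile(r'\b(gene_id|gene_name)\s+"[^"]*"')


def fill_missing_gene(line, placeholder="unknown"):
    """Return the GTF line with placeholder gene_id/gene_name
    attributes prepended if it has neither; otherwise unchanged.

    Hypothetical helper for illustration only.
    """
    stripped = line.rstrip("\n")
    ending = line[len(stripped):]  # preserve the original line ending
    # Leave blank lines and header/comment lines untouched.
    if not stripped or stripped.startswith("#"):
        return line
    fields = stripped.split("\t")
    # Leave malformed rows and rows that already name a gene untouched.
    if len(fields) < 9 or HAS_GENE.search(fields[8]):
        return line
    fields[8] = 'gene_id "{0}"; gene_name "{0}"; {1}'.format(
        placeholder, fields[8].strip()
    )
    return "\t".join(fields) + ending


def patch_gtf(in_path, out_path):
    """Write a patched copy of in_path to out_path (both paths are examples)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for row in src:
            dst.write(fill_missing_gene(row))
```

Calling patch_gtf("annotation.gtf", "annotation.patched.gtf") would write a patched copy and leave the original file as it was.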