NUStatBioinfo / DegNorm

Normalizing RNA degradation in RNA-seq data
https://nustatbioinfo.github.io/DegNorm/

ValueError: Usecols do not match names. & ValueError: file {0} must have the 9 mandatory .gtf columns #28

Closed mhagemann86 closed 5 years ago

mhagemann86 commented 5 years ago

Hi,

I am having some issues running DegNorm, and I hope you can help me out with what might be trivial issues. Every time I try to run the data I get the following errors:

Traceback (most recent call last):
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/DegNorm-0.1.4-py3.6.egg/degnorm/loaders.py", line 134, in get_data
    , usecols=list(range(9)))
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/pandas/io/parsers.py", line 655, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/pandas/io/parsers.py", line 405, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/pandas/io/parsers.py", line 764, in __init__
    self._make_engine(self.engine)
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/pandas/io/parsers.py", line 985, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/pandas/io/parsers.py", line 1657, in __init__
    raise ValueError("Usecols do not match names.")
ValueError: Usecols do not match names.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/bin/degnorm", line 11, in <module>
    load_entry_point('DegNorm==0.1.4', 'console_scripts', 'degnorm')()
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/DegNorm-0.1.4-py3.6.egg/degnorm/main.py", line 85, in main
    exon_df = gap.run()
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/DegNorm-0.1.4-py3.6.egg/degnorm/gene_processing.py", line 107, in run
    exon_df = self.load()
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/DegNorm-0.1.4-py3.6.egg/degnorm/gene_processing.py", line 35, in load
    exon_df = self.loader.get_data()
  File "/mnt/kauffman/michaelhj/programs/anaconda3/envs/degnorm/lib/python3.6/site-packages/DegNorm-0.1.4-py3.6.egg/degnorm/loaders.py", line 136, in get_data
    raise ValueError('file {0} must have the 9 mandatory .gtf columns.'
ValueError: file {0} must have the 9 mandatory .gtf columns.Read more at https://useast.ensembl.org/info/website/upload/gff.html

I have tried several gtf files, and made sure they have the correct format. Could there be something I am missing?

Thanks in advance for your help /Michael

ffineis commented 5 years ago

Hey Michael,

That error message wasn't formatted correctly - the "{0}" should be the filepath to the supplied .gtf file. As instructed by the remainder of the error message, if you visit https://useast.ensembl.org/info/website/upload/gff.html, look under the "Fields" section and you'll see the 9 requisite .gtf file fields.
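To illustrate what "wasn't formatted correctly" means here, the following is a minimal sketch of how a "{0}" placeholder ends up in an error message when the template string is never passed through str.format. The helper name missing_columns_message and the path annotation.gtf are hypothetical and only for illustration; this is not DegNorm's actual code.

```python
# Hypothetical illustration, not DegNorm's actual implementation.
TEMPLATE = ('file {0} must have the 9 mandatory .gtf columns. '
            'Read more at https://useast.ensembl.org/info/website/upload/gff.html')


def missing_columns_message(gtf_path):
    """Fill the {0} placeholder with the path of the offending file."""
    return TEMPLATE.format(gtf_path)


# Without .format(), the placeholder is shown to the user literally.
unformatted = TEMPLATE
# With .format(), the user sees which file failed the check.
formatted = missing_columns_message('annotation.gtf')

print(unformatted)
print(formatted)
```

Either way the underlying problem is the same: the loader could not read 9 columns from the supplied file.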

Use head to check that your .gtf file doesn't have comments at the top (some .gff/.gtf converters add excess leading garbage to the top of their output). There is also an example of a typical .gtf file that we use for DegNorm testing, available at tests/data/chr1_small.gtf.
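As a complement to inspecting the file with head, here is a rough Python sketch of the same check. It counts the comment or blank lines at the top of a file and reports any data lines that do not split into exactly 9 tab-separated fields. The function name inspect_gtf and the sample content are assumptions made for this example; the script is not part of DegNorm and does not reproduce how DegNorm itself parses the file.

```python
import os
import tempfile

# The 9 fields listed on the Ensembl GFF/GTF page:
# seqname, source, feature, start, end, score, strand, frame, attribute
GTF_COLUMNS = 9


def inspect_gtf(path, max_lines=1000):
    """Return (leading_skipped, bad_lines) for the first max_lines lines.

    leading_skipped counts comment ('#') or blank lines before the first
    data line. bad_lines lists (line_number, n_fields) for data lines that
    do not have exactly 9 tab-separated fields.
    """
    leading_skipped = 0
    seen_data = False
    bad_lines = []
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            if line_number > max_lines:
                break
            stripped = line.rstrip("\n")
            if stripped.startswith("#") or not stripped.strip():
                if not seen_data:
                    leading_skipped += 1
                continue
            seen_data = True
            n_fields = len(stripped.split("\t"))
            if n_fields != GTF_COLUMNS:
                bad_lines.append((line_number, n_fields))
    return leading_skipped, bad_lines


# Made-up sample: two header lines, one valid row, one space-separated row.
sample = (
    "# produced by a hypothetical converter\n"
    "# another header line\n"
    'chr1\thavana\texon\t11869\t12227\t.\t+\t.\tgene_id "g1";\n'
    'chr1 havana exon 12613 12721 . + . gene_id "g1";\n'
)
with tempfile.NamedTemporaryFile("w", suffix=".gtf", delete=False) as tmp:
    tmp.write(sample)

leading, bad = inspect_gtf(tmp.name)
os.remove(tmp.name)
print(leading, bad)
```

If the first value is nonzero, the file starts with header or comment lines that may need to be stripped; any entries in the second value point to rows that are not tab-delimited with 9 fields, which is a common result of editing a .gtf file in a tool that converts tabs to spaces.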