Closed cxue closed 6 years ago
Hi, Cheng!
It looks like your GTF file is space-delimited, not tab-delimited. Is this true? If so, that would result in that error. Was the GTF file downloaded directly from GENCODE or from another source?
Thanks!
Andy
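One way to answer the tab-versus-space question directly is to count the tab-separated fields on each annotation line. The following is only a minimal sketch, not part of TSScall: the function name `find_malformed_gtf_lines` is hypothetical, and it assumes the usual GTF layout of nine tab-separated columns with `#` comment lines.

```python
def find_malformed_gtf_lines(lines, expected_fields=9):
    """Return (line_number, field_count) for every non-comment line
    that does not split into `expected_fields` tab-separated columns.

    Hypothetical helper for illustration; not a TSScall function.
    """
    malformed = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\n")
        # Skip blank lines and header/comment lines such as "##description".
        if not stripped or stripped.startswith("#"):
            continue
        count = len(stripped.split("\t"))
        if count != expected_fields:
            malformed.append((number, count))
    return malformed


if __name__ == "__main__":
    sample = [
        "##description: example header\n",
        'chr1\tHAVANA\tgene\t3073253\t3074322\t.\t+\t.\tgene_id "ENSMUSG00000102693.1";\n',
        # Space-delimited line: splitting on tabs yields a single field.
        'chr1 HAVANA gene 3073253 3074322 . + . gene_id "ENSMUSG00000102693.1";\n',
    ]
    for number, count in find_malformed_gtf_lines(sample):
        print(f"line {number}: {count} tab-separated field(s)")
```

To check a real file, pass an open file handle instead of the sample list, for example `find_malformed_gtf_lines(open("gencode.vM15.annotation.gtf"))`. An empty result means every annotation line has nine tab-separated columns.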
Hi, Andy,
The GTF is tab-delimited. Please see the following command:
[cxue@smvxu TSScall-master]$ awk -F "\t" '{print $1"\t"$4;}' gencode.vM15.annotation.gtf|more
chr1 3073253
chr1 3073253
chr1 3073253
chr1 3102016
chr1 3102016
The gencode.vM15.annotation.gtf was downloaded from https://www.gencodegenes.org/mouse_releases/. I also tried the Ensembl GTF (http://aug2017.archive.ensembl.org/info/data/ftp/index.html) and got the same message.
thanks Cheng
Hi, Andy,
When I removed the last several lines of the GTF file, I got the following error message:
Reading in bedGraph files...
Calculating read threshold...
Read threshold set to 3
Reading in annotation file...
Traceback (most recent call last):
File "TSScall.py", line 977, in <module>
thanks
Cheng
Hi Andy, I see what's wrong. The code does not check for attribute values that have no quotation marks, as in the case of "level 2;". You should also set default values for important keys, such as transcript_id. Otherwise, the program will not run successfully for the user.
best
Cheng
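The two suggestions above (accept unquoted values such as `level 2;` and fall back to defaults for keys such as `transcript_id`) could look roughly like the sketch below. This is an illustration under stated assumptions, not the fix that was committed to TSScall: the function name `parse_gtf_attributes` and the `defaults` argument are hypothetical.

```python
def parse_gtf_attributes(field, defaults=None):
    """Parse the ninth GTF column into a dict.

    Hypothetical helper for illustration; not TSScall's actual code.
    Handles quoted values (gene_id "X";) and unquoted ones (level 2;).
    Keys missing from the field keep the value given in `defaults`.
    """
    attributes = dict(defaults or {})
    # Simplification: assumes no ';' appears inside a quoted value.
    for entry in field.strip().split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first space only; the rest is the value.
        key, _, value = entry.partition(" ")
        # Stripping quotes is a no-op for unquoted values like "2".
        attributes[key] = value.strip().strip('"')
    return attributes


if __name__ == "__main__":
    gene_line_attributes = (
        'gene_id "ENSMUSG00000102693.1"; gene_type "TEC"; '
        'gene_name "4933401J01Rik"; level 2; '
        'havana_gene "OTTMUSG00000049935.1";'
    )
    parsed = parse_gtf_attributes(gene_line_attributes,
                                  defaults={"transcript_id": ""})
    print(parsed["level"])          # unquoted value
    print(repr(parsed["transcript_id"]))  # default, absent on gene lines
```

Repeated keys (GENCODE transcript lines can carry several `tag` entries) overwrite each other in this sketch; collecting them into a list would be a straightforward extension.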
Hi, Cheng!
I fixed the code so it should now work with your GTF file. Please run using the version from the latest commit.
Thanks!
Andy
Hi, Lavenderca,
I like this program. I have a question about using it. I use GENCODE M15 as the reference annotation file, in GTF format. The format is:
chr1 HAVANA gene 3073253 3074322 . + . gene_id "ENSMUSG00000102693.1"; gene_type "TEC"; gene_name "4933401J01Rik"; level 2; havana_gene "OTTMUSG00000049935.1";
chr1 HAVANA transcript 3073253 3074322 . + . gene_id "ENSMUSG00000102693.1"; transcript_id "ENSMUST00000193812.1"; gene_type "TEC"; gene_name "4933401J01Rik"; transcript_type "TEC"; transcript_name "4933401J01Rik-201"; level 2; transcript_support_level "NA"; tag "basic"; havana_gene "OTTMUSG00000049935.1"; havana_transcript "OTTMUST00000127109.1";
But when I run TSScall with the command
python TSScall.py -a gencode.vM15.annotation.gtf data.forward.mm10.bed data.reverse.mm10.bed mm10_chrom.sizes.txt TSS.annotated.bed
I get the following error message:
Reading in bedGraph files...
Calculating read threshold...
Read threshold set to 3
Reading in annotation file...
Traceback (most recent call last):
File "TSScall.py", line 977, in <module>
TSSCalling(**vars(args))
File "TSScall.py", line 182, in __init__
self.execute()
File "TSScall.py", line 904, in execute
readInReferenceAnnotation(self.annotation_file)
File "TSScall.py", line 72, in readInReferenceAnnotation
attributes = line.strip().split('\t')
ValueError: need more than 1 value to unpack
Could you help me? Thanks
Cheng Xue