velocyto-team / velocyto.py

RNA velocity estimation in Python
http://velocyto.org/velocyto.py/
BSD 2-Clause "Simplified" License

velocyto `run10x` error: Genome annotation gtf file is not sorted correctly! #331

Open jiangpuxuan opened 2 years ago

jiangpuxuan commented 2 years ago

I used my customized .gtf file to run the 10x scRNA pipeline and everything was OK. However, when I ran velocyto run10x, I got this error:

OSError: Genome annotation gtf file is not sorted correctly! Run the following command:
sort -k1,1 -k7,7 -k4,4n -o [GTF_OUTFILE] [GTF_INFILE]

I ran `sort -k1,1 -k7,7 -k4,4n -o [GTF_OUTFILE] [GTF_INFILE]`, but run10x failed again. I did not use the -m parameter, so the format of mask.gtf has nothing to do with this run10x problem.
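For anyone who wants to reproduce the suggested ordering without relying on the shell, here is a minimal Python sketch of roughly the same sort: chromosome (column 1), then strand (column 7), then numeric start (column 4). The names `gtf_sort_key` and `sort_gtf_lines` are made up for this example and are not part of velocyto. Strings are compared by code point, so the result can differ from GNU `sort` under a non-C locale, and ties stay in input order instead of being broken by a whole-line comparison.

```python
from typing import Iterable, List, Tuple


def gtf_sort_key(line: str) -> Tuple[str, str, int]:
    """Key for one GTF record: (chromosome, strand, numeric start)."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], fields[6], int(fields[3]))


def sort_gtf_lines(lines: Iterable[str]) -> List[str]:
    """Return comment lines first, then records ordered by gtf_sort_key."""
    # Drop blank lines and materialise the input so it can be scanned twice.
    all_lines = [line for line in lines if line.strip()]
    comments = [line for line in all_lines if line.startswith("#")]
    records = [line for line in all_lines if not line.startswith("#")]
    # sorted() is stable: records with identical keys keep their input order.
    return comments + sorted(records, key=gtf_sort_key)
```

A possible use is to read `my.gtf` with `open(...)`, pass the file handle to `sort_gtf_lines`, and write the returned lines to a new file.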

Here is the output of `head my.gtf`:

"1"     "ensembl"       "gene"  20431   58833   "."     "-"     "."     "gene_id IGDB00001; gene_name IGDB00001"
"1"     "ensembl"       "gene"  92003   119620  "."     "-"     "."     "gene_id IGDB00002; gene_name IGDB00002"
"1"     "ensembl"       "gene"  103603  105216  "."     "+"     "."     "gene_id IGDB00003; gene_name IGDB00003"
"1"     "ensembl"       "gene"  121139  134255  "."     "-"     "."     "gene_id IGDB00004; gene_name IGDB00004"
"1"     "ensembl"       "gene"  121795  124788  "."     "+"     "."     "gene_id IGDB00005; gene_name IGDB00005"
"1"     "ensembl"       "gene"  152322  157371  "."     "+"     "."     "gene_id IGDB00006; gene_name IGDB00006"
"1"     "ensembl"       "gene"  153964  160887  "."     "-"     "."     "gene_id IGDB00007; gene_name IGDB00007"
"1"     "ensembl"       "gene"  202049  207179  "."     "+"     "."     "gene_id IGDB00008; gene_name IGDB00008"
"1"     "ensembl"       "gene"  203029  221904  "."     "-"     "."     "gene_id IGDB00009; gene_name IGDB00009"
"1"     "ensembl"       "gene"  278769  280886  "."     "+"     "."     "gene_id IGDB00010; gene_name IGDB00010"

Here is the head of my.gtf after running `sort -k1,1 -k7,7 -k4,4n -o [GTF_OUTFILE] [GTF_INFILE]`:

"000062F"       "StringTie"     "exon"  8693    11960   "1000.00"       "+"     "."     "gene_id 10297.1; transcript_id IGDB10297.1"
"000062F"       "StringTie"     "gene"  8693    11960   "1000.00"       "+"     "."     "gene_id IGDB10297; gene_name IGDB10297"
"000062F"       "StringTie"     "transcript"    8693    11960   "1000.00"       "+"     "."     "gene_id IGDB10297; transcript_id IGDB10297.1; gene_id IGDB10297; transcript_id IGDB10297.1"
"000062F"       "StringTie"     "exon"  1468    2226    "1000.00"       "-"     "."     "gene_id 10295.1; transcript_id IGDB10295.1"
"000062F"       "StringTie"     "gene"  1468    2226    "1000.00"       "-"     "."     "gene_id IGDB10295; gene_name IGDB10295"
"000062F"       "StringTie"     "transcript"    1468    2226    "1000.00"       "-"     "."     "gene_id IGDB10295; transcript_id IGDB10295.1; gene_id IGDB10295; transcript_id IGDB10295.1"
"000062F"       "StringTie"     "exon"  7600    7807    "1000.00"       "-"     "."     "gene_id 10296.1; transcript_id IGDB10296.1"
"000062F"       "StringTie"     "gene"  7600    7807    "1000.00"       "-"     "."     "gene_id IGDB10296; gene_name IGDB10296"
"000062F"       "StringTie"     "transcript"    7600    7807    "1000.00"       "-"     "."     "gene_id IGDB10296; transcript_id IGDB10296.1; gene_id IGDB10296; transcript_id IGDB10296.1"
"000076F"       "StringTie"     "exon"  15532   15624   "1000.00"       "+"     "."     "gene_id 10298.1; transcript_id IGDB10298.1"
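
To narrow down what a parser might be objecting to in a file like the one above, a small checker can help. The sketch below is not velocyto's own validation. It is a hypothetical helper, `find_gtf_problems`, that reports three things: records whose (chromosome, strand) block appears more than once, start positions that decrease within a block, and values in columns 1 to 8 that are wrapped in double quotes, as they are in the lines pasted here. Whether any of these triggers the error in this issue is an assumption that would need to be confirmed.

```python
from typing import Iterable, List, Optional, Set, Tuple


def find_gtf_problems(lines: Iterable[str]) -> List[str]:
    """Report ordering and quoting problems, one message per finding."""
    problems: List[str] = []
    seen_blocks: Set[Tuple[str, str]] = set()
    current_block: Optional[Tuple[str, str]] = None
    previous_start = 0

    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            problems.append(f"line {number}: expected 9 tab-separated fields, found {len(fields)}")
            continue
        if any(value.startswith('"') for value in fields[:8]):
            problems.append(f"line {number}: quoted value in columns 1-8")
        try:
            start = int(fields[3])
        except ValueError:
            problems.append(f"line {number}: start {fields[3]!r} is not an integer")
            continue

        block = (fields[0], fields[6])  # (chromosome, strand)
        if block != current_block:
            # A block seen earlier means the records are not contiguous.
            if block in seen_blocks:
                problems.append(f"line {number}: block {block} appears more than once")
            seen_blocks.add(block)
            current_block = block
        elif start < previous_start:
            problems.append(f"line {number}: start {start} is below previous start {previous_start}")
        previous_start = start

    return problems
```

Calling `find_gtf_problems(open("my.gtf"))` returns an empty list when none of the three conditions is found.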

yah-ox commented 2 years ago

Hi, I'm having the same issue. velocyto reported that my gtf file was not sorted correctly and suggested the same command. I ran `sort -k1,1 -k7,7 -k4,4n -o [GTF_OUTFILE] [GTF_INFILE]` and used the new gtf file, but the same error appeared again. May I ask whether you have figured it out?

Best, yh

jiangpuxuan commented 2 years ago

Sorry, I have not solved it yet.