alexdobin / STAR

RNA-seq aligner
MIT License

FATAL GENOME INDEX FILE ERROR: transcriptInfo.tab is corrupt, or is incompatible with the current STAR version #2171

Open alexdhill opened 4 months ago

alexdhill commented 4 months ago

I found a thread from 2020/2021 suggesting that this error results from incomplete index generation or from chromosome names that do not match between the annotation and the genome reference. Neither appears to apply to my data: the indexing log reports that it finished successfully, and the chromosome names are the same in both files (lists below). Indexing and the attempted alignment both used the same STAR version (2.7.11a).

STAR error log:

STAR --genomeDir reference.tmp/T2Tv2_index_v2.7.11a.star --readFilesIn read1.fq.gz read2.fq.gz --quantMode TranscriptomeSAM --outFileNamePrefix sample_name. --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --runThreadN 1
        STAR version: 2.7.11a   compiled: 2023-08-15T11:38:34-04:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
Jul 10 18:38:11 ..... started STAR run
Jul 10 18:38:12 ..... loading genome

EXITING because of FATAL GENOME INDEX FILE error: transcriptInfo.tab is corrupt, or is incompatible with the current STAR version
SOLUTION: re-generate genome index
Jul 10 18:39:47 ...... FATAL ERROR, exiting
Command exited with non-zero status 105

res memory (KB)     46868680
time (HH:mm:ss)     1:40.57
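
Since the error names transcriptInfo.tab, one way to look at that file directly is to compare the transcript count on its first line with the number of records that follow. This is only a sketch: it assumes the file starts with an integer count followed by one whitespace-separated record per transcript, which is not confirmed anywhere in this thread, and the function name `check_transcript_info` is hypothetical.

```python
from pathlib import Path


def check_transcript_info(path):
    """Compare the declared transcript count with the records actually present.

    Assumption (not confirmed by this thread): line 1 holds an integer count,
    and every following non-empty line is one transcript record.
    Returns (declared_count, records_found, sorted_distinct_field_counts).
    """
    lines = Path(path).read_text().splitlines()
    if not lines or not lines[0].strip():
        raise ValueError(f"{path} is empty or has no count line")
    declared = int(lines[0].split()[0])
    # Split each remaining non-empty line into whitespace-separated fields.
    records = [line.split() for line in lines[1:] if line.strip()]
    # A healthy file should have the same number of fields in every record.
    field_counts = sorted({len(record) for record in records})
    return declared, len(records), field_counts
```

Calling `check_transcript_info("T2Tv2_index_v2.7.11a.star/transcriptInfo.tab")` and finding that the declared and found counts differ, or that records have differing numbers of fields, would point toward a truncated or malformed file rather than a version mismatch.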

Genome indexing log:

STAR --runMode genomeGenerate --genomeDir T2Tv2_index_v2.7.11a.star --genomeFastaFiles genome.fa --sjdbGTFfile annotation.gtf --sjdbOverhang 149 --limitSjdbInsertNsj 6000000 --runThreadN 1
        STAR version: 2.7.11a   compiled: 2023-08-15T11:38:34-04:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
Jul 09 23:12:30 ..... started STAR run
Jul 09 23:12:30 ... starting to generate Genome files
Jul 09 23:13:44 ..... processing annotations GTF
Jul 09 23:16:27 ... starting to sort Suffix Array. This may take a long time...
Jul 09 23:17:27 ... sorting Suffix Array chunks and saving them to disk...
Jul 10 06:47:04 ... loading chunks from disk, packing SA...
Jul 10 06:50:23 ... finished generating suffix array
Jul 10 06:50:23 ... generating Suffix Array index
Jul 10 06:57:38 ... completed Suffix Array index
Jul 10 06:57:55 ..... inserting junctions into the genome indices
Jul 10 17:37:09 ... writing Genome to disk ...
Jul 10 17:37:15 ... writing Suffix Array to disk ...
Jul 10 17:38:04 ... writing SAindex to disk
Jul 10 17:38:11 ..... finished successfully

res memory (KB)     152013596
time (HH:mm:ss)     18:26:39
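
Although the log ends with "finished successfully", a quick check that the index directory actually contains non-empty files can rule out a partially written index (for example after a full disk). The sketch below is an assumption-laden helper, not part of STAR: `index_problems` is a hypothetical name, and `EXPECTED_FILES` is a subset of the files I would expect in an index built with a GTF, so adjust it to what your STAR version writes.

```python
from pathlib import Path

# Assumed subset of files in a STAR index generated with --sjdbGTFfile;
# edit this tuple to match the files your STAR version produces.
EXPECTED_FILES = (
    "Genome",
    "SA",
    "SAindex",
    "chrName.txt",
    "genomeParameters.txt",
    "exonInfo.tab",
    "geneInfo.tab",
    "transcriptInfo.tab",
    "sjdbList.out.tab",
)


def index_problems(index_dir, expected=EXPECTED_FILES):
    """Return {file_name: "missing" | "empty"} for files that look wrong."""
    problems = {}
    for name in expected:
        candidate = Path(index_dir) / name
        if not candidate.is_file():
            problems[name] = "missing"
        elif candidate.stat().st_size == 0:
            problems[name] = "empty"
    return problems
```

An empty dictionary from `index_problems("T2Tv2_index_v2.7.11a.star")` only shows that the files exist and have content; it says nothing about whether their content is valid.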

Chromosomes listed in annotation:

chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chrM
chrX
chrY

Chromosomes listed in reference genome:

>chr1
>chr2
>chr3
>chr4
>chr5
>chr6
>chr7
>chr8
>chr9
>chr10
>chr11
>chr12
>chr13
>chr14
>chr15
>chr16
>chr17
>chr18
>chr19
>chr20
>chr21
>chr22
>chrX
>chrY
>chrM
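
The two lists above were compared by eye; the same comparison can be scripted so that subtle differences (trailing whitespace, `chrM` vs `chrMT`, extra text in a FASTA header) are not missed. This is a minimal sketch with hypothetical function names. It takes the FASTA sequence name as the header text up to the first whitespace, which is how I understand STAR to name sequences, and it uses column 1 of every non-comment GTF line.

```python
import gzip


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path)


def fasta_names(path):
    """Collect sequence names: header text after '>' up to the first whitespace."""
    names = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gtf_names(path):
    """Collect the first column of every non-comment, non-empty GTF line."""
    names = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def compare_names(fasta_path, gtf_path):
    """Return (names only in the GTF, names only in the FASTA), both sorted."""
    fasta, gtf = fasta_names(fasta_path), gtf_names(gtf_path)
    return sorted(gtf - fasta), sorted(fasta - gtf)
```

For the files in this report, `compare_names("genome.fa", "annotation.gtf")` returning two empty lists would confirm that the names match exactly. Names that appear only in the GTF are the ones to look at first.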

alexdhill commented 4 months ago

Update: this error persists when the GTF is provided during alignment:

STAR --genomeDir ../../reference.tmp/T2Tv2_index_v2.7.11a.star --readFilesIn read1.fastq.gz read2.fastq.gz --sjdbGTFfile annotation.gtf --limitSjdbInsertNsj 6000000 --quantMode TranscriptomeSAM --outFileNamePrefix sample. --outSAMtype BAM SortedByCoordinate --runThreadN 8
        STAR version: 2.7.11a   compiled: 2023-09-15T03:04:06+0000 :/opt/conda/conda-bld/star_1694746407721/work/source
Jul 11 11:33:41 ..... started STAR run
Jul 11 11:33:41 ..... loading genome
Jul 11 11:36:25 ..... processing annotations GTF
Jul 11 11:40:55 ..... inserting junctions into the genome indices

EXITING because of FATAL GENOME INDEX FILE error: transcriptInfo.tab is corrupt, or is incompatible with the current STAR version
SOLUTION: re-generate genome index
Jul 11 12:00:20 ...... FATAL ERROR, exiting