STOmics / SAW

GNU General Public License v3.0

SAW 7.0 - Can't generate genome index #77

Closed (gringer closed this issue 7 months ago)

gringer commented 7 months ago

I can't get past the second step of building the genome index.

I've tried to reduce this to the simplest case: creating a genome index without GTF information, in the current directory on a local file system. It still doesn't work:

deccles@malaghan.org.nz@big-bird:/mnt/md0/deccles/STAR$ singularity exec /mnt/md0/deccles/singularityImages/SAW_7.0.sif mapping --runMode genomeGenerate --genomeDir /mnt/md0/deccles/STAR/GRCm38_SJ100 --genomeFastaFiles $(readlink -e ./Mus_musculus.GRCm38.dna.primary_assembly.fa) --runThreadN 12

--- cmd:  /opt/saw_v7.0.0_software/pipeline/mapping/bcSTAR --runMode genomeGenerate --genomeDir /mnt/md0/deccles/STAR/GRCm38_SJ100 --genomeFastaFiles /mnt/md0/deccles/STAR/Mus_musculus.GRCm38.dna.primary_assembly.fa --runThreadN 12

--- compile: Tue Jul 4 13:31:51 CST 2023
--- Info: v2.1.1 (based on STAR 2.7.2b) is released after optimization by BGI. - 
Nov 16 10:34:34 ..... started bcSTAR run
Nov 16 10:34:35 ... starting to generate Genome files

EXITING because of INPUT ERROR: could not open genomeFastaFile: /mnt/md0/deccles/STAR/Mus_musculus.GRCm38.dna.primary_assembly.fa

Nov 16 10:34:35 ...... FATAL ERROR, exiting
deccles@malaghan.org.nz@big-bird:/mnt/md0/deccles/STAR$ head -n 3 $(readlink -e ./Mus_musculus.GRCm38.dna.primary_assembly.fa)
>1 dna:chromosome chromosome:GRCm38:1:1:195471971:1 REF
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

What am I missing?

gringer commented 7 months ago

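Ah, right. I was missing the Singularity bind paths, which are set through the SINGULARITY_BIND environment variable, so the container could not see the reference directory: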

deccles@malaghan.org.nz@big-bird:/stornext/DataSAN/Bioinformatics/Databases$ referenceDir=$(readlink -e fasta/gencode_M33/)
deccles@malaghan.org.nz@big-bird:/stornext/DataSAN/Bioinformatics/Databases$ mkdir ${referenceDir}/gencode_M33_SJ100/
deccles@malaghan.org.nz@big-bird:/stornext/DataSAN/Bioinformatics/Databases$ export SINGULARITY_BIND=${referenceDir}
deccles@malaghan.org.nz@big-bird:/stornext/DataSAN/Bioinformatics/Databases$ singularity exec /mnt/md0/deccles/singularityImages/SAW_7.0.sif mapping --runMode genomeGenerate --genomeDir ${referenceDir}/gencode_M33_SJ100 --genomeFastaFiles ${referenceDir}/GRCm39.primary_assembly.genome.fa --sjdbGTFfile ${referenceDir}/gencode.vM33.basic.annotation.gtf --sjdbOverhang 99 --runThreadN 12

--- cmd:  /opt/saw_v7.0.0_software/pipeline/mapping/bcSTAR --runMode genomeGenerate --genomeDir /stornext/DataSAN/Bioinformatics/Databases/fasta/gencode_M33/gencode_M33_SJ100 --genomeFastaFiles /stornext/DataSAN/Bioinformatics/Databases/fasta/gencode_M33/GRCm39.primary_assembly.genome.fa --sjdbGTFfile /stornext/DataSAN/Bioinformatics/Databases/fasta/gencode_M33/gencode.vM33.basic.annotation.gtf --sjdbOverhang 99 --runThreadN 12

--- compile: Tue Jul 4 13:31:51 CST 2023
--- Info: v2.1.1 (based on STAR 2.7.2b) is released after optimization by BGI. - 
Nov 16 10:43:57 ..... started bcSTAR run
Nov 16 10:43:58 ... starting to generate Genome files
Nov 16 10:44:49 ..... processing annotations GTF
--- Info: start building ref
--- Info: time for building ref - 0.094514 min, memory size of ref - 5.19144 GB.
--- Info: start building SA
...
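
For anyone hitting the same "could not open genomeFastaFile" error, here is a minimal sketch of how the bind list could be derived from the input files themselves rather than typed by hand. The `bind_list` function is a hypothetical helper, not part of SAW or Singularity, and the file names in the usage comments are placeholders. It relies only on `SINGULARITY_BIND` accepting a comma-separated list of paths. Passing `--bind` to `singularity exec` should have the same effect as the environment variable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SAW or Singularity): print the resolved
# parent directories of the given files, de-duplicated and comma-separated,
# which is the format that SINGULARITY_BIND accepts.
bind_list() {
    local f
    for f in "$@"; do
        # Resolve symlinks first so the real location is what gets bound.
        dirname -- "$(readlink -f -- "$f")"
    done | sort -u | paste -sd, -
}

# Usage sketch (placeholder file names, left commented so nothing runs here):
#   export SINGULARITY_BIND="$(bind_list genome.fa annotation.gtf)"
#   # Check that the container can read the file before starting a long run:
#   singularity exec SAW_7.0.sif test -r "$(readlink -f genome.fa)" && echo "visible"
```

Resolving symlinks before taking the directory matters when the FASTA in the working directory is a link to a file on another file system, since binding only the link's directory would leave the target invisible to the container.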