TrinityCTAT / ctat-genome-lib-builder

Software used by Trinity CTAT for building CTAT Genome Libs, resource databases shared by Trinity CTAT components
BSD 3-Clause "New" or "Revised" License

ref_genome.fa.star.idx/genomeParameters.txt is missing #5

Open Benoitdw opened 2 years ago

Benoitdw commented 2 years ago

I'm trying to build the CTAT genome lib because the plug-and-play resources for Fv1.9 do not include a genome lib built with gencode-v38, which is used in my pipeline. I tried to generate the library with:

prep_genome_lib.pl --genome_fa /data/hg38.fasta --gtf /data/gtf/gencode.v38.annotation.gtf --dfam_db human --fusion_annot_lib /data/CTAT/StarFv1.9/CTAT_HumanFusionLib.mini.dat.gz --pfam_db current --human_gencode_filter

However, the generated directory contains only 39 files (ls -R gencode_38 | wc -l) versus 68 for the plug-and-play one (gencode_v33).
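A note on the count: `ls -R | wc -l` also counts the directory header lines and blank separator lines that `ls -R` prints, so the two numbers are only a rough signal. A minimal sketch for listing exactly which files the reference lib has and the new build lacks could look like the following. The function names are made up for illustration, and the two directory names in the usage line are only the ones mentioned above, not fixed paths.

```python
from pathlib import Path


def relative_files(root):
    """Collect every regular file under root as a path relative to root."""
    root_path = Path(root)
    return {
        path.relative_to(root_path).as_posix()
        for path in root_path.rglob("*")
        if path.is_file()
    }


def files_only_in_reference(reference_lib, new_lib):
    """Return the sorted relative paths present in reference_lib but not in new_lib."""
    return sorted(relative_files(reference_lib) - relative_files(new_lib))


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python compare_libs.py gencode_v33 gencode_38
    if len(sys.argv) == 3:
        for name in files_only_in_reference(sys.argv[1], sys.argv[2]):
            print(name)
```

This compares file names only, not file contents or sizes, so it answers "what is missing" rather than "what differs".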

When I try to launch FusionInspector, it fails during the STAR step because ref_genome.fa.star.idx/genomeParameters.txt is missing:

* Running CMD: /usr/local/bin/STAR  --runThreadN 5  --genomeDir /ref/CTAT/StarFv1.9/gencode_38/ref_genome.fa.star.idx  --outSAMtype BAM SortedByCoordinate  --twopassMode Basic  --alignSJDBoverhangMin 10  --genomeSuffixLengthMax 10000 --limitBAMsortRAM 49837599450  --alignInsertionFlush Right   --alignMatesGapMax 100000  --alignIntronMax 100000  --readFilesIn /data/BRAINSTORM1_KDM26344_OKDM_1298_20220112104053_R1.concat.fastq.gz /data/BRAINSTORM1_KDM26344_OKDM_1298_20220112104053_R2.concat.fastq.gz  --genomeFastaFiles /config/temp/fi_workdir/finspector.fa  --outSAMfilter KeepAllAddedReferences  --sjdbGTFfile /config/temp/fi_workdir/finspector.gtf  --alignSJstitchMismatchNmax 5 -1 5 5  --scoreGapNoncan -6  --readFilesCommand 'gunzip -c' 
Jan 14 11:05:59 ..... started STAR run
Jan 14 11:05:59 ..... loading genome

EXITING because of FATAL ERROR: could not open genome file /ref/CTAT/StarFv1.9/gencode_38/ref_genome.fa.star.idx/genomeParameters.txt
SOLUTION: check that the path to genome files, specified in --genomeDir is correct and the files are present, and have user read permsissions

Here is the gencode_v38 directory that was generated:

 .
├──  __chkpts
│  ├──  makeblastdb.ok
│  ├──  ref_annot.gtf.mini.sortu.ok
│  ├──  ref_annot.gtf.ok
│  ├──  ref_genome.fa.ok
│  └──  ref_genome_fai.ok
├──  ctat_genome_lib_build_dir
│  ├──  __chkpts
│  ├──  ref_genome.fa
│  ├──  ref_genome.fa.fai
│  ├──  ref_genome.fa.nhr
│  ├──  ref_genome.fa.nin
│  ├──  ref_genome.fa.nsq
│  └──  ref_genome.fa.star.idx
├──  ref_annot.gtf
├──  ref_annot.gtf.mini.sortu
├──  ref_genome.fa
├──  ref_genome.fa.fai
├──  ref_genome.fa.nhr
├──  ref_genome.fa.nin
├──  ref_genome.fa.nsq
└──  ref_genome.fa.star.idx
   ├──  chrLength.txt
   ├──  chrName.txt
   ├──  chrNameLength.txt
   └──  chrStart.txt

Does anyone know which command line was used to generate the plug-and-play genome lib?

brianjohnhaas commented 2 years ago

Hi

It doesn't look like you've got a full STAR genome index built here. Did the process crash during that part, due to an out-of-memory error or some other error? You can try rerunning the prep script, and it should pick up where it left off.
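For a quick check before and after rerunning, here is a minimal sketch that reports which core STAR index files are absent from an index directory. The file list is an assumption based on what a STAR genome-generate run normally writes (the large `Genome`, `SA` and `SAindex` files plus `genomeParameters.txt` and the `chr*.txt` tables); it is not taken from prep_genome_lib.pl, and the function name is made up for illustration.

```python
from pathlib import Path

# Assumption: core files a completed STAR genome index normally contains.
# Annotation-derived files (e.g. splice junction tables) are not checked here.
CORE_STAR_INDEX_FILES = (
    "Genome",
    "SA",
    "SAindex",
    "genomeParameters.txt",
    "chrLength.txt",
    "chrName.txt",
    "chrNameLength.txt",
    "chrStart.txt",
)


def missing_star_index_files(index_dir):
    """Return the expected index files that are absent or empty in index_dir."""
    index_path = Path(index_dir)
    missing = []
    for name in CORE_STAR_INDEX_FILES:
        candidate = index_path / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    import sys

    # Default matches the directory name used in this thread.
    target = sys.argv[1] if len(sys.argv) > 1 else "ref_genome.fa.star.idx"
    missing = missing_star_index_files(target)
    if missing:
        print("Incomplete STAR index, missing:", ", ".join(missing))
    else:
        print("All core STAR index files are present.")
```

On the directory listing posted above, only the four `chr*.txt` files are present, so this would report `Genome`, `SA`, `SAindex` and `genomeParameters.txt` as missing, which is consistent with an index build that stopped partway.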

