williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

Reference Building Failed #169

Closed Lei-Guo closed 1 year ago

Lei-Guo commented 1 year ago

Reference building fails no matter what I do. I went through all the issues and answers on GitHub, and none of them solved this problem.

I also tried building the reference directly from the STAR Index on my system. Failed with same errors.

Tools I used: samtools/1.10, star/2.7.3a, bedtools/2.29.2

Genome & GTF: Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, Homo_sapiens.GRCh38.100.gtf (both downloaded from Ensembl)

Command:

mkdir IRFinder-1.3.1/REF/Human-GRCH38-release100

ln -s Genome/GRCH38/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa IRFinder-1.3.1/REF/Human-GRCH38-release100/genome.fa

ln -s /GTF/GRCH38/Homo_sapiens.GRCh38.100.gtf IRFinder-1.3.1/REF/Human-GRCH38-release100/transcripts.gtf

IRFinder-1.3.1/bin/IRFinder -m BuildRefProcess -r IRFinder-1.3.1/REF/Human-GRCH38-release100 \
  -e IRFinder-1.3.1/REF/extra-input-files/RNA.SpikeIn.ERCC.fasta.gz \
  -b IRFinder-1.3.1/REF/extra-input-files/Human_hg38_nonPolyA_ROI.bed

Launching reference build process. The full build might take hours.
<Phase 1: STAR Reference Preparation>
Feb 24 07:15:49 ..... started STAR run
Feb 24 07:15:49 ... starting to generate Genome files
Feb 24 07:16:51 ... starting to sort Suffix Array. This may take a long time...
Feb 24 07:17:08 ... sorting Suffix Array chunks and saving them to disk...
Feb 24 07:30:43 ... loading chunks from disk, packing SA...
Feb 24 07:31:48 ... finished generating suffix array
Feb 24 07:31:48 ... generating Suffix Array index
Feb 24 07:35:57 ... completed Suffix Array index
Feb 24 07:35:57 ..... processing annotations GTF
Feb 24 07:36:11 ..... inserting junctions into the genome indices
Feb 24 07:39:27 ... writing Genome to disk ...
Feb 24 07:39:29 ... writing Suffix Array to disk ...
Feb 24 07:39:37 ... writing SAindex to disk
Feb 24 07:39:38 ..... finished successfully
<Phase 2: Mapability Calculation>
Feb 24 07:39:38 ... mapping genome fragments back to genome...
Feb 24 07:53:55 ... sorting aligned genome fragments...
[bam_sort_core] merging from 48 files and 24 in-memory blocks...
Feb 24 07:57:29 ... indexing aligned genome fragments...
Feb 24 07:57:56 ... filtering aligned genome fragments by chromosome/scaffold...
Feb 24 08:00:09 ... merging filtered genome fragments...
Feb 24 08:00:24 ... calculating regions for exclusion...
Feb 24 08:04:35 ... cleaning temporary files...
<Phase 3: IRFinder Reference Preparation>
Feb 24 08:04:35 ... building Ref 1...
sort: unknown subpragma '_mergesort' at IRFinder-1.3.1/bin/util/gtf2bed-custom.pl line 29.
BEGIN failed--compilation aborted at IRFinder-1.3.1/bin/util/gtf2bed-custom.pl line 29.
Feb 24 08:04:35 ... building Ref 2...
Feb 24 08:04:37 ... building Ref 3...
Feb 24 08:04:37 ... building Ref 4...
Feb 24 08:04:40 ... building Ref 5...
Feb 24 08:04:44 ... building Ref 6...
Feb 24 08:04:44 ... building Ref 7...
Feb 24 08:04:44 ... building Ref 8...
Feb 24 08:04:44 ... building Ref 9...
Feb 24 08:04:44 ... building Ref 10c...
Feb 24 08:04:44 ... building Ref 11c...
Error: exclude.directional.bed is empty.
Error: introns.unique.bed is empty.
Error: ref-cover.bed is empty.
Error: ref-read-continues.ref is empty.
Error: ref-sj.ref is empty.
Error: IRFinder reference building FAILED.

dg520 commented 1 year ago

@Lei-Guo Your Perl doesn't support _mergesort. Which version are you using? Can you try to update it to at least 5.028?

Lei-Guo commented 1 year ago

Thank you for the quick response.

~/miniconda3/bin/perl

This is perl 5, version 32, subversion 1 (v5.32.1) built for x86_64-linux-thread-multi

dg520 commented 1 year ago

@Lei-Guo If you are sure that is the Perl called by default, open IRFinder/bin/util/gtf2bed-custom.pl, comment out line 29 (the line that checks the Perl version), save the file, and then re-run the reference preparation from the beginning.

lostinbioinformatics commented 1 year ago

Hello, I am working on the S. pombe genome and I am currently encountering exactly the same error message. Did you figure out the problem?

dg520 commented 1 year ago

@lostinbioinformatics Did you see exactly

sort: unknown subpragma '_mergesort' at IRFinder-1.3.1/bin/util/gtf2bed-custom.pl line 29.
BEGIN failed--compilation aborted at IRFinder-1.3.1/bin/util/gtf2bed-custom.pl line 29.

in your error messages? If so, check the version of your default Perl. Once you are sure your Perl version meets IRFinder's requirement, try the solution suggested right above your post in this thread.
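
A rough sketch of that check from the shell follows. It assumes, going only by the error text, that line 29 of gtf2bed-custom.pl contains something like `use sort '_mergesort';`. The function name `perl_accepts_mergesort` is made up for illustration. Instead of comparing version numbers, it asks the first perl on PATH to compile that pragma and reports the result.

```shell
# Hypothetical helper: succeeds if the first `perl` on PATH accepts the
# `_mergesort` subpragma named in the error message.
perl_accepts_mergesort() {
  perl -e 'use sort "_mergesort";' >/dev/null 2>&1
}

if command -v perl >/dev/null 2>&1; then
  # Show which interpreter a bare `perl` resolves to, and its version.
  echo "perl resolves to: $(command -v perl)"
  perl -e 'print "version: $^V\n"'
  if perl_accepts_mergesort; then
    echo "default perl accepts use sort '_mergesort'"
  else
    echo "default perl rejects use sort '_mergesort'"
  fi
else
  echo "no perl found on PATH"
fi
```

If the script is run through its shebang line rather than through the perl on PATH (not confirmed in this thread), `head -n 1 IRFinder-1.3.1/bin/util/gtf2bed-custom.pl` shows which interpreter that would be.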

Lei-Guo commented 1 year ago

I fixed it by deleting/commenting out line 29 in IRFinder-1.3.1/bin/util/gtf2bed-custom.pl.
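
For anyone who prefers not to edit the file by hand, the same change can be sketched with sed. The function name `comment_out_line` is hypothetical. The path and line number in the usage comment come from the error message in this thread. The `-i.bak` option keeps a backup copy of the original file next to it.

```shell
# Hypothetical helper: prefix line number $2 of file $1 with "# ",
# leaving the unmodified original in "$1.bak".
comment_out_line() {
  sed -i.bak "${2}s/^/# /" "$1"
}

# Usage, with the path from this thread (adjust to your install):
# comment_out_line IRFinder-1.3.1/bin/util/gtf2bed-custom.pl 29
```

After the edit, re-run the reference build from the beginning, as suggested earlier in the thread.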

lostinbioinformatics commented 1 year ago

Thank you very much for your quick answer. Deleting the line worked!