sagnikbanerjee15 / Finder

A fully automated gene annotator from RNA-Seq expression data
MIT License

Finder on local data #24

Open utritala opened 2 years ago

utritala commented 2 years ago

Hello there,

Thank you for this wonderful software. I am trying to use finder with RNA-Seq data that resides on my local machine. From the logs, it seems that it is not actually running the STAR alignment, yet it still logs each run as completed. For example:

2021-08-16 14:45:11,921 - finder -   INFO - STAR Round1 run for RSB01_1_cutadapt.fastq.gz completed
2021-08-16 14:45:12,347 - finder -   INFO - STAR Round1 run for RSB01_2_cutadapt.fastq.gz completed
2021-08-16 14:45:12,881 - finder -   INFO - STAR Round1 run for RSB02_1_cutadapt.fastq.gz completed
2021-08-16 14:45:13,418 - finder -   INFO - STAR Round1 run for RSB02_2_cutadapt.fastq.gz completed
2021-08-16 14:45:13,965 - finder -   INFO - STAR Round1 run for RSB03_1_cutadapt.fastq.gz completed
2021-08-16 14:45:14,629 - finder -   INFO - STAR Round1 run for RSB03_2_cutadapt.fastq.gz completed

Because it is not producing any files, it ultimately finishes with errors.

The metadata file looks something like this:

BioProject,SRA Accession,Tissues,Description,Date,Read Length (bp),Ended,RNA-Seq,process,Location
BAT,RSB01_1_cutadapt.fastq.gz,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq
BAT,RSB01_2_cutadapt.fastq.gz,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq
BAT,RSB02_1_cutadapt.fastq.gz,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq
BAT,RSB02_2_cutadapt.fastq.gz,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq
BAT,RSB03_1_cutadapt.fastq.gz,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq
BAT,RSB03_2_cutadapt.fastq.gz,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq

Maybe the files are not being picked up from that location? Any thoughts on this would be very valuable.

Thank you.

sagnikbanerjee15 commented 2 years ago

Hello @utritala,

Thank you so much for your interest in finder, and thanks for pasting all the related files. The metadata file needs to be altered. Each PE sample should have a single entry (not 2). Also, the name of each file for a sample should end with _1 or _2 before the .fastq extension. You need to rename the files accordingly and update the metadata file. Here is an example:

BioProject,SRA Accession,Tissues,Description,Date,Read Length (bp),Ended,RNA-Seq,process,Location
BAT,RSB01,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq
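
For anyone starting from a per-file metadata sheet like the one in the original post, collapsing it to one row per sample can be scripted. The following is only a sketch and not part of finder: the function name `collapse_paired_rows` and the file-name pattern it expects (sample name, then `_1` or `_2`, then an optional suffix such as `_cutadapt`) are assumptions made for illustration.

```python
import csv
import io
import re

# Assumed naming pattern: <sample>_<1|2>[_suffix].fastq[.gz] (or .fq[.gz]).
MATE_PATTERN = re.compile(
    r"^(?P<sample>.+?)_(?P<mate>[12])(?:_[^.]*)?\.(?:fastq|fq)(?:\.gz)?$"
)


def collapse_paired_rows(metadata_text):
    """Return metadata CSV text with one row per sample instead of one per file."""
    reader = csv.DictReader(io.StringIO(metadata_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()

    seen = set()
    for row in reader:
        match = MATE_PATTERN.match(row["SRA Accession"])
        if match:
            # Replace the file name with the bare sample name.
            row["SRA Accession"] = match.group("sample")
        key = (row["BioProject"], row["SRA Accession"])
        if key in seen:
            continue  # second mate of a sample that was already written
        seen.add(key)
        writer.writerow(row)
    return out.getvalue()
```

The fastq files themselves still have to be renamed separately so that they match the sample names written to the metadata file.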

finder will search for 2 files (since the metadata states that the sample is PE): /lustre/analysis/annotation/RNAseq/RSB01_1.fastq.gz and /lustre/analysis/annotation/RNAseq/RSB01_2.fastq.gz. You do not need to mention that the files are gzipped; finder will work with whichever file is present (compressed or not). Please give this a try and let me know if you run into any further issues.
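
A quick way to check the file layout before launching a run is to build the expected paths by hand and test whether they exist. This is a rough sketch based only on the naming rule described above, not finder's actual lookup code; the helper names `expected_mate_files` and `missing_mates` are hypothetical, and the `.fastq` / `.fastq.gz` extensions are assumed from the example paths.

```python
from pathlib import Path


def expected_mate_files(location, sample):
    """Map each mate number to its candidate paths (gzipped or plain)."""
    base = Path(location)
    return {
        mate: [
            base / f"{sample}_{mate}.fastq.gz",
            base / f"{sample}_{mate}.fastq",
        ]
        for mate in (1, 2)
    }


def missing_mates(location, sample):
    """Return the mate numbers for which no candidate file exists."""
    return [
        mate
        for mate, candidates in expected_mate_files(location, sample).items()
        if not any(path.is_file() for path in candidates)
    ]
```

Under these assumptions, `missing_mates("/lustre/analysis/annotation/RNAseq", "RSB01")` would return an empty list only once both mates have been renamed to the `RSB01_1` / `RSB01_2` form; a file still named `RSB01_1_cutadapt.fastq.gz` would not be matched.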

Thank you.

utritala commented 2 years ago

Hi @sagnikbanerjee15, Thank you for your prompt response. As you suggested, I have remade the metadata file, shown below:

BioProject,SRA Accession,Tissues,Description,Date,Read Length (bp),Ended,RNA-Seq,process,Location
BAT,RSB01,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq/trimmed
BAT,RSB02,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq/trimmed
BAT,RSB03,Brain,Uninfected,,,PE,1,1,/lustre/analysis/annotation/RNAseq/trimmed

However, I am still getting an error, this time a slightly different one:

EXITING: Did not find the genome in memory, did not remove any genomes from shared memory
Aug 16 21:45:14 ...... FATAL ERROR, exiting
cat: /lustre/analysis/annotation/finder/alignments/RSB01_round1_SJ.out.tab: No such file or directory
cat: /lustre/analysis/annotation/finder/alignments/RSB02_round1_SJ.out.tab: No such file or directory
cat: /lustre/analysis/annotation/finder/alignments/RSB03_round1_SJ.out.tab: No such file or directory
Aug 16 21:45:15 ..... started STAR run
Aug 16 21:45:15 ..... loading genome

The STAR alignments are not generated, and finder moves straight on to the samtools indexing stage and later steps, which will obviously fail. Any thoughts?

Thanks

sagnikbanerjee15 commented 2 years ago

Hello @utritala,

Thank you for reporting this. I have made a few changes to the code. It should work now. Please let me know if you encounter any issues.

Thank you.