sagnikbanerjee15 / Finder

A fully automated gene annotator from RNA-Seq expression data
MIT License
54 stars 14 forks

Issue running Finder #46

Open nousiaso opened 2 years ago

nousiaso commented 2 years ago

Hello, below I post the output of a run with the Finder pipeline. It looks like some alignment files that Finder should produce are not being produced. The STAR aligner is installed and working; I also installed PsiCLASS, although I do not know whether that was necessary; GeneMark is in place, etc. But the run doesn't get that far anyway; these are the initial steps.

Why does this happen?

cp: '/home/orestis/../x_soft_all.fasta' and '/home/orestis/../iquitos_soft_all.fasta' are the same file
cat: /home/orestis/../alignments/RNA_all_round1_SJ.out.tab: No such file or directory
cat: /home/orestis/../alignments/cacao_RNA_all_round2_SJ.out.tab: No such file or directory
mv: cannot stat '/home/orestis/../alignments/RNA_all_final_Log.final.out': No such file or directory
cat: /home/orestis/../alignments/RNA_all_round3_SJ.out.tab: No such file or directory
samtools index: "/home/orestis/../alignments/RNA_all_final.sortedByCoord.out.bam" is in a format that cannot be usefully indexed
samtools index: "/home/orestis/../alignments/RNA_all_final.sortedByCoord.out.bam" is in a format that cannot be usefully indexed
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
Can not open /home/orestis/../alignments/RNA_all_final.sortedByCoord.out.bam.
[main_samview] fail to read the header from "/home/orestis/../alignments/RNA_all_final.sortedByCoord.out.bam".
[main_samview] fail to read the header from "/home/orestis/../alignments/RNA_all_for_psiclass.sam".
mv: cannot stat '/home/orestis/../assemblies_psiclass_modified/combined/psiclass_output_sample_0.gtf': No such file or directory
mv: cannot stat '/home/orestis/../assemblies_psiclass_modified/combined/psiclass_output_vote.gtf': No such file or directory
Traceback (most recent call last):
  File "/home/orestis/FINDER/finder_v1.1.0/finder", line 688, in <module>
    main()
  File "/home/orestis/FINDER/finder_v1.1.0/finder", line 649, in main
    orchestrateGeneModelPrediction( options, logger_proxy, logging_mutex )
  File "/home/orestis/FINDER/finder_v1.1.0/finder", line 461, in orchestrateGeneModelPrediction
    findTranscriptsInEachSampleNotReportedInCombinedAnnotations( options, logger_proxy, logging_mutex )
  File "/home/orestis/FINDER/finder_v1.1.0/scripts/findTranscriptsInEachSampleNotReportedInCombinedAnnotations.py", line 17, in findTranscriptsInEachSampleNotReportedInCombinedAnnotations
    combined_transcript_info = readAllTranscriptsFromGTFFileInParallel( [combined_gtf_filename, "combined", "combined"] )[0]
  File "/home/orestis/FINDER/finder_v1.1.0/scripts/fileReadWriteOperations.py", line 290, in readAllTranscriptsFromGTFFileInParallel
    fhr = open( gtf_filename, "r" )
FileNotFoundError: [Errno 2] No such file or directory: '/home/orestis/../assemblies_psiclass_modified/combined/combined.gtf'
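The samtools messages above ("EOF marker is absent", "invalid BAM binary header") suggest the BAM file is missing, empty, or truncated rather than merely unindexed. A minimal sketch of a standalone check follows; it is not part of Finder, the function name `check_bam` is hypothetical, and it only inspects the gzip magic bytes and the 28-byte BGZF end-of-file block defined in the SAM/BAM specification.

```python
import os

# The 28-byte empty BGZF block that terminates a well-formed BAM file
# (value taken from the SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f8b0804 00000000 00ff0600 42430200 1b000300 00000000 00000000"
)


def check_bam(path):
    """Return a short diagnosis for a BAM file (hypothetical helper, not Finder code)."""
    if not os.path.isfile(path):
        return "missing"
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    with open(path, "rb") as fh:
        # BAM files are BGZF-compressed, so they start with the gzip magic bytes.
        if fh.read(2) != b"\x1f\x8b":
            return "not gzip/BGZF"
        if size < len(BGZF_EOF):
            return "truncated"
        # Compare the last 28 bytes with the BGZF end-of-file marker.
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        tail = fh.read()
    return "ok" if tail == BGZF_EOF else "truncated"
```

Calling `check_bam` on the `RNA_all_final.sortedByCoord.out.bam` path from the log would distinguish a file that was never written from one that was cut short, which in either case points back to the STAR step rather than to samtools.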

sagnikbanerjee15 commented 2 years ago

Hello @nousiaso,

Thank you so much for your interest in finder. Are you running it within the docker container? Also, could you please provide me with your metadata file? That will help me zero in on the problem. You can email it to me at sagnikbanerjee15@gmail.com if you do not wish to post it here.

Thank you.

nousiaso commented 2 years ago

Hello @sagnikbanerjee15

Thank you for the quick reply, I am not running docker, I am running a local installation. I will email you the metadata file.

Best

MicroSeq commented 2 years ago

I appear to be having a similar issue using the Docker container. The STAR alignment step is generating empty files and it's not clear why (I don't believe the input files are actually being mapped/read - the process is taking seconds despite what the output below suggests ...). Indexing of the genome seems successful. Greatly appreciate any assistance in troubleshooting this.

The input file locations/names seem to be correctly referred to from the metadata?

e.g.:

| BioProject | SRA Accession | Tissues | Description | Date | Read Length (bp) | Ended | RNA Seq | process | Location |
|---|---|---|---|---|---|---|---|---|---|
| Oppia | R2100233S1 | Control | PCR;Illumina NextSeq 550;ribodepleted;soil-OECD | na | 100 | PE | 1 | 1 | /RAID/data/Analysis/Data/MitesWork/onitens-assembly-2022/fastq/ribodepletion |
| Oppia | R2100234S2 | Control | PCR;Illumina NextSeq 550;ribodepleted;soil-2 | na | 100 | PE | 1 | 1 | /RAID/data/Analysis/Data/MitesWork/onitens-assembly-2022/fastq/ribodepletion |

Files are named R2100234S2_1.fastq and R2100234S2_2.fastq for PE reads. (I gunzipped the files in case compression was an issue, since the input filenames lacked the .gz extension when they were compressed.)
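One way to rule out a path or naming mismatch is to check, for every metadata row, that the expected FASTQ files exist and are non-empty. The sketch below is not Finder code and rests on assumptions: the metadata is a comma-separated file whose header uses the column names shown above (`SRA Accession`, `Ended`, `Location`), paired-end files follow the `<accession>_1.fastq` / `<accession>_2.fastq` naming described here, and single-end files are named `<accession>.fastq` (a guess). The function name `missing_fastq_files` is hypothetical.

```python
import csv
import os


def missing_fastq_files(metadata_csv):
    """List expected FASTQ paths that are absent or empty (hypothetical helper)."""
    missing = []
    with open(metadata_csv, newline="") as fh:
        for row in csv.DictReader(fh):
            accession = row["SRA Accession"].strip()
            location = row["Location"].strip()
            if row["Ended"].strip().upper() == "PE":
                # Paired-end naming as described in this comment.
                names = [f"{accession}_1.fastq", f"{accession}_2.fastq"]
            else:
                # Assumed single-end naming; adjust if your files differ.
                names = [f"{accession}.fastq"]
            for name in names:
                path = os.path.join(location, name)
                if not os.path.isfile(path) or os.path.getsize(path) == 0:
                    missing.append(path)
    return missing
```

Running this from the same environment that runs Finder (for example, inside the container) checks the paths as the pipeline would see them; an empty list means every expected input was found.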

The error log refers to expected files that were not generated:

cat: /RAID/data/Analysis/Data/MitesWork/onitens-assembly-2022/Finder/Finder-finder_v1.1.0/Finder_Oppia/alignments/R2100233S1_round1_SJ.out.tab: No such file or directory

Section of the progress.log file:

2022-05-06 01:07:23,037 - finder - INFO - STAR Round2 run for R2100225S15 completed
2022-05-06 01:07:23,038 - finder - INFO - Mapping of reads for round2 completed for Control
2022-05-06 01:07:23,043 - finder - INFO - Selecting high confidence junctions after round2 mapping completed for Control
2022-05-06 01:07:23,043 - finder - INFO - Mapping rate in round2 Control R2100233S1 0.0
2022-05-06 01:07:23,043 - finder - INFO - Mapping rate in round2 Control R2100234S2 0.0
2022-05-06 01:07:23,044 - finder - INFO - Mapping rate in round2 Control R2100235S3 0.0
2022-05-06 01:07:23,044 - finder - INFO - Mapping rate in round2 Control R2100237S5 0.0
2022-05-06 01:07:23,044 - finder - INFO - Mapping rate in round2 Control R2100238S6 0.0
2022-05-06 01:07:23,044 - finder - INFO - Mapping rate in round2 Control R2100240S8 0.0
2022-05-06 01:07:23,045 - finder - INFO - Mapping rate in round2 Control R2100246S14 0.0
2022-05-06 01:07:23,045 - finder - INFO - Mapping rate in round2 Control R2100247S15 0.0
2022-05-06 01:07:23,045 - finder - INFO - Mapping rate in round2 Control R2100211S1 0.0
2022-05-06 01:07:23,045 - finder - INFO - Mapping rate in round2 Control R2100212S2 0.0
2022-05-06 01:07:23,046 - finder - INFO - Mapping rate in round2 Control R2100213S3 0.0
2022-05-06 01:07:23,046 - finder - INFO - Mapping rate in round2 Control R2100215S5 0.0
2022-05-06 01:07:23,046 - finder - INFO - Mapping rate in round2 Control R2100216S6 0.0
2022-05-06 01:07:23,046 - finder - INFO - Mapping rate in round2 Control R2100218S8 0.0
2022-05-06 01:07:23,047 - finder - INFO - Mapping rate in round2 Control R2100224S14 0.0
2022-05-06 01:07:23,047 - finder - INFO - Mapping rate in round2 Control R2100225S15 0.0
2022-05-06 01:07:23,047 - finder - INFO - Resorting to alignment with relaxed parameters for these runs due to poor mapping R2100233S1,R2100234S2,R2100235S3,R2100237S5,R2100238S6,R2100240S8,R2100246S14,R2100247S15,R2100211S1,R2100212S2,R2100213S3,R2100215S5,R2100216S6,R2100218S8,R2100224S14,R2100225S15
2022-05-06 01:07:23,058 - finder - INFO - STAR relaxed alignment run for R2100233S1 completed
2022-05-06 01:07:23,068 - finder - INFO - STAR relaxed alignment run for R2100234S2 completed
2022-05-06 01:07:23,078 - finder - INFO - STAR relaxed alignment run for R2100235S3 completed
2022-05-06 01:07:23,088 - finder - INFO - STAR relaxed alignment run for R2100237S5 completed
2022-05-06 01:07:23,097 - finder - INFO - STAR relaxed alignment run for R2100238S6 completed
2022-05-06 01:07:23,107 - finder - INFO - STAR relaxed alignment run for R2100240S8 completed
2022-05-06 01:07:23,117 - finder - INFO - STAR relaxed alignment run for R2100246S14 completed
2022-05-06 01:07:23,128 - finder - INFO - STAR relaxed alignment run for R2100247S15 completed
2022-05-06 01:07:23,138 - finder - INFO - STAR relaxed alignment run for R2100211S1 completed
2022-05-06 01:07:23,147 - finder - INFO - STAR relaxed alignment run for R2100212S2 completed
2022-05-06 01:07:23,157 - finder - INFO - STAR relaxed alignment run for R2100213S3 completed
2022-05-06 01:07:23,166 - finder - INFO - STAR relaxed alignment run for R2100215S5 completed
2022-05-06 01:07:23,177 - finder - INFO - STAR relaxed alignment run for R2100216S6 completed
2022-05-06 01:07:23,186 - finder - INFO - STAR relaxed alignment run for R2100218S8 completed
2022-05-06 01:07:23,195 - finder - INFO - STAR relaxed alignment run for R2100224S14 completed
2022-05-06 01:07:23,204 - finder - INFO - STAR relaxed alignment run for R2100225S15 completed
2022-05-06 01:07:23,204 - finder - INFO - Mapping of reads for round3 completed for Control
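Since every run above reports a mapping rate of 0.0 and each "completed" entry is only about 10 ms after the previous one, it can help to pull the zero-rate runs out of progress.log automatically. The following is a rough sketch, not part of Finder: the regular expression is inferred only from the "Mapping rate in roundN <tissue> <run> <rate>" entries shown in this excerpt, it assumes the tissue name contains no spaces, and `zero_mapping_runs` is a hypothetical name.

```python
import re

# Pattern inferred from the log excerpt above; other Finder versions may log differently.
MAPPING_RATE = re.compile(
    r"Mapping rate in (round\d+) (\S+) (\S+) (\d+(?:\.\d+)?)"
)


def zero_mapping_runs(log_text):
    """Return (round, run) pairs whose reported mapping rate is zero."""
    runs = []
    for match in MAPPING_RATE.finditer(log_text):
        round_name, _tissue, run, rate = match.groups()
        if float(rate) == 0.0:
            runs.append((round_name, run))
    return runs
```

If every run in every round shows up in the result, the likelier cause is that STAR never read the inputs (for instance, wrong paths or failed invocations) rather than poor alignment quality.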

sagnikbanerjee15 commented 2 years ago

Hi @SeqSmith,

Thank you for reporting this issue. I am currently working on an improved version of Finder that makes major changes to some of the steps. I will let you know once it is available.

Thank you.

bimbim1803 commented 1 year ago

Hello, I have the same issue. Do you have a solution for it? I'm not running it with Docker, as I use an HPC cluster. Please let me know if you have any questions. Thanks!