Closed by Oliverfeudj 3 months ago
These are the lines for those errors:
https://github.com/Xinglab/espresso/blob/v1.4.0/src/ESPRESSO_C.pl#L1200
https://github.com/Xinglab/espresso/blob/v1.4.0/src/ESPRESSO_C.pl#L2015
In the first command the output is redirected to /dev/null, which could have hidden useful error output. You could try running the commands yourself from the command line to see if there are any other error messages. For the second command the output is redirected to Stress1_out/1/blast_92252//read_SJ_group_1.blast. Are there any error messages in that file?
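As a rough illustration of the suggestion above, here is a minimal Python sketch that re-runs a command while keeping its stderr instead of sending it to /dev/null. The helper names `run_and_capture` and `looks_like_error` are hypothetical, the keyword list is only a crude guess at what an error line looks like, and the demo command is a stand-in, not the actual command ESPRESSO_C runs.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (exit_code, stdout, stderr).

    Nothing is redirected to /dev/null, so any error text stays available.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


def looks_like_error(text):
    """Hypothetical heuristic: return the lines that mention an error."""
    keywords = ("error", "fatal", "cannot", "no such file")
    return [
        line
        for line in text.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


if __name__ == "__main__":
    # Stand-in command that fails and writes to stderr.
    # Replace it with the command copied from the failing step.
    code, out, err = run_and_capture(
        [sys.executable, "-c", "import sys; sys.exit('Error: demo failure')"]
    )
    print("exit code:", code)
    print("flagged stderr lines:", looks_like_error(err))
```

The same `looks_like_error` check could be applied to the text of the .blast output file from the second command, since that file is where its output was redirected.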
Are you able to run the example from the README without any errors? https://github.com/Xinglab/espresso/tree/v1.4.0?tab=readme-ov-file#example
ESPRESSO_C is known to take a long time. Ideally you could use the snakemake workflow in a cluster environment to speed things up: https://github.com/Xinglab/espresso/issues/5
Hello @EricKutschera, and thank you for your reply. I am able to run the test data without any problem. I thought the problem with ESPRESSO_C was the machine I was running the scripts on, since it has only a few CPUs, so I tried running on a cluster, and now I get an error from ESPRESSO_Q:
No valid read_final.list can be found in Stress4_out/1.
[Fri May 10 15:39:25 2024] Loading annotation
[Fri May 10 15:40:00 2024] Summarizing annotated isoforms
[Fri May 10 15:40:05 2024] Loading corrected splice junctions and alignment information by ESPRESSO
Perl exited with active threads:
    16 running and unjoined
    0 finished and unjoined
    0 running and detached
And when I look at the output files I don't see any read_final.list. I only see temporary files like 10.read_final.tmp, 11.read_final.tmp, and so on.
Regarding the speed of ESPRESSO_C: I am using Nextflow to run my scripts. Is there a way to adapt the snakemake method to Nextflow?
Thank you again for your help!!
Here's the line for that error: https://github.com/Xinglab/espresso/blob/v1.4.0/src/ESPRESSO_Q.pl#L389
It's looking for files like {chr}_read_final.txt in each of the subdirectories that a C step was run for. Basically, at this point in the code the Q step is looking for all the results from the C steps so it can aggregate them. The error is saying that for one of the C steps there aren't any reads in the output, which suggests that there was an error in that C step (maybe the same errors from the original post).
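To see which C step left nothing for the Q step to aggregate, a small check along these lines might help. It is only a sketch: it assumes the output directory contains numbered subdirectories (like Stress4_out/1), and it uses the two file name patterns mentioned in this thread, `*_read_final.txt` for finished results and `*.read_final.tmp` for leftovers. The function names are hypothetical and the demo builds a throwaway directory rather than reading real ESPRESSO output.

```python
from pathlib import Path
import tempfile


def summarize_c_outputs(out_dir):
    """Count final and temporary files in each numbered subdirectory."""
    subdirs = [
        p for p in Path(out_dir).iterdir() if p.is_dir() and p.name.isdigit()
    ]
    summary = {}
    for sub in sorted(subdirs, key=lambda p: int(p.name)):
        summary[sub.name] = {
            # Assumed name pattern for finished per-chromosome results.
            "final": len(list(sub.glob("*_read_final.txt"))),
            # Assumed name pattern for leftover temporary files.
            "tmp": len(list(sub.glob("*.read_final.tmp"))),
        }
    return summary


def incomplete_subdirs(summary):
    """Return the subdirectories that have no final result file at all."""
    return [name for name, counts in summary.items() if counts["final"] == 0]


if __name__ == "__main__":
    # Throwaway layout for demonstration only.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "0").mkdir()
        (root / "0" / "chr1_read_final.txt").write_text("")
        (root / "1").mkdir()
        (root / "1" / "10.read_final.tmp").write_text("")
        (root / "1" / "11.read_final.tmp").write_text("")
        summary = summarize_c_outputs(root)
        print(summary)
        print("incomplete:", incomplete_subdirs(summary))
```

A subdirectory that shows only .tmp files and no final file is a reasonable place to start looking for the C step that failed.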
I'm not very familiar with Nextflow, but the main thing the snakemake does to address the C step running time is to split the C step up into smaller jobs with these two scripts:
split_espresso_s_output_for_c.py: https://github.com/Xinglab/espresso/blob/v1.4.0/snakemake/Snakefile#L458
combine_espresso_c_output_for_q.py: https://github.com/Xinglab/espresso/blob/v1.4.0/snakemake/Snakefile#L552
After the split script is run, the snakemake checks to see how many C jobs it needs to run. It might not be easy to get that to work automatically with Nextflow.
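The shape of that workflow (split, find out how many jobs exist, run them, then combine) can be sketched in plain Python. This is not the snakemake's actual logic and it does not call the two scripts above. It assumes, purely for illustration, that the split step leaves one numbered subdirectory per job, and `discover_jobs`, `run_all`, and the `run_one` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import tempfile


def discover_jobs(work_dir):
    """List the numbered job directories that exist after the split step."""
    job_dirs = [
        p for p in Path(work_dir).iterdir() if p.is_dir() and p.name.isdigit()
    ]
    return sorted(job_dirs, key=lambda p: int(p.name))


def run_all(job_dirs, run_one, max_workers=4):
    """Run run_one(job_dir) for every job and collect results by job name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as job_dirs.
        results = list(pool.map(run_one, job_dirs))
    return {job.name: result for job, result in zip(job_dirs, results)}


if __name__ == "__main__":
    # Throwaway layout standing in for the output of a split step.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("0", "1", "2", "10"):
            (root / name).mkdir()
        jobs = discover_jobs(root)
        print("jobs found:", [job.name for job in jobs])
        # Placeholder job: a real one would launch the C step for job_dir.
        print(run_all(jobs, lambda job_dir: f"done {job_dir.name}"))
```

The point of the sketch is that the number of jobs is only known after the split has run, so any workflow tool has to discover the job list at run time rather than declare it up front.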
Hello @EricKutschera
I have been getting the following error with ESPRESSO_C and I can't figure out why. Can you please help me?
Also, the process takes forever and still throws an error like this.
Thank you for your help