hyunhwan-jeong / SalmonTE

SalmonTE is an ultra-Fast and Scalable Quantification Pipeline of Transposable Element (TE) Abundances
GNU General Public License v3.0

Exiting because a job execution failed. #68

Open Bioleonard opened 2 years ago

Bioleonard commented 2 years ago

Hello,

Thanks for developing this powerful tool! I got an error when running SalmonTE.py in quant mode.

The salmon quant step looks fine, but no EXPR.csv or clades.csv files were generated in the outpath. Do I need to create any files before running quant? Here is the output:

2022-06-01 17:03:32,068 Collecting FASTQ files...
2022-06-01 17:03:32,068 SalmonTE assumes that '/home/bioleon/project/methyl/rnaseq/readsQC/' is a directory, and SalmonTE will search any FASTQ file in the directory.
2022-06-01 17:03:32,083 The input dataset is considered as a paired-ends dataset.
2022-06-01 17:03:32,084 Collected 6 FASTQ files.
2022-06-01 17:03:32,084 Quantification has been finished.
2022-06-01 17:03:32,084 Running Salmon using Snakemake
Building DAG of jobs...
2022-06-01 17:03:32,276 Building DAG of jobs...
Using shell: /usr/bin/bash
2022-06-01 17:03:32,322 Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
2022-06-01 17:03:32,322 Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
2022-06-01 17:03:32,322 Rules claiming more threads will be scaled down.
Job stats:
job                    count    min threads    max threads
-------------------  -------  -------------  -------------
all                        1              1              1
collect_abundance          1              1              1
collect_mappability        1              1              1
run_salmon_gz              6              1              1
total                      9              1              1

2022-06-01 17:03:32,326 Job stats:
job                    count    min threads    max threads
-------------------  -------  -------------  -------------
all                        1              1              1
collect_abundance          1              1              1
collect_mappability        1              1              1
run_salmon_gz              6              1              1
total                      9              1              1

Select jobs to execute...
2022-06-01 17:03:32,327 Select jobs to execute...
Select jobs to execute...
2022-06-01 17:06:45,402 Select jobs to execute...
Select jobs to execute...
2022-06-01 17:08:41,318 Select jobs to execute...
Select jobs to execute...
2022-06-01 17:11:19,579 Select jobs to execute...
Select jobs to execute...
2022-06-01 17:13:04,265 Select jobs to execute...
Select jobs to execute...
2022-06-01 17:14:57,090 Select jobs to execute...
Select jobs to execute...
2022-06-01 17:15:07,255 Select jobs to execute...
Shutting down, this might take some time.
2022-06-01 17:15:07,808 Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
2022-06-01 17:15:07,808 Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-06-01T170332.217095.snakemake.log
2022-06-01 17:15:07,809 Complete log: .snakemake/log/2022-06-01T170332.217095.snakemake.log
Traceback (most recent call last):
  File "/home/bioleon/github/SalmonTE//SalmonTE.py", line 292, in <module>
    run(args)
  File "/home/bioleon/github/SalmonTE//SalmonTE.py", line 243, in run
    run_salmon(param)
  File "/home/bioleon/github/SalmonTE//SalmonTE.py", line 156, in run_salmon
    with open(os.path.join(param["--outpath"], "EXPR.csv" ), "r") as inp:
FileNotFoundError: [Errno 2] No such file or directory: '/home/bioleon/project/methyl/rnaseq/SalmonTE_output/EXPR.csv'

I get the salmon quant results in the outpath, but none of the other SalmonTE outputs.

These are the contents of my outpath:

SRR14005146  SRR14005147  SRR14005149  SRR14005151  SRR14005152  SRR14005153

cd SRR14005146 

ls
aux_info  cmd_info.json  libParams  lib_format_counts.json  logs  quant.sf

hyunhwan-jeong commented 2 years ago

Hi @Bioleonard, what do you see when you open .snakemake/log/2022-06-01T170332.217095.snakemake.log? Also, can you check that you are able to run salmon in your terminal by typing salmon?
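A minimal sketch of that check in Python, for anyone who prefers to script it. It only reports whether each program name resolves on the current PATH. The list of names is an assumption for illustration (this thread mentions salmon, Snakemake, and R), and SalmonTE may locate its own salmon binary differently.

```python
import shutil


def find_tools(names=("salmon", "snakemake", "Rscript")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The default names are assumptions taken from this thread, not a
    definitive list of SalmonTE's requirements.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

A tool reported as not found here is not necessarily the cause of the failure, but it narrows down where to look.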

Bioleonard commented 2 years ago

I don't think the log file has any more details, and my salmon runs without problems.

$ cat 2022-06-01T170332.217095.snakemake.log

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                    count    min threads    max threads
-------------------  -------  -------------  -------------
all                        1              1              1
collect_abundance          1              1              1
collect_mappability        1              1              1
run_salmon_gz              6              1              1
total                      9              1              1

Select jobs to execute...
Select jobs to execute...
Select jobs to execute...
Select jobs to execute...
Select jobs to execute...
Select jobs to execute...
Select jobs to execute...
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-06-01T170332.217095.snakemake.log

I think the failed job may be the step that summarizes the salmon results into the SalmonTE-specific output, because I only have the salmon results. Could it be because I did not install R? Does quant mode require R? I didn't install R because I could never resolve the package's dependencies.

By the way, I downloaded the SalmonTE ZIP file from the GitHub website and then moved it to my WSL2 Ubuntu system. This created a lot of problems: I need to chmod the files to make SalmonTE executable, but so far that has not been successful.
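If the execute bits were lost when the ZIP was unpacked and moved (an assumption, not something confirmed in this thread), a short script along these lines could list the affected files. It treats any file starting with a shebang or an ELF header as a program; the function name `non_executable_files` is hypothetical.

```python
import os


def non_executable_files(root):
    """Yield files under root that look like programs but lack the execute bit.

    "Looks like a program" is a heuristic: the file starts with a shebang
    (#!) or with the ELF magic bytes used by Linux binaries.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    magic = handle.read(4)
            except OSError:
                continue  # unreadable file, skip it
            looks_runnable = magic.startswith(b"#!") or magic == b"\x7fELF"
            if looks_runnable and not os.access(path, os.X_OK):
                yield path
```

Calling `list(non_executable_files("/home/bioleon/github/SalmonTE"))` would print the candidates for `chmod +x`; the path is the checkout location shown in the traceback above.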

Given the GitHub access issues and the package dependencies, I think the best solution would be to upload SalmonTE to Bioconda for ease of use.

In addition, I manually loaded the quantification results above into R and found that the biological replicates do not cluster well. Is this common for transposon sequences?
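For reference, the manual integration described here can also be sketched in Python. The example assumes each sample directory in the outpath contains a standard salmon quant.sf file (tab-separated, with Name and TPM columns). The function names and the output file name are hypothetical, and the resulting table is not guaranteed to match the layout of SalmonTE's own EXPR.csv.

```python
import csv
import os


def collect_tpm(outpath):
    """Read <outpath>/<sample>/quant.sf for every sample directory.

    Returns (samples, table), where table maps each target name to a
    dict of {sample: TPM}. Directories without quant.sf are skipped.
    """
    samples = []
    table = {}
    for sample in sorted(os.listdir(outpath)):
        quant = os.path.join(outpath, sample, "quant.sf")
        if not os.path.isfile(quant):
            continue
        samples.append(sample)
        with open(quant, newline="") as handle:
            for row in csv.DictReader(handle, delimiter="\t"):
                table.setdefault(row["Name"], {})[sample] = float(row["TPM"])
    return samples, table


def write_matrix(samples, table, destination):
    """Write one row per target and one TPM column per sample as CSV."""
    with open(destination, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["name"] + samples)
        for name in sorted(table):
            writer.writerow([name] + [table[name].get(s, 0.0) for s in samples])
```

A possible use would be `samples, table = collect_tpm("/home/bioleon/project/methyl/rnaseq/SalmonTE_output")` followed by `write_matrix(samples, table, "tpm_matrix.csv")`. Targets missing from a sample are written as 0.0, which is a design choice of this sketch.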

Anyway, thanks a lot!