hyunhwan-jeong / SalmonTE

SalmonTE is an ultra-Fast and Scalable Quantification Pipeline of Transposable Element (TE) Abundances
GNU General Public License v3.0

Output directories must be flagged with directory() #14

Closed Puputnik closed 5 years ago

Puputnik commented 5 years ago

Hello, I tried to run SalmonTE both on the example data and on my own data, and the tool is failing. This is the terminal output:

SalmonTE.py quant --reference=hs '/home/filippo/SalmonTE/example/CTRL_1_R1.fastq' '/home/filippo/SalmonTE/example/CTRL_1_R2.fastq'
2018-07-27 17:42:44,688 Starting quantification mode
2018-07-27 17:42:44,688 Collecting FASTQ files...
['/home/filippo/SalmonTE/example/CTRL_1_R1.fastq', '/home/filippo/SalmonTE/example/CTRL_1_R2.fastq']
2018-07-27 17:42:44,688 The input dataset is considered as a paired-ends dataset.
CTRL_1_R1.fastq CTRL_1_R2.fastq
2018-07-27 17:42:44,688 Collected 1 FASTQ files.
2018-07-27 17:42:44,688 Quantification has been finished.
2018-07-27 17:42:44,688 Running Salmon using Snakemake
Building DAG of jobs...
2018-07-27 17:42:44,749 Building DAG of jobs...
Using shell: /bin/bash
2018-07-27 17:42:44,757 Using shell: /bin/bash
Provided cores: 1
2018-07-27 17:42:44,757 Provided cores: 1
Rules claiming more threads will be scaled down.
2018-07-27 17:42:44,757 Rules claiming more threads will be scaled down.
Job counts:
  count jobs
  1 all
  1 collect_abundance
  1 run_salmon_fq
  3
2018-07-27 17:42:44,757 Job counts:
  count jobs
  1 all
  1 collect_abundance
  1 run_salmon_fq
  3

2018-07-27 17:42:44,758 
rule run_salmon_fq:
    input: /home/filippo/SalmonTE/reference/hs, /tmp/tmp9fe3ywrf/CTRL_1_R1.fastq, /tmp/tmp9fe3ywrf/CTRL_1_R2.fastq
    output: /home/filippo/SalmonTE_output/CTRL_1
    jobid: 2
    wildcards: sample_fq=CTRL_1
2018-07-27 17:42:44,758 rule run_salmon_fq:
    input: /home/filippo/SalmonTE/reference/hs, /tmp/tmp9fe3ywrf/CTRL_1_R1.fastq, /tmp/tmp9fe3ywrf/CTRL_1_R2.fastq
    output: /home/filippo/SalmonTE_output/CTRL_1
    jobid: 2
    wildcards: sample_fq=CTRL_1

2018-07-27 17:42:44,758 
Version Info: ### A newer version of Salmon is available. ####
The newest version, available at https://github.com/COMBINE-lab/salmon/releases contains new features, improvements, and bug fixes; please upgrade at your earliest convenience.

ImproperOutputException in line 17 of /home/filippo/SalmonTE/snakemake/Snakefile.paired:
Outputs of incorrect type (directories when expecting files or vice versa). Output directories must be flagged with directory(). for rule run_salmon_fq:
/home/filippo/SalmonTE_output/CTRL_1
2018-07-27 17:42:45,267 ImproperOutputException in line 17 of /home/filippo/SalmonTE/snakemake/Snakefile.paired:
Outputs of incorrect type (directories when expecting files or vice versa). Output directories must be flagged with directory(). for rule run_salmon_fq:
/home/filippo/SalmonTE_output/CTRL_1
Removing output files of failed job run_salmon_fq since they might be corrupted:
/home/filippo/SalmonTE_output/CTRL_1
2018-07-27 17:42:45,268 Removing output files of failed job run_salmon_fq since they might be corrupted:
/home/filippo/SalmonTE_output/CTRL_1
Skipped removing non-empty directory /home/filippo/SalmonTE_output/CTRL_1
2018-07-27 17:42:45,268 Skipped removing non-empty directory /home/filippo/SalmonTE_output/CTRL_1
Shutting down, this might take some time.
2018-07-27 17:42:45,270 Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
2018-07-27 17:42:45,270 Exiting because a job execution failed. Look above for error message
Complete log: /home/filippo/.snakemake/log/2018-07-27T174244.739266.snakemake.log
2018-07-27 17:42:45,271 Complete log: /home/filippo/.snakemake/log/2018-07-27T174244.739266.snakemake.log
Traceback (most recent call last):
  File "/home/filippo/SalmonTE/SalmonTE.py", line 276, in <module>
    run(args)
  File "/home/filippo/SalmonTE/SalmonTE.py", line 235, in run
    run_salmon(param)
  File "/home/filippo/SalmonTE/SalmonTE.py", line 153, in run_salmon
    with open(os.path.join(param["--outpath"], "EXPR.csv" ), "r") as inp:
FileNotFoundError: [Errno 2] No such file or directory: '/home/filippo/SalmonTE_output/EXPR.csv'

Can you help me?

quirze commented 5 years ago

Hello, I've experienced the same issue, both on my own data and on the example data from the repository. I also tried using only the first read of each pair from the example data and got the same error:

[qrovira@node-10-07 testing_SalmonTE]$ SalmonTE.py quant --reference=hs --num_threads=6 /home/qrovira/bin/SalmonTE/example_single
2018-07-27 18:11:56,155 Starting quantification mode
2018-07-27 18:11:56,155 Collecting FASTQ files...
['/home/qrovira/bin/SalmonTE/example_single']
2018-07-27 18:11:56,176 The input dataset is considered as a single-end dataset.
2018-07-27 18:11:56,176 The input dataset is considered as a single-end dataset.
2018-07-27 18:11:56,176 The input dataset is considered as a single-end dataset.
2018-07-27 18:11:56,177 The input dataset is considered as a single-end dataset.
2018-07-27 18:11:56,177 Collected 4 FASTQ files.
2018-07-27 18:11:56,177 Quantification has been finished.
2018-07-27 18:11:56,177 Running Salmon using Snakemake
Building DAG of jobs...
2018-07-27 18:11:57,001 Building DAG of jobs...
Using shell: /bin/bash
2018-07-27 18:11:57,119 Using shell: /bin/bash
Provided cores: 1
2018-07-27 18:11:57,120 Provided cores: 1
Rules claiming more threads will be scaled down.
2018-07-27 18:11:57,121 Rules claiming more threads will be scaled down.
Job counts:
  count jobs
  1 all
  1 collect_abundance
  4 run_salmon_fq
  6
2018-07-27 18:11:57,123 Job counts:
  count jobs
  1 all
  1 collect_abundance
  4 run_salmon_fq
  6

2018-07-27 18:11:57,127 
rule run_salmon_fq:
    input: /home/qrovira/bin/SalmonTE/reference/hs, /tmp/tmpmva2jnmk/CTRL_1_R1.fastq
    output: /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
    jobid: 2
    wildcards: sample_fq=CTRL_1_R1
2018-07-27 18:11:57,128 rule run_salmon_fq:
    input: /home/qrovira/bin/SalmonTE/reference/hs, /tmp/tmpmva2jnmk/CTRL_1_R1.fastq
    output: /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
    jobid: 2
    wildcards: sample_fq=CTRL_1_R1

2018-07-27 18:11:57,129 
Version Info: ### A newer version of Salmon is available. ####
###
The newest version, available at https://github.com/COMBINE-lab/salmon/releases
contains new features, improvements, and bug fixes; please upgrade at your
earliest convenience.
###
### salmon (mapping-based) v0.8.2
### [ program ] => salmon 
### [ command ] => quant 
### [ index ] => { /home/qrovira/bin/SalmonTE/reference/hs }
### [ libType ] => { A }
### [ unmatedReads ] => { /tmp/tmpmva2jnmk/CTRL_1_R1.fastq }
### [ output ] => { /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1 }
### [ threads ] => { 6 }
Logs will be written to /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1/logs
[2018-07-27 18:11:57.240] [jointLog] [info] parsing read library format
[2018-07-27 18:11:57.240] [jointLog] [info] There is 1 library.
[2018-07-27 18:11:57.323] [stderrLog] [info] Loading Suffix Array 
[2018-07-27 18:11:57.321] [jointLog] [info] Loading Quasi index
[2018-07-27 18:11:57.322] [jointLog] [info] Loading 32-bit quasi index
[2018-07-27 18:11:57.333] [stderrLog] [info] Loading Transcript Info 
[2018-07-27 18:11:57.336] [stderrLog] [info] Loading Rank-Select Bit Array
[2018-07-27 18:11:57.337] [stderrLog] [info] There were 687 set bits in the bit array
[2018-07-27 18:11:57.338] [stderrLog] [info] Computing transcript lengths
[2018-07-27 18:11:57.338] [stderrLog] [info] Waiting to finish loading hash
[2018-07-27 18:11:57.468] [stderrLog] [info] Done loading index
[2018-07-27 18:11:57.468] [jointLog] [info] done
[2018-07-27 18:11:57.468] [jointLog] [info] Index contained 687 targets

[2018-07-27 18:11:57.770] [jointLog] [info] Computed 14 rich equivalence classes for further processing
[2018-07-27 18:11:57.770] [jointLog] [info] Counted 24 total reads in the equivalence classes 
[2018-07-27 18:11:57.777] [jointLog] [warning] Only 24 fragments were mapped, but the number of burn-in fragments was set to 5000000.
The effective lengths have been computed using the observed mappings.

[2018-07-27 18:11:57.777] [jointLog] [info] Mapping rate = 0.96%

[2018-07-27 18:11:57.777] [jointLog] [info] finished quantifyLibrary()
[2018-07-27 18:11:57.780] [jointLog] [info] Starting optimizer
[2018-07-27 18:11:57.784] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate
[2018-07-27 18:11:57.784] [jointLog] [info] iteration = 0 | max rel diff. = 0.927762
[2018-07-27 18:11:57.785] [jointLog] [info] iteration = 50 | max rel diff. = 0.002612
[2018-07-27 18:11:57.785] [jointLog] [info] Finished optimizer
[2018-07-27 18:11:57.785] [jointLog] [info] writing output 

[2018-07-27 18:11:57.791] [jointLog] [warning] NOTE: Read Lib [/tmp/tmpmva2jnmk/CTRL_1_R1.fastq] :

Detected a *potential* strand bias > 1% in an unstranded protocol check the file: /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1/lib_format_counts.json for details

ImproperOutputException in line 17 of /home/qrovira/bin/SalmonTE/snakemake/Snakefile.single:
Outputs of incorrect type (directories when expecting files or vice versa). Output directories must be flagged with directory(). for rule run_salmon_fq:
/home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
2018-07-27 18:11:58,096 ImproperOutputException in line 17 of /home/qrovira/bin/SalmonTE/snakemake/Snakefile.single:
Outputs of incorrect type (directories when expecting files or vice versa). Output directories must be flagged with directory(). for rule run_salmon_fq:
/home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
Removing output files of failed job run_salmon_fq since they might be corrupted:
/home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
2018-07-27 18:11:58,098 Removing output files of failed job run_salmon_fq since they might be corrupted:
/home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
Skipped removing non-empty directory /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
2018-07-27 18:11:58,099 Skipped removing non-empty directory /home/qrovira/testing_SalmonTE/SalmonTE_output/CTRL_1_R1
Shutting down, this might take some time.
2018-07-27 18:11:58,102 Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
2018-07-27 18:11:58,103 Exiting because a job execution failed. Look above for error message
Complete log: /home/qrovira/testing_SalmonTE/.snakemake/log/2018-07-27T181156.903453.snakemake.log
2018-07-27 18:11:58,104 Complete log: /home/qrovira/testing_SalmonTE/.snakemake/log/2018-07-27T181156.903453.snakemake.log
Traceback (most recent call last):
  File "/home/qrovira/bin/SalmonTE/SalmonTE.py", line 276, in <module>
    run(args)
  File "/home/qrovira/bin/SalmonTE/SalmonTE.py", line 235, in run
    run_salmon(param)
  File "/home/qrovira/bin/SalmonTE/SalmonTE.py", line 153, in run_salmon
    with open(os.path.join(param["--outpath"], "EXPR.csv" ), "r") as inp:
FileNotFoundError: [Errno 2] No such file or directory: '/home/qrovira/testing_SalmonTE/SalmonTE_output/EXPR.csv'

It looks like there might be a bug in the files snakemake/Snakefile.single and snakemake/Snakefile.paired.
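The final FileNotFoundError in the traceback looks like a follow-on symptom rather than a separate problem: once run_salmon_fq fails, Snakemake stops, so EXPR.csv is presumably never written and the open() call at line 153 of SalmonTE.py has nothing to read. A minimal sketch of a guard that would make this clearer is below. Note that read_expression_table is a hypothetical helper, not a function in SalmonTE, and the message wording is an assumption.

```python
import os


def read_expression_table(outpath):
    """Return the contents of EXPR.csv, or fail with a pointer to the likely cause."""
    expr_path = os.path.join(outpath, "EXPR.csv")
    if not os.path.isfile(expr_path):
        # Hypothetical message: a missing table may mean the Snakemake run
        # stopped before the table was produced.
        raise RuntimeError(
            "EXPR.csv not found in {}; check the Snakemake log for the "
            "first failed job".format(outpath)
        )
    with open(expr_path, "r") as inp:
        return inp.read()
```

With a check like this, the user would be sent to the ImproperOutputException first instead of the secondary traceback.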

I would also appreciate some help. Thank you.

hyunhwan-jeong commented 5 years ago

@Puputnik and @quirze,

Can you show me the list of files in the SalmonTE_output directory using the command ls SalmonTE_output/*? What is your version of snakemake (you can look it up using snakemake -v)?

Hyun-Hwan Jeong

quirze commented 5 years ago

The files in the SalmonTE_output folder are the following:

[qrovira@node-head1 testing_SalmonTE]$ ls SalmonTE_output/*
aux_info  cmd_info.json  lib_format_counts.json  libParams  logs  quant.sf
[qrovira@node-head1 testing_SalmonTE]$ tree SalmonTE_output/
SalmonTE_output/
└── CTRL_1_R1
    ├── aux_info
    │   ├── ambig_info.tsv
    │   ├── expected_bias.gz
    │   ├── fld.gz
    │   ├── meta_info.json
    │   ├── observed_bias_3p.gz
    │   └── observed_bias.gz
    ├── cmd_info.json
    ├── lib_format_counts.json
    ├── libParams
    │   └── flenDist.txt
    ├── logs
    │   └── salmon_quant.log
    └── quant.sf

hyunhwan-jeong commented 5 years ago

@quirze what about the version of snakemake? I am also wondering how many FASTQ files are in your /home/qrovira/bin/SalmonTE/example_single directory. Do you only have a pair of FASTQ files? I ask because I am seeing 2018-07-27 18:11:56,177 Collected 4 FASTQ files. It would be helpful if you could show me the output of tree /home/qrovira/bin/SalmonTE/example_single and the head of each FASTQ file in the directory.

Thank you!

Hyun-Hwan Jeong

quirze commented 5 years ago

When I check the version of snakemake I get an error, as if it were not installed:

[qrovira@node-head1 ~]$ snakemake -h
-bash: snakemake: command not found

However, I did run pip3 install snakemake with no errors.

Regarding the FASTQ files, I used the ones in the example to test paired-end reads, and then I took only the first mate to try the single-end mode:

[qrovira@node-head1 ~]$ tree /home/qrovira/bin/SalmonTE/example_single
/home/qrovira/bin/SalmonTE/example_single
├── CTRL_1_R1.fastq
├── CTRL_2_R1.fastq
├── TARDBP_1_R1.fastq
└── TARDBP_2_R1.fastq

0 directories, 4 files
[qrovira@node-head1 ~]$ tree /home/qrovira/bin/SalmonTE/example
/home/qrovira/bin/SalmonTE/example
├── CTRL_1_R1.fastq
├── CTRL_1_R2.fastq
├── CTRL_2_R1.fastq
├── CTRL_2_R2.fastq
├── TARDBP_1_R1.fastq
├── TARDBP_1_R2.fastq
├── TARDBP_2_R1.fastq
└── TARDBP_2_R2.fastq

0 directories, 8 files

Is there a problem with the snakemake installation?
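One way to narrow this down, as a sketch: ask Python itself whether the snakemake package is installed, separately from whether a snakemake executable is on the PATH. The two can differ, for example when pip3 places the launcher script in a directory that the shell does not search. The helper name snakemake_status is made up for illustration; importlib.metadata needs Python 3.8 or later, and the script has to be run with the same python3 that pip3 installs into.

```python
import shutil
from importlib import metadata


def snakemake_status():
    """Return (package_version_or_None, executable_path_or_None)."""
    try:
        # Version of the installed distribution, independent of PATH.
        version = metadata.version("snakemake")
    except metadata.PackageNotFoundError:
        version = None
    # Location of the command-line launcher, if the shell can find one.
    return version, shutil.which("snakemake")


if __name__ == "__main__":
    version, cli = snakemake_status()
    print("package version:", version)
    print("executable on PATH:", cli)
```

If the first value is set but the second is None, the package is installed and only the launcher is missing from the PATH.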

frankyan commented 5 years ago

I also encountered this problem. In my case it was caused by Snakemake version 5.2. Since version 5.2, directory outputs have to be marked with directory(). I then installed Snakemake version 5.1.5, and everything works now.
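To make the version boundary concrete, here is a minimal sketch that decides from a version string whether a rule with a directory output would need the directory() flag. The function names are hypothetical, and the 5.2 threshold is taken from the observation above rather than from a check of the Snakemake changelog.

```python
def parse_version(text):
    """Turn '5.1.5' into (5, 1, 5); stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def requires_directory_flag(snakemake_version):
    """True if this Snakemake version is 5.2 or newer (assumed threshold)."""
    return parse_version(snakemake_version) >= (5, 2)


if __name__ == "__main__":
    for v in ("5.1.5", "5.2.0"):
        print(v, requires_directory_flag(v))
```

On the newer versions, the error message itself names the remedy: the output path of the rule has to be wrapped in directory().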

hyunhwan-jeong commented 5 years ago

@frankyan thanks for reporting that. Yes, you're right. There is an incompatibility between the SalmonTE rules and the latest version of snakemake. I will look into the issue.

@quirze and @Puputnik, could you please uninstall your current snakemake installation and install snakemake 5.1.5 to see whether it solves your issue?

Thank you all!

Hyun-Hwan Jeong

quirze commented 5 years ago

Indeed, after uninstalling snakemake and installing version 5.1.5, I no longer get an error when running SalmonTE.

pip3 uninstall snakemake
pip3 install -Iv snakemake==5.1.5

Thank you for the help.

Puputnik commented 5 years ago

Thank you very much! I will check tomorrow morning whether it works.

Thank you again!

Puputnik commented 5 years ago

It works for me too!

Thank you very much.

hyunhwan-jeong commented 5 years ago

@Puputnik and @quirze glad to hear you've solved the issue, and thanks again to @frankyan! I will look into how to fix the issue for the latest version of snakemake.

Best,

Hyun-Hwan Jeong

hyunhwan-jeong commented 5 years ago

Now SalmonTE can work with the latest version of snakemake!

Hyun-Hwan Jeong