Hi @Xiaofei-git, can you share a log?
2021-09-22 08:44:36,076 Starting quantification mode
2021-09-22 08:44:36,077 Collecting FASTQ files...
2021-09-22 08:44:36,078 The input dataset is considered as a paired-ends dataset.
2021-09-22 08:44:36,078 Collected 1 FASTQ files.
2021-09-22 08:44:36,078 Quantification has been finished.
2021-09-22 08:44:36,078 Running Salmon using Snakemake
Building DAG of jobs...
2021-09-22 08:44:36,199 Building DAG of jobs...
Using shell: /bin/bash
2021-09-22 08:44:36,215 Using shell: /bin/bash
Provided cores: 1
2021-09-22 08:44:36,215 Provided cores: 1
Rules claiming more threads will be scaled down.
2021-09-22 08:44:36,215 Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 collect_abundance
2
2021-09-22 08:44:36,216 Job counts:
count jobs
1 all
1 collect_abundance
2
2021-09-22 08:44:36,216
rule collect_abundance:
output: results_paired/EXPR.csv
jobid: 1
2021-09-22 08:44:36,216 rule collect_abundance:
output: results_paired/EXPR.csv
jobid: 1
2021-09-22 08:44:36,216
Building DAG of jobs...
Using shell: /bin/bash
Job counts:
count jobs
1 collect_abundance
1
Complete log: /tmp/nxf.hXy35XhRYm/.snakemake/log/2021-09-22T084436.378517.snakemake.log
Finished job 1.
2021-09-22 08:44:36,672 Finished job 1.
1 of 2 steps (50%) done
2021-09-22 08:44:36,672 1 of 2 steps (50%) done
2021-09-22 08:44:36,673
localrule all:
input: results_paired/EXPR.csv
jobid: 0
2021-09-22 08:44:36,673 localrule all:
input: results_paired/EXPR.csv
jobid: 0
2021-09-22 08:44:36,673
Finished job 0.
2021-09-22 08:44:36,674 Finished job 0.
2 of 2 steps (100%) done
2021-09-22 08:44:36,674 2 of 2 steps (100%) done
Complete log: /tmp/nxf.hXy35XhRYm/.snakemake/log/2021-09-22T084436.177434.snakemake.log
2021-09-22 08:44:36,674 Complete log: /tmp/nxf.hXy35XhRYm/.snakemake/log/2021-09-22T084436.177434.snakemake.log
$ more 2021-09-22T084436.177434.snakemake.log
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 collect_abundance
2
rule collect_abundance:
output: results_paired/EXPR.csv
jobid: 1
Finished job 1.
1 of 2 steps (50%) done
localrule all:
input: results_paired/EXPR.csv
jobid: 0
Finished job 0.
2 of 2 steps (100%) done
Complete log: /tmp/nxf.hXy35XhRYm/.snakemake/log/2021-09-22T084436.177434.snakemake.log
$ more 2021-09-22T084436.378517.snakemake.log
Building DAG of jobs...
Using shell: /bin/bash
Job counts:
count jobs
1 collect_abundance
1
Complete log: /tmp/nxf.hXy35XhRYm/.snakemake/log/2021-09-22T084436.378517.snakemake.log
@Xiaofei-git, sorry for the late response. It seems that quantification has been done, but it doesn't create the EXPR.csv file. Can you let me know what command line you used and where the files are stored (local or S3)?
Thank you,
Hyun-Hwan Jeong
Hi @hyunhwan-jeong, we used Nextflow to submit AWS Batch jobs. Here is the relevant part of the code for the paired-end process. The files are stored/published in S3.
Thanks a lot for your help!
process TEdiscoveryPaired {
    tag "${samplePaired}"
    publishDir path: params.outputDir, saveAs: { dirname -> "$samplePaired"+'_paired_results' }, mode: 'copy', overwrite: true
    time '1h'
    memory '16 GB'
    disk '100 GB'
    cpus params.cpus
    echo false
    errorStrategy 'ignore'
    stageInMode 'symlink'
    stageOutMode 'rsync'

    input:
    val(reference) from params.reference
    tuple val(samplePaired), path(read1), path(read2) from paired_fastq_ch
    val(threads) from params.salmonTE.threads

    output:
    path('results_paired') optional true into paired_output_ch

    when:
    read1.toString().startsWith('OK_') && read2.toString().startsWith('OK_')

    script:
    """
    echo "Running SalmonTE: Paired-end reads"
    output_file="\$(date +"%Y-%m-%dT%H%M")."$samplePaired".paired.out"
    mkdir FASTQ
    mv $read1 FASTQ/
    mv $read2 FASTQ/
    SalmonTE.py quant --reference=$reference --num_threads=$threads --outpath=results_paired FASTQ/ 2> "\$output_file"
    mv .snakemake/log/ results_paired/
    mv "\$output_file" results_paired/log
    """
}
@hyunhwan-jeong Do you know why there is no "run_salmon_fq"? The log below is what I am expecting, but there is no "run_salmon_fq" in my log file above.
Job counts:
count jobs
1 all
1 collect_abundance
2 run_salmon_fq
4
2021-09-22 08:28:32,737 Job counts:
count jobs
1 all
1 collect_abundance
2 run_salmon_fq
4
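For context on the question above: the missing run_salmon_fq entries mean no per-sample Salmon job was ever added to the Snakemake DAG, so collect_abundance has nothing to aggregate and EXPR.csv comes out empty. A minimal sketch of that relationship (purely illustrative Python, not SalmonTE's actual Snakefile):

# Purely illustrative: the number of run_salmon_fq jobs tracks the list of
# samples that the FASTQ-collection step registered.
samples_detected = []  # the log above suggests no paired sample was registered
jobs = ["all", "collect_abundance"] + ["run_salmon_fq ({})".format(s) for s in samples_detected]
print(len(jobs), "jobs:", jobs)  # 2 jobs -> matches the observed "Job counts"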
Here is the .command.sh for one of the samples from AWS Batch:
#!/bin/bash -ue
echo "Running SalmonTE: Paired-end reads"
output_file="$(date +"%Y-%m-%dT%H%M")."tcga-bam-4".paired.out"
mkdir FASTQ
mv OK_tcga-bam-4_R1.fastq.gz FASTQ/
mv OK_tcga-bam-4_R2.fastq.gz FASTQ/
SalmonTE.py quant --reference=hs --num_threads=2 --exprtype=count --outpath=results_paired FASTQ/ 2> "$output_file"
mv .snakemake/log/ results_paired/
mv "$output_file" results_paired/log
@Xiaofei-git, sorry for the late response again. I had a personal matter so I was not able to respond. Do you still have the problem? If so, would you mind creating an account on your AWS account?
Thank you,
Hyun-Hwan Jeong
Thank you so much for your reply!
I think we have fixed this issue. We built a new Docker image with the updated SalmonTE version 0.4 and an upgraded Snakemake, and also changed "is" to "==" in snakemake/Snakemake.paired (line 58).
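For reference, Python's "is" tests object identity while "==" tests value equality, so an "is" comparison against a string built at runtime can evaluate to False even when the values match. A minimal illustration (not the actual SalmonTE code):

expected = ".fastq.gz"
ext = "".join([".fastq", ".gz"])  # built at runtime -> a distinct str object
print(ext == expected)            # True: compares values
print(ext is expected)            # False: compares object identity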
Thanks a lot!
Xiaofei
Dear @hyunhwan-jeong,
I ran SalmonTE.py quant on AWS Batch using the Docker image wwliao/salmonte:latest and got an empty EXPR.csv, the same issue as https://github.com/hyunhwan-jeong/SalmonTE/issues/18 . I also tried changing the names, but that did not solve the problem.
Could you please let me know what other information you need to fix the issue? Do you think it is associated with the SalmonTE or Snakemake version?
The issue only happens with paired-end data. The job status is SUCCEEDED and no error is reported.
If there are both CTRL_R1.fastq and CTRL_R2.fastq, EXPR.csv is empty.
If there is only CTRL_1_R1.fastq, it works.
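For context on the naming above, here is a hypothetical grouping check (an assumption for illustration, not SalmonTE's actual paired-end detection code) showing how CTRL_R1.fastq and CTRL_R2.fastq would be expected to collapse into a single paired sample:

import re

# Hypothetical grouping by the _R1/_R2 suffix; illustration only.
files = ["CTRL_R1.fastq", "CTRL_R2.fastq"]
samples = {re.sub(r"_R[12]\.fastq(\.gz)?$", "", f) for f in files}
print(samples)  # {'CTRL'} -> one sample with two mates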