maxplanck-ie / snakepipes

Customizable workflows based on snakemake and python for the analysis of NGS data
http://snakepipes.readthedocs.io
ATAC example miss /bamCoverage/*.filtered.seq_depth_norm.bw files #910

Open HamletShaoE opened 1 year ago

HamletShaoE commented 1 year ago

I have installed snakePipes and downloaded the ATAC-seq example for a test run. However, the example does not run as I expected. I first ran the command suggested in command.sh and got this:

    (snakePipes) shaoyi@login02 ~/DATA/snakePipe/testData/ATAC $ ATAC-seq -d . --sampleSheet sampleSheet.tsv --DAG dm6_ensembl_release94
    Sample sheet found and header is ok!

    ---- This analysis has been done using snakePipes version 2.7.3 ----
    Sample sheet found and header is ok!
    SystemExit in line 33 of /GPFS/zhangli_lab_permanent/shaoyi/software/mambaforge/envs/snakePipes/lib/python3.11/site-packages/snakePipes/workflows/ATAC-seq/internals.snakefile:
    1
    File "/GPFS/zhangli_lab_permanent/shaoyi/software/mambaforge/envs/snakePipes/lib/python3.11/site-packages/snakePipes/workflows/ATAC-seq/Snakefile", line 26, in
    File "/GPFS/zhangli_lab_permanent/shaoyi/software/mambaforge/envs/snakePipes/lib/python3.11/site-packages/snakePipes/workflows/ATAC-seq/internals.snakefile", line 33, in
    File "", line 26, in call
    ERROR: Required file "/GPFS/zhangli_lab_permanent/shaoyi/snakePipe/testData/ATAC/bamCoverage/SRR7013050.filtered.seq_depth_norm.bw" for sample "SRR7013050" specified in configuration file is NOT available.
    Error: snakemake returned an error code of 1, so processing is incomplete!

I suppose this is because bamCoverage/SRR7013050.filtered.seq_depth_norm.bw is missing. I noticed that the files in the filtered_bam folder are links to files in Bowtie2, so I ran again with the --fromBAM flag. This ran for a while and the Slurm submission went well, yet it ended with an error in the link_bam_bai_external step. The error message is:

    (snakePipes) shaoyi@login01 ~/DATA/snakePipe/testData/ATAC $ ATAC-seq -d . --sampleSheet sampleSheet.tsv --DAG --fromBAM filtered_bam dm6_ensembl_release94
    ...
    [Thu Jul 27 11:25:09 2023]
    Error in rule link_bam_bai_external:
        jobid: 10
        input: EXTERNAL_BAM/SRR7013046.bam, EXTERNAL_BAM/SRR7013046.bam.bai
        output: filtered_bam/SRR7013046.filtered.bam, filtered_bam/SRR7013046.filtered.bam.bai
        shell:

                ln -s ../EXTERNAL_BAM/SRR7013046.bam filtered_bam/SRR7013046.filtered.bam;
                ln -s ../EXTERNAL_BAM/SRR7013046.bam.bai filtered_bam/SRR7013046.filtered.bam.bai

            (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
        cluster_jobid: Submitted batch job 5652530

    Error executing rule link_bam_bai_external on cluster (jobid: 10, external: Submitted batch job 5652530, jobscript: /GPFS/zhangli_lab_permanent/shaoyi/snakePipe/testData/ATAC/.snakemake/tmp.onwy0y99/snakejob.link_bam_bai_external.10.sh). For error details see the cluster log and the log files of the involved rule(s).
    Exiting because a job execution failed. Look above for error message
    Complete log: .snakemake/log/2023-07-27T110334.970475.snakemake.log

    !!! ERROR in the ATAC-seq open chromatin workflow! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    Error: snakemake returned an error code of 1, so processing is incomplete!

I notice that the files in EXTERNAL_BAM are links to the files in filtered_bam, and this command is trying to link the files in filtered_bam back to EXTERNAL_BAM.
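One plausible reading of this failure, stated as an assumption because the cluster log is not shown in this thread: `ln -s` without `-f` refuses to overwrite a destination that already exists, and the test data already ships a link at filtered_bam/SRR7013046.filtered.bam. The sketch below recreates that layout in a throwaway directory with a stand-in file named sample.bam (a hypothetical name) and does not call snakePipes at all.

```shell
#!/bin/sh
# Recreate the suspected situation in a temporary directory.
# Assumption: the layout mirrors the extracted test data; sample.bam is a stand-in.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
mkdir Bowtie2 filtered_bam EXTERNAL_BAM
: > Bowtie2/sample.bam                                   # empty placeholder for a BAM file

# Link shipped with the test data: filtered_bam -> Bowtie2
ln -s ../Bowtie2/sample.bam filtered_bam/sample.filtered.bam
# Link observed after running with --fromBAM filtered_bam: EXTERNAL_BAM -> filtered_bam
ln -s ../filtered_bam/sample.filtered.bam EXTERNAL_BAM/sample.bam

# Same shape as the command in rule link_bam_bai_external.
# The destination already exists, so plain "ln -s" is expected to fail.
if ln -s ../EXTERNAL_BAM/sample.bam filtered_bam/sample.filtered.bam 2>/dev/null; then
    link_status="created"
else
    link_status="failed"
fi
echo "ln -s onto an existing destination: $link_status"
# Remove "$workdir" by hand when finished.
```

If the link had been created, the two links would point at each other, so either way writing the output into the folder that also holds the input BAM links looks like the thing to avoid.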

I would appreciate it if the *.filtered.seq_depth_norm.bw files could be provided, or any other suggestion that would help me get through the test.
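To see which samples are affected before launching the workflow, a small pre-flight check can help. This is a sketch under two assumptions: the sample sheet is tab-separated with a header row and the sample name in the first column, and the expected file name follows the pattern in the error message above. The function name `missing_bigwigs` is made up for this example and is not part of snakePipes.

```shell
#!/bin/sh
# Print the name of every sample whose expected bigWig file is absent.
# Assumptions: header row present, sample name in column 1, and the
# bamCoverage/<sample>.filtered.seq_depth_norm.bw pattern from the error message.
missing_bigwigs() {
    sheet="$1"      # path to the sample sheet
    workdir="$2"    # working directory passed to -d
    tail -n +2 "$sheet" | cut -f1 | while IFS= read -r sample; do
        [ -n "$sample" ] || continue                     # skip empty lines
        bw="$workdir/bamCoverage/$sample.filtered.seq_depth_norm.bw"
        [ -e "$bw" ] || printf '%s\n' "$sample"
    done
}
```

From the test data directory, it could be called as `missing_bigwigs sampleSheet.tsv .`; no output means every expected file is present.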

NixBio commented 1 year ago

Thank you for your message. I am at a conference until July 28, 2023. I will answer your email once I am back in my office.

In urgent cases, please contact genomics-core(at)rcii.de

Kind Regards, Nicholas Strieder

Dr. rer. nat. Nicholas Strieder

Leibniz-Institut für Immuntherapie - LIT
NGS Core - Bioinformatics
Universitätsklinikum Regensburg
Franz-Josef-Strauß-Allee 11
93053 Regensburg
Germany

Phone: ++49 (0)941 944 18188

katsikora commented 1 year ago

Hi,

Did the DNA-mapping run you performed in that directory before running the ATAC-seq workflow produce any errors?

Second: when you ran ATAC-seq with --fromBAM, did you specify a new output folder, different from the one hosting the BAM files?

Best wishes,

Katarzyna

HamletShaoE commented 1 year ago

Hi, thank you for replying. I downloaded the test data listed on the [snakePipes page](https://snakepipes.readthedocs.io/en/latest/content/setting_up.html) for setting up. The file I downloaded is named "ATACseq.tar.gz" and its md5 is 7c822de19958d69bcebe22b1d48885a3. The archive contains only BAM files in the Bowtie2 folder and some bamPEFragments output in the deepTools_qc folder. The filtered_bam folder contains soft links to these BAM files. For the first run, I simply ran what command.sh asked for. The test data contains no fastq files that a DNA-mapping run could work on.

For the second run, I passed ./ (where I extracted the test data) to -d and filtered_bam (the extracted folder with the soft links to the BAM files) to --fromBAM. So I did specify an output folder with '-d', but it is the parent directory of the BAM folder.

-d, --working-dir

working directory is output directory and must contain DNA-mapping pipeline output files

Do you mean I should change the -d to another folder? OK, I'll test that.

    84 of 84 steps (100%) done
    Complete log: .snakemake/log/2023-08-02T105846.019871.snakemake.log

    ---- This analysis has been done using snakePipes version 2.7.3 ----
    Sample sheet found and header is ok!
    Building DAG of jobs...

It seems this works fine. Thank you.
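For readers who hit the same error: the exact command that succeeded is not shown in this thread, so the following is only a sketch of the idea, using flags that appear in the commands above. It assembles an ATAC-seq call whose `-d` folder differs from both the BAM folder and its parent. The folder name `ATAC_out` and the environment variables `DATA_DIR`, `OUT_DIR` and `RUN` are hypothetical, and whether snakePipes needs the output folder to exist beforehand is an assumption.

```shell
#!/bin/sh
# Sketch only: keep the output folder separate from the folder holding the BAM links.
data_dir="${DATA_DIR:-$PWD}"                # where ATACseq.tar.gz was extracted (assumed)
bam_dir="$data_dir/filtered_bam"            # input BAM links from the test data
out_dir="${OUT_DIR:-$data_dir/ATAC_out}"    # hypothetical fresh output folder

# Naive string comparison; it does not resolve symlinks or trailing slashes.
if [ "$out_dir" = "$bam_dir" ] || [ "$out_dir" = "$data_dir" ]; then
    echo "output folder overlaps the BAM folder; choose another -d" >&2
    layout_ok=0
else
    layout_ok=1
fi

# Store the command in the positional parameters so it can be printed and run unchanged.
set -- ATAC-seq -d "$out_dir" \
    --sampleSheet "$data_dir/sampleSheet.tsv" \
    --fromBAM "$bam_dir" \
    dm6_ensembl_release94
echo "would run: $*"

# Execute only on request (RUN=1) and only when the layout check passed.
if [ "$layout_ok" -eq 1 ] && [ "${RUN:-0}" = 1 ]; then
    mkdir -p "$out_dir"
    "$@"
fi
```

By default the script only prints the command, which makes it easy to check the paths before submitting anything to the cluster.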

katsikora commented 1 year ago

Oh I see. I'll have a look if either the test data or the command.sh script on zenodo should be updated to reflect the latest workflow version. Thanks for clarifying that!

arrowifjn commented 11 months ago

ATAC-seq -d . --sampleSheet sampleSheet.tsv --local --DAG dm6 --fromBAM Bowtie2 --bamExt .bam