maxplanck-ie / snakepipes

Customizable workflows based on snakemake and python for the analysis of NGS data
http://snakepipes.readthedocs.io

ATAC example miss /bamCoverage/*.filtered.seq_depth_norm.bw files #910

Open HamletShaoE opened 1 year ago

HamletShaoE commented 1 year ago

I have installed snakePipes and downloaded the ATAC-seq example data for a test run. However, the example does not run as I expected. I first ran the command suggested in command.sh and got this:

    (snakePipes) shaoyi@login02 ~/DATA/snakePipe/testData/ATAC $ ATAC-seq -d . --sampleSheet sampleSheet.tsv --DAG dm6_ensembl_release94
    Sample sheet found and header is ok!

    ---- This analysis has been done using snakePipes version 2.7.3 ----
    Sample sheet found and header is ok!
    SystemExit in line 33 of /GPFS/zhangli_lab_permanent/shaoyi/software/mambaforge/envs/snakePipes/lib/python3.11/site-packages/snakePipes/workflows/ATAC-seq/internals.snakefile:
    1
      File "/GPFS/zhangli_lab_permanent/shaoyi/software/mambaforge/envs/snakePipes/lib/python3.11/site-packages/snakePipes/workflows/ATAC-seq/Snakefile", line 26, in
      File "/GPFS/zhangli_lab_permanent/shaoyi/software/mambaforge/envs/snakePipes/lib/python3.11/site-packages/snakePipes/workflows/ATAC-seq/internals.snakefile", line 33, in
      File "", line 26, in call
    ERROR: Required file "/GPFS/zhangli_lab_permanent/shaoyi/snakePipe/testData/ATAC/bamCoverage/SRR7013050.filtered.seq_depth_norm.bw" for sample "SRR7013050" specified in configuration file is NOT available.
    Error: snakemake returned an error code of 1, so processing is incomplete!
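To see which samples would trigger this check, one can list the expected bigWig paths and test whether they exist. This is a minimal sketch, not snakePipes code: it assumes the sample sheet is tab-separated with a `name` column, and it takes the `bamCoverage/<sample>.filtered.seq_depth_norm.bw` pattern from the error message above. The function name `missing_bigwigs` is hypothetical.

```python
import csv
from pathlib import Path


def missing_bigwigs(workdir, sample_sheet, name_column="name"):
    """Return the expected bigWig paths that do not exist under workdir.

    Assumption: the sample sheet is a TSV file with a header row and a
    column (default "name") that holds the sample names.
    """
    workdir = Path(workdir)
    with open(sample_sheet, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        samples = [row[name_column] for row in reader]
    # Path pattern copied from the error message in this thread.
    expected = [
        workdir / "bamCoverage" / f"{sample}.filtered.seq_depth_norm.bw"
        for sample in samples
    ]
    return [path for path in expected if not path.exists()]
```

Calling `missing_bigwigs(".", "sampleSheet.tsv")` from the extracted test data folder would return the list of paths to look for.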

I suppose this is because bamCoverage/SRR7013050.filtered.seq_depth_norm.bw is missing. I noticed that the files in the filtered_bam folder are links to files in the Bowtie2 folder, so I ran again with the --fromBAM flag. This ran for a while and the slurm jobs went well, yet it ended with an error in the link_bam_bai_external step. The error message is:

    (snakePipes) shaoyi@login01 ~/DATA/snakePipe/testData/ATAC $ ATAC-seq -d . --sampleSheet sampleSheet.tsv --DAG --fromBAM filtered_bam dm6_ensembl_release94
    ...
    [Thu Jul 27 11:25:09 2023]
    Error in rule link_bam_bai_external:
        jobid: 10
        input: EXTERNAL_BAM/SRR7013046.bam, EXTERNAL_BAM/SRR7013046.bam.bai
        output: filtered_bam/SRR7013046.filtered.bam, filtered_bam/SRR7013046.filtered.bam.bai
        shell:

            ln -s ../EXTERNAL_BAM/SRR7013046.bam filtered_bam/SRR7013046.filtered.bam;
            ln -s ../EXTERNAL_BAM/SRR7013046.bam.bai filtered_bam/SRR7013046.filtered.bam.bai

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 5652530

    Error executing rule link_bam_bai_external on cluster (jobid: 10, external: Submitted batch job 5652530, jobscript: /GPFS/zhangli_lab_permanent/shaoyi/snakePipe/testData/ATAC/.snakemake/tmp.onwy0y99/snakejob.link_bam_bai_external.10.sh). For error details see the cluster log and the log files of the involved rule(s).
    Exiting because a job execution failed. Look above for error message
    Complete log: .snakemake/log/2023-07-27T110334.970475.snakemake.log

!!! ERROR in the ATAC-seq open chromatin workflow! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Error: snakemake returned an error code of 1, so processing is incomplete!

I noticed that the files in EXTERNAL_BAM are links to files in filtered_bam, and this command tries to link the files in filtered_bam back to EXTERNAL_BAM.
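The circular linking described here can be checked before creating a link. The sketch below is a diagnostic under stated assumptions, not part of snakePipes: `link_chain` and `link_problem` are hypothetical helpers that follow existing symlinks one hop at a time and report whether a planned `ln -s` would point back at itself or hit an existing destination. Here `source` is the actual location of the file the new link should point to, not the literal relative argument passed to `ln -s`.

```python
import os


def link_chain(path, limit=40):
    """Follow symlinks one hop at a time and return every path visited."""
    chain = [os.path.abspath(path)]
    while os.path.islink(chain[-1]) and len(chain) <= limit:
        target = os.readlink(chain[-1])
        # A relative target is interpreted relative to the link's directory.
        hop = os.path.join(os.path.dirname(chain[-1]), target)
        chain.append(os.path.normpath(hop))
    return chain


def link_problem(source, dest):
    """Explain why linking dest -> source looks unsafe, or return None."""
    dest = os.path.abspath(dest)
    if dest in link_chain(source):
        return "loop: source already resolves through dest"
    if os.path.lexists(dest):
        return "dest already exists, so ln -s would fail"
    return None
```

For the paths in the log above, `link_problem("EXTERNAL_BAM/SRR7013046.bam", "filtered_bam/SRR7013046.filtered.bam")` would report a loop if EXTERNAL_BAM/SRR7013046.bam is itself a link to that file in filtered_bam.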

I would appreciate it if the *.filtered.seq_depth_norm.bw files could be provided, or any other suggestions that would help me get through the test.

NixBio commented 1 year ago

Thank you for your message. I am at a conference until July 28, 2023. I will answer your email once I am back in my office.

In urgent cases, please contact genomics-core(at)rcii.de

Kind Regards, Nicholas Strieder



katsikora commented 1 year ago

Hi,

Did the DNA-mapping run that preceded the ATAC-seq workflow produce any errors?

Second: when you ran ATAC-seq with --fromBAM, did you specify a new output folder, different from the one hosting the BAM files?

Best wishes,

Katarzyna

HamletShaoE commented 1 year ago

> Hi,
>
> did your DNA-mapping run prior to running the ATAC-seq workflow in it produce any errors?
>
> Second: when you run ATAC-seq with --fromBAM, did you specify a new output folder, other than that hosting the bam files?
>
> Best wishes,
>
> Katarzyna

Hi, thank you for replying. I downloaded the test data listed on the [snakePipes page](https://snakepipes.readthedocs.io/en/latest/content/setting_up.html) for setting up. The file I downloaded is named "ATACseq.tar.gz", and its md5 is 7c822de19958d69bcebe22b1d48885a3. This archive contains only BAM files in the Bowtie2 folder and some bamPEFragments files in the deepTools_qc folder. The filtered_bam folder contains soft links to these BAM files. In the first run, I simply ran the command given in command.sh. The test data contain no FASTQ files for a DNA-mapping run.
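To confirm that a local copy of the archive matches the one described here, the checksum can be recomputed. This is a minimal sketch using Python's standard `hashlib`; the helper name `file_md5` is hypothetical, and the constant is the value quoted in this thread, not an officially published checksum.

```python
import hashlib

# Checksum quoted in this thread (not an official published value).
REPORTED_MD5 = "7c822de19958d69bcebe22b1d48885a3"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `file_md5("ATACseq.tar.gz")` with `REPORTED_MD5` shows whether the download is the same file.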

In the second run, I used -d for ./ (where I extracted the test data) and --fromBAM for filtered_bam (the extracted folder with the soft links to the BAM files). So I did specify the output folder with '-d', but it is the parent directory of the BAM folder.

> -d, --working-dir
> working directory is output directory and must contain DNA-mapping pipeline output files

Do you mean I should change the -d to another folder? OK, I'll test that.

    84 of 84 steps (100%) done
    Complete log: .snakemake/log/2023-08-02T105846.019871.snakemake.log

    ---- This analysis has been done using snakePipes version 2.7.3 ----
    Sample sheet found and header is ok!
    Building DAG of jobs...

It seems this works fine. Thank you.

katsikora commented 1 year ago

Oh, I see. I'll check whether the test data or the command.sh script on Zenodo should be updated to reflect the latest workflow version. Thanks for clarifying that!

arrowifjn commented 1 year ago

ATAC-seq -d . --sampleSheet sampleSheet.tsv --local --DAG dm6 --fromBAM Bowtie2 --bamExt .bam
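Both workarounds in this thread avoid pointing --fromBAM at a folder that the workflow itself uses under the working directory. The sketch below captures that as a pre-flight check. It is an illustration under assumptions, not snakePipes behaviour: `from_bam_args` is a hypothetical helper, the list of colliding folder names is taken only from the failing rule's log above, and only flags that appear in this thread are used.

```python
from pathlib import Path

# Folders the failing rule read from or wrote to, per the log in this thread.
WORKFLOW_DIRS = ("filtered_bam", "EXTERNAL_BAM")


def from_bam_args(workdir, bam_dir, sample_sheet, genome, bam_ext=None):
    """Return an ATAC-seq --fromBAM argument list, or raise on a collision."""
    workdir = Path(workdir).resolve()
    bam_dir = Path(bam_dir).resolve()
    for name in WORKFLOW_DIRS:
        if bam_dir == (workdir / name).resolve():
            raise ValueError(
                f"--fromBAM {bam_dir} collides with a folder the workflow "
                f"uses under -d {workdir}"
            )
    args = ["ATAC-seq", "-d", str(workdir), "--sampleSheet", str(sample_sheet),
            "--fromBAM", str(bam_dir)]
    if bam_ext is not None:
        args += ["--bamExt", bam_ext]
    # The genome is given as a positional argument in the commands above.
    return args + [genome]
```

With `-d .` the check rejects `filtered_bam` as input but accepts `Bowtie2`, which matches the command in the comment above; the returned list could be passed to `subprocess.run` once the paths look right.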