ENCODE-DCC / chip-seq-pipeline2

ENCODE ChIP-seq pipeline
MIT License

Run gives error #266

Closed shirleytemples closed 2 years ago

shirleytemples commented 2 years ago

Describe the bug

I have been able to run the chip-seq pipeline successfully on the example JSON with a variety of parameters. When passing in a custom .bam file, however, I am running into errors that I am not sure how to debug.

Caper configuration file

Locally run

Input JSON file

{
    "chip.title" : "Running all-H3K4me3.bam through ENCODE",
    "chip.description" : "This is an template input JSON for all-POLR2A.",

    "chip.pipeline_type" : "histone",
    "chip.nodup_bams" : ["/mnt/c/Users/shirl/Desktop/guttman_lab/all-H3K4me3.bam"],
    "chip.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v4/hg38_chr19_chrM.tsv",
    "chip.enable_gc_bias" : false,
    "chip.paired_end" : true
}

Troubleshooting result

ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/6b12d05b-3e09-49dd-82c2-0edd3964b413/call-xcor/shard-0/attempt-2/execution/*.cc.plot.png': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/6b12d05b-3e09-49dd-82c2-0edd3964b413/call-xcor/shard-0/attempt-2/execution/*.cc.fraglen.txt': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/6b12d05b-3e09-49dd-82c2-0edd3964b413/call-xcor/shard-0/attempt-2/execution/*.cc.plot.pdf': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/6b12d05b-3e09-49dd-82c2-0edd3964b413/call-xcor/shard-0/attempt-2/execution/*.cc.qc': No such file or directory

leepc12 commented 2 years ago

Did you run with --conda?

shirleytemples commented 2 years ago

Yeah, I ran:

caper run chip.wdl -i /mnt/c/users/shirl/Desktop/guttman_lab/test_nodup.json --conda

leepc12 commented 2 years ago

It's a duplicate issue: https://github.com/ENCODE-DCC/chip-seq-pipeline2/issues/265#issuecomment-1050183228

I will fix this in the next release. Until then please manually install ncurses 5.

$ conda install -n encode-chip-seq-pipeline-spp --no-deps --no-update-deps -y ncurses==5.9 -c conda-forge

shirleytemples commented 2 years ago

Describe the bug

I ran the following command to install ncurses 5.9:

$ conda install -n encode-chip-seq-pipeline-spp --no-deps --no-update-deps -y ncurses==5.9 -c conda-forge

and then re-ran the pipeline on the custom .bam file:

$ caper run chip.wdl -i /mnt/c/users/shirl/Desktop/guttman_lab/test_nodup.json --conda

However, I ran into some other errors that I am not sure how to debug.

Caper configuration file

Locally run

Input JSON file

{
    "chip.title" : "Running all-H3K4me3.bam through ENCODE",
    "chip.description" : "This is an template input JSON for all-POLR2A.",

    "chip.pipeline_type" : "histone",
    "chip.nodup_bams" : ["/mnt/c/Users/shirl/Desktop/guttman_lab/all-H3K4me3.bam"],
    "chip.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v4/hg38_chr19_chrM.tsv",
    "chip.enable_gc_bias" : false,
    "chip.paired_end" : true
}

Troubleshooting result

2022-02-24 16:14:18,897|caper.cromwell_workflow_monitor|INFO| Task: id=3938a029-6122-4060-8dbf-c7d3d042554c, task=chip.read_genome_tsv:-1, retry=0, status=Started, job_id=25222
2022-02-24 16:14:18,903|caper.cromwell_workflow_monitor|INFO| Task: id=3938a029-6122-4060-8dbf-c7d3d042554c, task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode
2022-02-24 16:14:27,285|caper.cromwell_workflow_monitor|INFO| Task: id=3938a029-6122-4060-8dbf-c7d3d042554c, task=chip.read_genome_tsv:-1, retry=0, status=Done
2022-02-24 16:14:35,816|caper.cromwell_workflow_monitor|INFO| Workflow: id=3938a029-6122-4060-8dbf-c7d3d042554c, status=Failed
2022-02-24 16:14:44,856|caper.cromwell_metadata|INFO| Wrote metadata file. /home/shirleytemples/chip-seq-pipeline2/chip/3938a029-6122-4060-8dbf-c7d3d042554c/metadata.json
2022-02-24 16:14:44,857|caper.cromwell|INFO| Workflow failed. Auto-troubleshooting...
* Started troubleshooting workflow: id=3938a029-6122-4060-8dbf-c7d3d042554c, status=Failed
* Found failures JSON object.
[
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "message": "Call input and runtime attributes evaluation failed for bam2ta",
                "causedBy": [
                    {
                        "causedBy": [],
                        "message": "Failed to evaluate input 'samtools_mem_gb' (reason 1 of 1): ValueEvaluator[IdentifierLookup]: No suitable input for 'mem_gb' amongst {runtime_environment, mem_factor, subsample, time_hr, mito_chr_name, disk_factor, cpu, paired_end, bam}"
                    },
                    {
                        "causedBy": [],
                        "message": "Failed to evaluate input 'disk_gb' (reason 1 of 1): ValueEvaluator[IdentifierLookup]: No suitable input for 'input_file_size_gb' amongst {runtime_environment, mem_factor, subsample, time_hr, mito_chr_name, disk_factor, cpu, paired_end, bam}"
                    },
                    {
                        "causedBy": [],
                        "message": "Failed to evaluate input 'mem_gb' (reason 1 of 1): ValueEvaluator[IdentifierLookup]: No suitable input for 'input_file_size_gb' amongst {runtime_environment, mem_factor, subsample, time_hr, mito_chr_name, disk_factor, cpu, paired_end, bam}"
                    },
                    {
                        "causedBy": [],
                        "message": "Failed to evaluate input 'input_file_size_gb' (reason 1 of 1): [Attempted 1 time(s)] - NoSuchFileException: /mnt/c/Users/shirl/Desktop/guttman_lab/bam/all-H3K4me3.bam"
                    }
                ]
            }
        ]
    }
]
* Recursively finding failures in calls (tasks)...

==== NAME=chip.bam2ta, STATUS=Failed, PARENT=
SHARD_IDX=0, RC=None, JOB_ID=None
START=None, END=None
STDOUT=None
STDERR=None
2022-02-24 16:14:44,878|caper.nb_subproc_thread|ERROR| Cromwell failed. returncode=1
2022-02-24 16:14:44,879|caper.cli|ERROR| Check stdout in /home/shirleytemples/chip-seq-pipeline2/cromwell.out.49

Thank you so much for your help, any suggestions are much appreciated!

leepc12 commented 2 years ago

This looks like a version mismatch between the pipeline and its Conda environment. Please run head chip.wdl to find the version of the pipeline, and make sure that you have the latest pipeline and a Conda environment installed with the installer script (scripts/install_conda_env.sh) included in the latest pipeline.
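
Separately, the failures JSON above ends with a NoSuchFileException for /mnt/c/Users/shirl/Desktop/guttman_lab/bam/all-H3K4me3.bam. That path has an extra bam/ directory compared with the chip.nodup_bams entry in the pasted input JSON, so the JSON that was actually run may differ from the one shown. A minimal Python sketch for catching this kind of problem before launching is given here: it lists absolute local paths in an input JSON that do not exist on disk. The helper names (iter_strings, missing_local_paths) are hypothetical and not part of the pipeline or Caper, and treating every string that starts with "/" as a local path is an assumption.

```python
import json
import os


def iter_strings(value):
    """Yield every string found in a nested JSON value (dicts, lists, scalars)."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)


def missing_local_paths(input_json_path):
    """Return absolute local paths named in the input JSON that do not exist."""
    with open(input_json_path) as fh:
        inputs = json.load(fh)
    missing = []
    for text in iter_strings(inputs):
        # Assumption: only strings starting with "/" are local paths.
        # URLs (https://...) and free text such as titles are skipped.
        if text.startswith("/") and not os.path.exists(text):
            missing.append(text)
    return missing
```

Calling missing_local_paths("/mnt/c/users/shirl/Desktop/guttman_lab/test_nodup.json") would return an empty list when every local file named in the JSON is present, and the offending paths otherwise.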

shirleytemples commented 2 years ago

I upgraded to the latest pipeline (released 8 hours ago) and reinstalled the environment, but I am still running into the same error as before. Any thoughts on what I can do to get the pipeline running from the .bam file? For some reason it is failing at the xcor step.

` Wrote metadata file. /home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/metadata.json
2022-03-01 00:47:22,710|caper.cromwell|INFO| Workflow failed. Auto-troubleshooting...
* Started troubleshooting workflow: id=4ed3c212-f225-4916-928a-8f87aa9e8e67, status=Failed
* Found failures JSON object.
[
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "causedBy": [],
                "message": "Job chip.spr:0:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            },
            {
                "causedBy": [],
                "message": "Job chip.xcor:0:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            }
        ]
    }
]
* Recursively finding failures in calls (tasks)...

==== NAME=chip.spr, STATUS=RetryableFailure, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=3176
START=2022-03-01T08:45:50.749Z, END=2022-03-01T08:46:01.557Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-spr/shard-0/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-spr/shard-0/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=3230, PGID=3230, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=3230, PGID=3230, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

==== NAME=chip.spr, STATUS=Failed, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=3348
START=2022-03-01T08:46:02.751Z, END=2022-03-01T08:46:10.498Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-spr/shard-0/attempt-2/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-spr/shard-0/attempt-2/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=3424, PGID=3424, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=3424, PGID=3424, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

==== NAME=chip.xcor, STATUS=RetryableFailure, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=3241
START=2022-03-01T08:45:54.743Z, END=2022-03-01T08:46:06.536Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 156, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=3299, PGID=3299, RC=1, DURATION_SEC=2.0
STDERR=Loading required package: caTools
Warning message:
In max(ccl.av$x[ccl.av$y >= th]) :
  no non-missing arguments to max; returning -Inf
Error in if ((crosscorr$rel.phantom.coeff >= 0) & (crosscorr$rel.phantom.coeff <  :
  argument is of length zero
Execution halted
STDOUT=################
ChIP data: all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 101
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: all-H3K4me3.no_chrM.R1.15M.cc.plot.pdf
result filename: all-H3K4me3.no_chrM.R1.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
opened /home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/tmp.1fc425cd/RtmpfQN7BP/all-H3K4me3.no_chrM.R1.15M.tagAlignce55b19f03a
done. read 1 fragments
ChIP data read length 91
[1] TRUE
Calculating peak characteristics
Minimum cross-correlation value NaN
Minimum cross-correlation shift 1500
Top 3 cross-correlation values NA
Top 3 estimates for fragment length NA
Window half size NA
Phantom peak location
Phantom peak Correlation
Normalized Strand cross-correlation coefficient (NSC) NA
Relative Strand cross-correlation Coefficient (RSC)

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 156, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=3299, PGID=3299, RC=1, DURATION_SEC=2.0
STDERR=Loading required package: caTools
Warning message:
In max(ccl.av$x[ccl.av$y >= th]) :
  no non-missing arguments to max; returning -Inf
Error in if ((crosscorr$rel.phantom.coeff >= 0) & (crosscorr$rel.phantom.coeff <  :
  argument is of length zero
Execution halted
STDOUT=################
ChIP data: all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 101
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: all-H3K4me3.no_chrM.R1.15M.cc.plot.pdf
result filename: all-H3K4me3.no_chrM.R1.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
opened /home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/tmp.1fc425cd/RtmpfQN7BP/all-H3K4me3.no_chrM.R1.15M.tagAlignce55b19f03a
done. read 1 fragments
ChIP data read length 91
[1] TRUE
Calculating peak characteristics
Minimum cross-correlation value NaN
Minimum cross-correlation shift 1500
Top 3 cross-correlation values NA
Top 3 estimates for fragment length NA
Window half size NA
Phantom peak location
Phantom peak Correlation
Normalized Strand cross-correlation coefficient (NSC) NA
Relative Strand cross-correlation Coefficient (RSC)
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/execution/*.cc.plot.png': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/execution/*.cc.fraglen.txt': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/execution/*.cc.plot.pdf': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-xcor/shard-0/execution/*.cc.qc': No such file or directory
`
leepc12 commented 2 years ago

In your input JSON, I found hg38_chr19_chrM.tsv. That is chr19-only reference data intended for testing purposes only. Can you try the full-genome reference data instead? https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v4/hg38.tsv

{
    "chip.title" : "Running all-H3K4me3.bam through ENCODE",
    "chip.description" : "This is an template input JSON for all-POLR2A.",

    "chip.pipeline_type" : "histone",
    "chip.nodup_bams" : ["/mnt/c/Users/shirl/Desktop/guttman_lab/all-H3K4me3.bam"],
    "chip.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v4/hg38_chr19_chrM.tsv",
    "chip.enable_gc_bias" : false,
    "chip.paired_end" : true
}
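
One way to apply the suggested change without hand-editing the file is sketched below in Python. It copies an input JSON and sets chip.genome_tsv to the full hg38 TSV URL given above. The function names and the output file name are hypothetical and are not part of the pipeline.

```python
import json

# Full-genome TSV URL quoted from the comment above.
FULL_HG38_TSV = (
    "https://storage.googleapis.com/encode-pipeline-genome-data/"
    "genome_tsv/v4/hg38.tsv"
)


def with_full_genome(inputs, genome_tsv=FULL_HG38_TSV):
    """Return a copy of the input dict with chip.genome_tsv replaced."""
    updated = dict(inputs)
    updated["chip.genome_tsv"] = genome_tsv
    return updated


def rewrite_input_json(src_path, dst_path, genome_tsv=FULL_HG38_TSV):
    """Read src_path, swap the genome TSV, and write the result to dst_path."""
    with open(src_path) as fh:
        inputs = json.load(fh)
    with open(dst_path, "w") as fh:
        json.dump(with_full_genome(inputs, genome_tsv), fh, indent=4)
        fh.write("\n")
```

For example, rewrite_input_json("test_nodup.json", "test_nodup.hg38.json") leaves the original file untouched, and the new file can then be passed to caper run chip.wdl -i ... --conda as before.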

Also, please run this for debugging.

$ ls -l /home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-spr/shard-0/attempt-2/execution

shirleytemples commented 2 years ago

Hi @leepc12, thanks so much for your suggestions! I am still running into the same error (see below), even after changing my reference genome to the full chromosome data in the input JSON file.

Here is the output of the debugging command:

(base) shirleytemples@DESKTOP-9R1GDM5:~/chip-seq-pipeline2$ ls -l /home/shirleytemples/chip-seq-pipeline2/chip/4ed3c212-f225-4916-928a-8f87aa9e8e67/call-spr/shard-0/attempt-2/execution
total 32
-rw-r--r-- 1 shirleytemples shirleytemples   66 Mar  1 00:46 all-H3K4me3.00
-rw-r--r-- 2 shirleytemples shirleytemples   51 Mar  1 00:46 all-H3K4me3.pr1.tagAlign.gz
-rw-r--r-- 2 shirleytemples shirleytemples   20 Mar  1 00:46 all-H3K4me3.pr2.tagAlign.gz
-rw-r--r-- 1 shirleytemples shirleytemples    5 Mar  1 00:46 docker_cid
-rw-r--r-- 1 shirleytemples shirleytemples    0 Mar  1 00:46 docker_cid.not_docker
drwxr-xr-x 1 shirleytemples shirleytemples 4096 Mar  1 00:46 glob-478c0ad30d0d033ce59a75ef84dab32e
-rw-r--r-- 1 shirleytemples shirleytemples   28 Mar  1 00:46 glob-478c0ad30d0d033ce59a75ef84dab32e.list
drwxr-xr-x 1 shirleytemples shirleytemples 4096 Mar  1 00:46 glob-a7cc663e5a8a49cc3d9bc036f4370f1c
-rw-r--r-- 1 shirleytemples shirleytemples   28 Mar  1 00:46 glob-a7cc663e5a8a49cc3d9bc036f4370f1c.list
-rw-r--r-- 1 shirleytemples shirleytemples    2 Mar  1 00:46 rc
-rw-r--r-- 1 shirleytemples shirleytemples 6510 Mar  1 00:46 script
-rw-r--r-- 1 shirleytemples shirleytemples  375 Mar  1 00:46 script.background
-rw-r--r-- 1 shirleytemples shirleytemples 5046 Mar  1 00:46 script.submit
-rw-r--r-- 1 shirleytemples shirleytemples  747 Mar  1 00:46 stderr
-rw-r--r-- 1 shirleytemples shirleytemples  748 Mar  1 00:46 stderr.background
-rw-r--r-- 1 shirleytemples shirleytemples 2300 Mar  1 00:46 stdout
-rw-r--r-- 1 shirleytemples shirleytemples 2306 Mar  1 00:46 stdout.background
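
In the listing above the pseudoreplicate files are only 51 and 20 bytes, and the earlier xcor log reports "done. read 1 fragments", which suggests that the tagAlign derived from the .bam contains almost no reads. A minimal Python sketch for checking this directly follows. It assumes that a tagAlign.gz file is gzip-compressed text with one read per line; count_tagalign_records is a hypothetical helper, not part of the pipeline.

```python
import gzip


def count_tagalign_records(path):
    """Count non-empty lines in a gzip-compressed tagAlign file."""
    count = 0
    # "rt" opens the gzip stream in text mode so lines are str, not bytes.
    with gzip.open(path, "rt") as fh:
        for line in fh:
            if line.strip():
                count += 1
    return count
```

Running it on all-H3K4me3.pr1.tagAlign.gz in the execution directory listed above would show how many reads reached the spr step. A count near zero would point at the input .bam or its conversion to tagAlign rather than at xcor itself.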

ERROR MESSAGE:

2022-03-01 23:23:42,099|caper.cromwell_workflow_monitor|INFO| Task: id=8c5a7ef0-344f-481a-929f-f6ad5c4fc22d, task=chip.jsd:-1, retry=0, status=Done
2022-03-01 23:23:42,924|caper.cromwell_workflow_monitor|INFO| Workflow: id=8c5a7ef0-344f-481a-929f-f6ad5c4fc22d, status=Failed
2022-03-01 23:24:55,006|caper.cromwell_metadata|INFO| Wrote metadata file. /home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/metadata.json
2022-03-01 23:24:55,007|caper.cromwell|INFO| Workflow failed. Auto-troubleshooting...
* Started troubleshooting workflow: id=8c5a7ef0-344f-481a-929f-f6ad5c4fc22d, status=Failed
* Found failures JSON object.
[
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "causedBy": [],
                "message": "Job chip.spr:0:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            },
            {
                "causedBy": [],
                "message": "Job chip.xcor:0:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            }
        ]
    }
]
* Recursively finding failures in calls (tasks)...

==== NAME=chip.spr, STATUS=RetryableFailure, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=4094
START=2022-03-02T07:23:06.741Z, END=2022-03-02T07:23:18.445Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-spr/shard-0/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-spr/shard-0/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4163, PGID=4163, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4163, PGID=4163, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

==== NAME=chip.spr, STATUS=Failed, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=4289
START=2022-03-02T07:23:20.729Z, END=2022-03-02T07:23:28.383Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-spr/shard-0/attempt-2/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-spr/shard-0/attempt-2/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4343, PGID=4343, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 176, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 158, in main
    args.ta, args.pseudoreplication_random_seed, args.out_dir,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spr.py", line 141, in spr_pe
    ta_pr2=ta_pr2,
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4343, PGID=4343, RC=1, DURATION_SEC=0.0
STDERR=gzip: all-H3K4me3.01.gz: No such file or directory
STDOUT=

==== NAME=chip.xcor, STATUS=RetryableFailure, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=4132
START=2022-03-02T07:23:08.737Z, END=2022-03-02T07:23:23.414Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 156, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4218, PGID=4218, RC=1, DURATION_SEC=2.2
STDERR=Loading required package: caTools
Warning message:
In max(ccl.av$x[ccl.av$y >= th]) :
  no non-missing arguments to max; returning -Inf
Error in if ((crosscorr$rel.phantom.coeff >= 0) & (crosscorr$rel.phantom.coeff <  :
  argument is of length zero
Execution halted
STDOUT=################
ChIP data: all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 101
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: all-H3K4me3.no_chrM.R1.15M.cc.plot.pdf
result filename: all-H3K4me3.no_chrM.R1.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
opened /home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/tmp.ae30a8d2/RtmpXjdfFg/all-H3K4me3.no_chrM.R1.15M.tagAlign107c3bea8c01
done. read 1 fragments
ChIP data read length 91
[1] TRUE
Calculating peak characteristics
Minimum cross-correlation value NaN
Minimum cross-correlation shift 1500
Top 3 cross-correlation values NA
Top 3 estimates for fragment length NA
Window half size NA
Phantom peak location
Phantom peak Correlation
Normalized Strand cross-correlation coefficient (NSC) NA
Relative Strand cross-correlation Coefficient (RSC)

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 156, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4218, PGID=4218, RC=1, DURATION_SEC=2.2
STDERR=Loading required package: caTools
Warning message:
In max(ccl.av$x[ccl.av$y >= th]) :
  no non-missing arguments to max; returning -Inf
Error in if ((crosscorr$rel.phantom.coeff >= 0) & (crosscorr$rel.phantom.coeff <  :
  argument is of length zero
Execution halted
STDOUT=################
ChIP data: all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 101
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: all-H3K4me3.no_chrM.R1.15M.cc.plot.pdf
result filename: all-H3K4me3.no_chrM.R1.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
opened /home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/tmp.ae30a8d2/RtmpXjdfFg/all-H3K4me3.no_chrM.R1.15M.tagAlign107c3bea8c01
done. read 1 fragments
ChIP data read length 91
[1] TRUE
Calculating peak characteristics
Minimum cross-correlation value NaN
Minimum cross-correlation shift 1500
Top 3 cross-correlation values NA
Top 3 estimates for fragment length NA
Window half size NA
Phantom peak location
Phantom peak Correlation
Normalized Strand cross-correlation coefficient (NSC) NA
Relative Strand cross-correlation Coefficient (RSC)
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/execution/*.cc.plot.png': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/execution/*.cc.fraglen.txt': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/execution/*.cc.plot.pdf': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/execution/*.cc.qc': No such file or directory

==== NAME=chip.xcor, STATUS=Failed, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=4358
START=2022-03-02T07:23:24.730Z, END=2022-03-02T07:23:34.358Z
STDOUT=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/execution/stdout
STDERR=/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 156, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4416, PGID=4416, RC=1, DURATION_SEC=1.5
STDERR=Loading required package: caTools
Warning message:
In max(ccl.av$x[ccl.av$y >= th]) :
  no non-missing arguments to max; returning -Inf
Error in if ((crosscorr$rel.phantom.coeff >= 0) & (crosscorr$rel.phantom.coeff <  :
  argument is of length zero
Execution halted
STDOUT=################
ChIP data: all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 101
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: all-H3K4me3.no_chrM.R1.15M.cc.plot.pdf
result filename: all-H3K4me3.no_chrM.R1.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
opened /home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/tmp.31afd7dc/RtmpGdQ9Hg/all-H3K4me3.no_chrM.R1.15M.tagAlign11423c791871
done. read 1 fragments
ChIP data read length 91
[1] TRUE
Calculating peak characteristics
Minimum cross-correlation value NaN
Minimum cross-correlation shift 1500
Top 3 cross-correlation values NA
Top 3 estimates for fragment length NA
Window half size NA
Phantom peak location
Phantom peak Correlation
Normalized Strand cross-correlation coefficient (NSC) NA
Relative Strand cross-correlation Coefficient (RSC)

STDERR_BACKGROUND_CONTENTS=
Traceback (most recent call last):
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 156, in <module>
    main()
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/home/shirleytemples/miniconda3/envs/encode-chip-seq-pipeline-spp/bin/encode_lib_common.py", line 359, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=4416, PGID=4416, RC=1, DURATION_SEC=1.5
STDERR=Loading required package: caTools
Warning message:
In max(ccl.av$x[ccl.av$y >= th]) :
  no non-missing arguments to max; returning -Inf
Error in if ((crosscorr$rel.phantom.coeff >= 0) & (crosscorr$rel.phantom.coeff <  :
  argument is of length zero
Execution halted
STDOUT=################
ChIP data: all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 101
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: all-H3K4me3.no_chrM.R1.15M.cc.plot.pdf
result filename: all-H3K4me3.no_chrM.R1.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file all-H3K4me3.no_chrM.R1.15M.tagAlign.gz
opened /home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/tmp.31afd7dc/RtmpGdQ9Hg/all-H3K4me3.no_chrM.R1.15M.tagAlign11423c791871
done. read 1 fragments
ChIP data read length 91
[1] TRUE
Calculating peak characteristics
Minimum cross-correlation value NaN
Minimum cross-correlation shift 1500
Top 3 cross-correlation values NA
Top 3 estimates for fragment length NA
Window half size NA
Phantom peak location
Phantom peak Correlation
Normalized Strand cross-correlation coefficient (NSC) NA
Relative Strand cross-correlation Coefficient (RSC)
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/execution/*.cc.plot.png': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/execution/*.cc.fraglen.txt': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/execution/*.cc.plot.pdf': No such file or directory
ln: failed to access '/home/shirleytemples/chip-seq-pipeline2/chip/8c5a7ef0-344f-481a-929f-f6ad5c4fc22d/call-xcor/shard-0/attempt-2/execution/*.cc.qc': No such file or directory

2022-03-01 23:24:55,422|caper.nb_subproc_thread|ERROR| Cromwell failed. returncode=1
2022-03-01 23:24:55,423|caper.cli|ERROR| Check stdout in /home/shirleytemples/chip-seq-pipeline2/cromwell.out.55`
leepc12 commented 2 years ago

Can you run samtools flagstat on your BAM file? It looks like your BAM is corrupted or low quality. I found that the TAGALIGN BED file converted from your BAM is empty (the gzipped file below is only 20 bytes):

-rw-r--r-- 2 shirleytemples shirleytemples   20 Mar  1 00:46 all-H3K4me3.pr2.tagAlign.gz
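
A quick way to confirm that a tagAlign file is effectively empty, rather than inferring it from the file size, is to count its records. This is only a minimal sketch: `count_tagalign_records` is a hypothetical helper and not part of the pipeline, and it assumes the file is a gzip-compressed text file with one record per line.

```python
import gzip


def count_tagalign_records(path):
    """Count non-blank lines in a gzip-compressed tagAlign (BED-like) file.

    Hypothetical helper for illustration; not part of chip-seq-pipeline2.
    """
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.strip())
```

Calling it on a file such as all-H3K4me3.pr2.tagAlign.gz and getting 0 or 1 would be consistent with the "done. read 1 fragments" line in the log above, which suggests almost nothing survived the BAM to tagAlign conversion.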
shirleytemples commented 2 years ago

Yep, I ran samtools flagstat on my BAM file, and it turns out to be single-ended:

36234385 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
36234385 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
leepc12 commented 2 years ago

Your input JSON says

"chip.paired_end" : true

Please change it to false and try again.

shirleytemples commented 2 years ago

Good catch! After changing it to "chip.paired_end" : false, I was able to run the pipeline successfully. Thank you again for all your help!
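
For anyone hitting the same error, the check that resolved this thread can be sketched in a few lines: read the "paired in sequencing" count from samtools flagstat output and compare it with the "chip.paired_end" value in the input JSON. The function names below are hypothetical and not part of the pipeline or of samtools; the sketch assumes the plain-text flagstat layout pasted earlier in this thread.

```python
import re

# Excerpt of the flagstat output posted in this thread.
FLAGSTAT_TEXT = """\
36234385 + 0 in total (QC-passed reads + QC-failed reads)
36234385 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 properly paired (N/A : N/A)
"""


def paired_reads_from_flagstat(flagstat_text):
    """Return the QC-passed read count from the 'paired in sequencing' line."""
    for line in flagstat_text.splitlines():
        match = re.match(r"\s*(\d+) \+ \d+ paired in sequencing", line)
        if match:
            return int(match.group(1))
    raise ValueError("no 'paired in sequencing' line found")


def paired_end_setting_matches(flagstat_text, paired_end):
    """True when the JSON's paired_end flag agrees with the BAM contents."""
    bam_is_paired = paired_reads_from_flagstat(flagstat_text) > 0
    return bam_is_paired == paired_end


# The original input JSON had "chip.paired_end" : true for this BAM.
print(paired_end_setting_matches(FLAGSTAT_TEXT, paired_end=True))  # prints False
```

A result of False here means the JSON and the BAM disagree, which matches the mismatch that caused the empty tagAlign file and the failed xcor task above.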