ENCODE-DCC / chip-seq-pipeline2

ENCODE ChIP-seq pipeline
MIT License

Pipeline hangs on: "undefined symbol: JLI_InitArgProcessing" and "task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode" #225

Open neekonsu opened 3 years ago

neekonsu commented 3 years ago

Describe the bug

When running the pipeline on SCG, it consistently hangs on the line that reads "task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode". I have changed the JDK version to different suggested versions and tried running the pipeline with and without an active server, and I cannot overcome this error.

OS/Platform

Caper configuration file

Paste contents of ~/.caper/default.conf.


backend=slurm
slurm-account=default
# Hashing strategy for call-caching (3 choices)
# This parameter is for local (local/slurm/sge/pbs) backend only.
# This is important for call-caching,
# which means re-using outputs from previous/failed workflows.
# Cache will miss if different strategy is used.
# "file" method has been default for all old versions of Caper<1.0.
# "path+modtime" is a new default for Caper>=1.0,
#   file: use md5sum hash (slow).
#   path: use path.
#   path+modtime: use path and modification time.
local-hash-strat=path+modtime
# Local directory for localized files and Cromwell's intermediate files
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper store all localized data files
# on this directory (e.g. input FASTQs defined as URLs in input JSON).
local-loc-dir=/labs/mpsnyder/neekonsu/2021/workspace/abc/CAPER
cromwell=/home/neekonsu/.caper/cromwell_jar/cromwell-52.jar
womtool=/home/neekonsu/.caper/womtool_jar/womtool-52.jar

Input JSON file

Paste contents of your input JSON file.

{
    "chip.title" : "Paired End Preprocessing",
    "chip.description" : "Pipeline based on template JSON",
    "chip.pipeline_type" : "histone",
    "chip.aligner" : "bowtie2",
    "chip.align_only" : false,
    "chip.true_rep_only" : false,
    "chip.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v3/hg38.tsv",
    "chip.paired_end" : true,
    "chip.ctl_paired_end" : false,
    "chip.fastqs_rep1_R1" : [ "/labs/mpsnyder/neekonsu/2021/workspace/abc/SRR9736862.1.fastq" ],
    "chip.fastqs_rep1_R2" : [ "/labs/mpsnyder/neekonsu/2021/workspace/abc/SRR9736862.2.fastq" ],
    "chip.fastqs_rep2_R1" : [ "/labs/mpsnyder/neekonsu/2021/workspace/abc/SRR9736863.1.fastq" ],
    "chip.fastqs_rep2_R2" : [ "/labs/mpsnyder/neekonsu/2021/workspace/abc/SRR9736863.2.fastq" ]
}

Troubleshooting result

If you ran caper run without Caper server then Caper automatically runs a troubleshooter for failed workflows. Find troubleshooting result in the bottom of Caper's screen log.

If you ran caper submit with a running Caper server then first find your workflow ID (1st column) with caper list and run caper debug [WORKFLOW_ID].

Paste troubleshooting result.

Pipeline Output usual error

2021-03-07 16:23:29,791|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
2021-03-07 16:23:33,845|caper.cromwell|INFO| Womtool validation passed.
2021-03-07 16:23:33,846|caper.caper_runner|INFO| launching run: wdl=/labs/mpsnyder/neekonsu/2021/workspace/abc/chip-seq-pipeline2/chip.wdl, inputs=/labs/mpsnyder/neekonsu/2021/workspace/abc/CAPER/oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/input.local.json, backend_conf=/labs/mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/20210307_162323_253155/backend.conf
2021-03-07 16:23:46,169|caper.cromwell_workflow_monitor|INFO| Workflow: id=44992c81-836e-4fab-8ce1-dd5a9f5801ec, status=Submitted
2021-03-07 16:23:46,232|caper.cromwell_workflow_monitor|INFO| Workflow: id=44992c81-836e-4fab-8ce1-dd5a9f5801ec, status=Running
2021-03-07 16:23:55,760|caper.cromwell_workflow_monitor|INFO| Task: id=44992c81-836e-4fab-8ce1-dd5a9f5801ec, task=chip.read_genome_tsv:-1, retry=0, status=Started, job_id=10452
2021-03-07 16:23:55,772|caper.cromwell_workflow_monitor|INFO| Task: id=44992c81-836e-4fab-8ce1-dd5a9f5801ec, task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode

Pipeline Output java error

2021-03-07 16:21:08,258|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
2021-03-07 16:21:08,282|caper.cromwell|ERROR| RC=127
STDERR=java: symbol lookup error: java: undefined symbol: JLI_InitArgProcessing

Womtool validation failed.
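
A symbol lookup error of this kind can occur when the `java` launcher found on `PATH` and the Java shared libraries found through `LD_LIBRARY_PATH` come from different installations. That is only one possible cause and is not confirmed for this report. The sketch below lists library-path entries that look Java-related so they can be compared with the `java` binary in use. The helper name `java_lib_entries` is hypothetical and the match pattern is a heuristic.

```shell
# List entries of a colon-separated library path that look Java-related.
# The pattern (java|jdk|jre) is a heuristic assumption, not an exhaustive test.
java_lib_entries() {
    # $1: a colon-separated search path, e.g. "$LD_LIBRARY_PATH"
    printf '%s\n' "$1" | tr ':' '\n' | grep -i -E 'java|jdk|jre' || true
}

# Show which java the shell resolves, then any Java-looking library directories.
command -v java || echo "java not found on PATH"
java_lib_entries "${LD_LIBRARY_PATH:-}"
```

If the directories printed belong to a different Java installation than the resolved `java` binary, running with only one of the two (the module or the conda package) loaded would be a reasonable thing to try.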

Cromwell log

2021-03-07 15:58:52,392 cromwell-system-akka.dispatchers.backend-dispatcher-95 WARN  - slurm [UUID(116d9c9e)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-03-07 15:58:54,133 cromwell-system-akka.dispatchers.engine-dispatcher-48 INFO  - Not triggering log of token queue status. Effective log interval = None
2021-03-07 15:58:56,738 cromwell-system-akka.dispatchers.engine-dispatcher-119 INFO  - WorkflowExecutionActor-116d9c9e-2d0e-48ec-8d91-e017729e6bb7 [UUID(116d9c9e)]: Starting chip.read_genome_tsv
2021-03-07 15:58:57,142 cromwell-system-akka.dispatchers.engine-dispatcher-97 INFO  - Assigned new job execution tokens to the following groups: 116d9c9e: 1
2021-03-07 15:58:57,286 cromwell-system-akka.dispatchers.engine-dispatcher-97 INFO  - 116d9c9e-2d0e-48ec-8d91-e017729e6bb7-EngineJobExecutionActor-chip.read_genome_tsv:NA:1 [UUID(116d9c9e)]: Could not copy a suitable cache hit for 116d9c9e:chip.read_genome_tsv:-1:1. No copy attempts were made.
2021-03-07 15:58:57,313 cromwell-system-akka.dispatchers.backend-dispatcher-189 WARN  - BackgroundConfigAsyncJobExecutionActor [UUID(116d9c9e)chip.read_genome_tsv:NA:1]: Unrecognized runtime attribute keys: disks
2021-03-07 15:58:58,703 cromwell-system-akka.dispatchers.backend-dispatcher-189 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(116d9c9e)chip.read_genome_tsv:NA:1]: `echo "$(basename /oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER2/chip/116d9c9e-2d0e-48ec-8d91-e017729e6bb7/call-read_genome_tsv/inputs/-37529440/hg38.local.tsv)" > genome_name

leepc12 commented 3 years ago

What is the Java version on your system? Can you reinstall Java and try again?

undefined symbol: JLI_InitArgProcessing

neekonsu commented 3 years ago

@leepc12 Please see the Java version below. I reinstalled Java within the pipeline's environment and loaded the 8u112 version in the SLURM job.

(encode-chip-seq-pipeline) [neekonsu@dper7425-srcf-d10-37 abc]$ java -version
openjdk version "1.8.0_112"
OpenJDK Runtime Environment (Zulu 8.19.0.1-linux64) (build 1.8.0_112-b16)
OpenJDK 64-Bit Server VM (Zulu 8.19.0.1-linux64) (build 25.112-b16, mixed mode)

I am checking the Java version from within the conda environment. I am not certain which installation the pipeline uses, but I am assuming it only interacts with programs inside that environment.
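
To see which installation a shell would actually run, it can help to list every `java` executable on `PATH` in lookup order, since the first hit wins. A minimal sketch follows; the function name `list_java_candidates` is made up for illustration.

```shell
# Print every executable named "java" on a colon-separated search path,
# in the order the shell would look them up.
list_java_candidates() {
    # $1: search path to scan; defaults to the current PATH
    search_path="${1:-$PATH}"
    old_ifs="$IFS"
    IFS=':'
    for dir in $search_path; do
        # Skip directories named "java"; keep only executable files.
        if [ -x "$dir/java" ] && [ ! -d "$dir/java" ]; then
            printf '%s\n' "$dir/java"
        fi
    done
    IFS="$old_ifs"
}

list_java_candidates
```

Because the Cromwell log in this thread shows `sbatch` being called with `--export=ALL`, submitted tasks would normally inherit the environment of the shell that launched `caper run`, so running the check in that same shell is the most informative.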

The output is unchanged for the pipeline:

2021-04-10 09:45:56,808|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
2021-04-10 09:46:02,208|caper.cromwell|INFO| Womtool validation passed.
2021-04-10 09:46:02,209|caper.caper_runner|INFO| launching run: wdl=/labs/mpsnyder/neekonsu/2021/workspace/abc/chip-seq-pipeline2/chip.wdl, inputs=/labs/mpsnyder/neekonsu/2021/workspace/abc/CAPER/oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/input.local.json, backend_conf=/labs/mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/20210410_094548_158947/backend.conf
2021-04-10 09:46:14,767|caper.cromwell_workflow_monitor|INFO| Workflow: id=adc68428-2a88-4f1a-b5c2-7ab9d3c26af1, status=Submitted
2021-04-10 09:46:14,823|caper.cromwell_workflow_monitor|INFO| Workflow: id=adc68428-2a88-4f1a-b5c2-7ab9d3c26af1, status=Running
2021-04-10 09:46:24,311|caper.cromwell_workflow_monitor|INFO| Task: id=adc68428-2a88-4f1a-b5c2-7ab9d3c26af1, task=chip.read_genome_tsv:-1, retry=0, status=Started, job_id=13441
2021-04-10 09:46:24,321|caper.cromwell_workflow_monitor|INFO| Task: id=adc68428-2a88-4f1a-b5c2-7ab9d3c26af1, task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode

leepc12 commented 3 years ago

Please post your cromwell.out. It's in the working directory where you ran the command.

neekonsu commented 3 years ago

@leepc12

2021-04-10 09:46:04,687  INFO  - Running with database db.url = jdbc:hsqldb:mem:54710032-58a6-4278-a482-3db9604649aa;shutdown=false;hsqldb.tx=mvcc
2021-04-10 09:46:13,341  INFO  - Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
2021-04-10 09:46:13,360  INFO  - [RenameWorkflowOptionsInMetadata] 100%
2021-04-10 09:46:13,515  INFO  - Running with database db.url = jdbc:hsqldb:mem:dd707d38-56cd-40eb-a2e5-05e3f48c0c88;shutdown=false;hsqldb.tx=mvcc
2021-04-10 09:46:13,973  INFO  - Slf4jLogger started
2021-04-10 09:46:14,176 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-40973d6",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "failureShutdownDuration" : "5 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
2021-04-10 09:46:14,236 cromwell-system-akka.dispatchers.service-dispatcher-9 INFO  - Metadata summary refreshing every 1 second.
2021-04-10 09:46:14,259  WARN  - 'docker.hash-lookup.gcr-api-queries-per-100-seconds' is being deprecated, use 'docker.hash-lookup.gcr.throttle' instead (see reference.conf)
2021-04-10 09:46:14,275 cromwell-system-akka.actor.default-dispatcher-23 INFO  - KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
2021-04-10 09:46:14,276 cromwell-system-akka.dispatchers.engine-dispatcher-52 INFO  - CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
2021-04-10 09:46:14,277 cromwell-system-akka.dispatchers.service-dispatcher-8 INFO  - WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
2021-04-10 09:46:14,677 cromwell-system-akka.dispatchers.engine-dispatcher-52 INFO  - JobExecutionTokenDispenser - Distribution rate: 1 per 2 seconds.
2021-04-10 09:46:14,705 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - SingleWorkflowRunnerActor: Version 52
2021-04-10 09:46:14,712 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - SingleWorkflowRunnerActor: Submitting workflow
2021-04-10 09:46:14,767 cromwell-system-akka.dispatchers.api-dispatcher-54 INFO  - Unspecified type (Unspecified version) workflow adc68428-2a88-4f1a-b5c2-7ab9d3c26af1 submitted
2021-04-10 09:46:14,791 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - SingleWorkflowRunnerActor: Workflow submitted UUID(adc68428-2a88-4f1a-b5c2-7ab9d3c26af1)
2021-04-10 09:46:14,797 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - 1 new workflows fetched by cromid-40973d6: adc68428-2a88-4f1a-b5c2-7ab9d3c26af1
2021-04-10 09:46:14,815 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - WorkflowManagerActor Starting workflow UUID(adc68428-2a88-4f1a-b5c2-7ab9d3c26af1)
2021-04-10 09:46:14,822 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - WorkflowManagerActor Successfully started WorkflowActor-adc68428-2a88-4f1a-b5c2-7ab9d3c26af1
2021-04-10 09:46:14,823 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - Retrieved 1 workflows from the WorkflowStoreActor
2021-04-10 09:46:14,849 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
2021-04-10 09:46:14,987 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - MaterializeWorkflowDescriptorActor [UUID(adc68428)]: Parsing workflow as WDL 1.0
2021-04-10 09:46:17,728 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - MaterializeWorkflowDescriptorActor [UUID(adc68428)]: Call-to-Backend assignments: chip.read_genome_tsv -> slurm, chip.subsample_ctl -> slurm, chip.error_custom_aligner -> slurm, chip.pool_ta -> slurm, chip.spr -> slurm, chip.count_signal_track_pooled -> slurm, chip.error_input_data -> slurm, chip.xcor -> slurm, chip.error_subsample_pooled_control_with_mixed_endedness -> slurm, chip.error_wrong_aligner -> slurm, chip.error_use_bwa_mem_for_non_bwa -> slurm, chip.bam2ta -> slurm, chip.filter_R1 -> slurm, chip.reproducibility_overlap -> slurm, chip.error_control_required -> slurm, chip.qc_report -> slurm, chip.macs2_signal_track_pooled -> slurm, chip.gc_bias -> slurm, chip.pool_ta_pr2 -> slurm, chip.reproducibility_idr -> slurm, chip.error_ctl_fastq_input_required_for_control_mode -> slurm, chip.call_peak_ppr2 -> slurm, chip.filter_no_dedup -> slurm, chip.choose_ctl -> slurm, chip.error_ctl_input_defined_in_control_mode -> slurm, chip.call_peak_ppr1 -> slurm, chip.pool_blacklist -> slurm, chip.call_peak_pr2 -> slurm, chip.overlap_ppr -> slurm, chip.idr_pr -> slurm, chip.filter -> slurm, chip.idr_ppr -> slurm, chip.bam2ta_no_dedup_R1 -> slurm, chip.bam2ta_no_dedup -> slurm, chip.idr -> slurm, chip.overlap_pr -> slurm, chip.align -> slurm, chip.align_ctl -> slurm, chip.count_signal_track -> slurm, chip.filter_ctl -> slurm, chip.call_peak_pooled -> slurm, chip.bam2ta_ctl -> slurm, chip.jsd -> slurm, chip.call_peak_pr1 -> slurm, chip.pool_ta_pr1 -> slurm, chip.align_R1 -> slurm, chip.call_peak -> slurm, chip.subsample_ctl_pooled -> slurm, chip.macs2_signal_track -> slurm, chip.fraglen_mean -> slurm, chip.overlap -> slurm, chip.pool_ta_ctl -> slurm
2021-04-10 09:46:18,074 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,076 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,076 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,080 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,080 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,081 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,081 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,082 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,083 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,083 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,083 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,084 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,084 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,085 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,085 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,086 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,086 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,087 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,087 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,087 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,087 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,088 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,088 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,089 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,089 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,089 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,090 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,090 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,090 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,091 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,091 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,092 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,092 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,092 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,093 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,093 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,093 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,093 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,094 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,095 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,095 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,095 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,095 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,096 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,096 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,096 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,097 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,098 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,098 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,098 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,099 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:18,100 cromwell-system-akka.dispatchers.backend-dispatcher-111 WARN  - slurm [UUID(adc68428)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-04-10 09:46:19,690 cromwell-system-akka.dispatchers.engine-dispatcher-32 INFO  - Not triggering log of token queue status. Effective log interval = None
2021-04-10 09:46:22,558 cromwell-system-akka.dispatchers.engine-dispatcher-32 INFO  - WorkflowExecutionActor-adc68428-2a88-4f1a-b5c2-7ab9d3c26af1 [UUID(adc68428)]: Starting chip.read_genome_tsv
2021-04-10 09:46:22,704 cromwell-system-akka.dispatchers.engine-dispatcher-147 INFO  - Assigned new job execution tokens to the following groups: adc68428: 1
2021-04-10 09:46:22,907 cromwell-system-akka.dispatchers.engine-dispatcher-153 INFO  - adc68428-2a88-4f1a-b5c2-7ab9d3c26af1-EngineJobExecutionActor-chip.read_genome_tsv:NA:1 [UUID(adc68428)]: Could not copy a suitable cache hit for adc68428:chip.read_genome_tsv:-1:1. No copy attempts were made.
2021-04-10 09:46:22,932 cromwell-system-akka.dispatchers.backend-dispatcher-166 WARN  - BackgroundConfigAsyncJobExecutionActor [UUID(adc68428)chip.read_genome_tsv:NA:1]: Unrecognized runtime attribute keys: disks
2021-04-10 09:46:23,010 cromwell-system-akka.dispatchers.backend-dispatcher-166 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(adc68428)chip.read_genome_tsv:NA:1]: `echo "$(basename /oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/adc68428-2a88-4f1a-b5c2-7ab9d3c26af1/call-read_genome_tsv/inputs/-37529440/hg38.local.tsv)" > genome_name
# create empty files for all entries
touch ref_fa bowtie2_idx_tar bwa_idx_tar chrsz gensz blacklist blacklist2
touch mito_chr_name
touch regex_bfilt_peak_chr_name

python <<CODE
import os
with open('/oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/adc68428-2a88-4f1a-b5c2-7ab9d3c26af1/call-read_genome_tsv/inputs/-37529440/hg38.local.tsv','r') as fp:
    for line in fp:
        arr = line.strip('\n').split('\t')
        if arr:
            key, val = arr
            with open(key,'w') as fp2:
                fp2.write(val)
CODE`
2021-04-10 09:46:23,267 cromwell-system-akka.dispatchers.backend-dispatcher-166 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(adc68428)chip.read_genome_tsv:NA:1]: executing: if [ -z \"$SINGULARITY_BINDPATH\" ]; then export SINGULARITY_BINDPATH=; fi; \
if [ -z \"$SINGULARITY_CACHEDIR\" ]; then export SINGULARITY_CACHEDIR=; fi;

ITER=0
until [ $ITER -ge 3 ]; do
    sbatch \
        --export=ALL \
        -J cromwell_adc68428_read_genome_tsv \
        -D /oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/adc68428-2a88-4f1a-b5c2-7ab9d3c26af1/call-read_genome_tsv \
        -o /oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/adc68428-2a88-4f1a-b5c2-7ab9d3c26af1/call-read_genome_tsv/execution/stdout \
        -e /oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/adc68428-2a88-4f1a-b5c2-7ab9d3c26af1/call-read_genome_tsv/execution/stderr \
        -t 60 \
        -n 1 \
        --ntasks-per-node=1 \
        --cpus-per-task=1 \
        --mem=2048 \
         \
        --account default \
         \
         \
        --wrap "/bin/bash /oak/stanford/scg/lab_mpsnyder/neekonsu/2021/workspace/abc/CAPER/chip/adc68428-2a88-4f1a-b5c2-7ab9d3c26af1/call-read_genome_tsv/execution/script" \
        && break
    ITER=$[$ITER+1]
    sleep 30
done
2021-04-10 09:46:24,311 cromwell-system-akka.dispatchers.backend-dispatcher-166 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(adc68428)chip.read_genome_tsv:NA:1]: job id: 13441
2021-04-10 09:46:24,321 cromwell-system-akka.dispatchers.backend-dispatcher-166 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(adc68428)chip.read_genome_tsv:NA:1]: Status change from - to WaitingForReturnCode
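
`WaitingForReturnCode` means Cromwell has submitted the job and is waiting for it to report an exit status. Assuming the usual Cromwell layout, in which the exit status is written to an `rc` file next to `script`, `stdout`, and `stderr` in the call's `execution` directory, a small helper can tell whether the task ever finished. The name `check_call_rc` and the three status words are hypothetical.

```shell
# Report whether a Cromwell call directory has produced a return code yet.
# Assumption: the exit status is written to <call_dir>/execution/rc.
check_call_rc() {
    call_dir="$1"
    rc_file="$call_dir/execution/rc"
    if [ -f "$rc_file" ]; then
        # The job ran to completion and recorded its exit status.
        printf 'finished rc=%s\n' "$(tr -d '[:space:]' < "$rc_file")"
    elif [ -f "$call_dir/execution/script" ]; then
        # The script was written, but no exit status has been recorded.
        printf 'waiting\n'
    else
        printf 'missing\n'
    fi
}
```

Passing the `call-read_genome_tsv` directory from the log above would print `waiting` if the job never ran. In that case `squeue -j 13441` (the job id from the log) should show whether SLURM still has the job queued.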

leepc12 commented 3 years ago

How do you usually submit jobs on SCG? Do you submit them with sbatch --account default ...?

neekonsu commented 3 years ago

@leepc12 I am running my job with OnDemand's Job Composer dashboard:

#!/bin/bash
#SBATCH --job-name=default
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --partition=interactive
#SBATCH --account=default
#SBATCH --time=10:00:00

cd /labs/mpsnyder/neekonsu/2021/workspace/abc/CAPER

module load java/8u112

source activate encode-chip-seq-pipeline

conda install -c bioconda java-jdk

caper run /labs/mpsnyder/neekonsu/2021/workspace/abc/chip-seq-pipeline2/chip.wdl -i "input.json"

leepc12 commented 3 years ago

So caper run in this shell script acts as a master job that submits child tasks to the SLURM job manager. Caper calls sbatch --account default internally, without a partition, since only the default account is defined in your Caper conf. Can you add slurm-partition=interactive to your Caper conf?

Also, reduce the number of CPUs in the shell script, since the master job does not need many resources. It does need to run for a long time, though, so please increase the 10-hour time limit too. I think 2 CPUs with 10 GB of memory are enough for a caper run master job.
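
The conf change suggested above can be made by hand in ~/.caper/default.conf. As a sketch, it could also be scripted with a small helper that sets a `key=value` line, replacing an existing definition or appending a new one. The function name `set_conf_key` is hypothetical and not part of Caper.

```shell
# Set key=value in a Caper-style conf file: replace the line if the key
# already exists, append it otherwise. Comments and other keys are kept.
set_conf_key() {
    conf_file="$1"; key="$2"; value="$3"
    tmp_file="$(mktemp)"
    # Copy everything except an existing definition of the key.
    grep -v -E "^${key}=" "$conf_file" > "$tmp_file" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp_file"
    # Write back in place so the original file's permissions are preserved.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Example (not executed here):
# set_conf_key "$HOME/.caper/default.conf" slurm-partition interactive
```

For the master job script, the matching changes would be `#SBATCH --cpus-per-task=2`, an added `#SBATCH --mem=10G`, and a `--time` value above the current 10:00:00, within whatever the partition allows.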