ENCODE-DCC / caper

Cromwell/WDL wrapper for Python
MIT License

Pipeline hangs on: "task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode" #119

Closed Batchu-Sai closed 3 years ago

Batchu-Sai commented 3 years ago

Describe the problem

When running the ENCODE chip-seq-pipeline on an HPC cluster with SLURM, it consistently hangs at the line that reads "task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode". I have tried running the pipeline both with and without an active Caper server, and it hangs at the same point either way.
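
For anyone debugging the same hang: `WaitingForReturnCode` generally means Cromwell has submitted the job and is waiting for a return-code file to appear in the call's execution directory. Below is a minimal sketch for checking what a call directory contains. It assumes the return-code file is named `rc` and sits next to the `script`, `stdout`, and `stderr` files referenced by the sbatch command in the cromwell.out log. `inspect_call_dir` is a hypothetical helper, not part of Caper or Cromwell.

```python
from pathlib import Path


def inspect_call_dir(call_dir):
    """Summarize the state of one Cromwell call directory (e.g. call-read_genome_tsv)."""
    execution = Path(call_dir) / "execution"
    rc_file = execution / "rc"          # assumption: name of the return-code file
    stderr_file = execution / "stderr"  # same path as the sbatch -e option in the log

    report = {
        "script_exists": (execution / "script").is_file(),
        "return_code": None,   # stays None while no usable rc file exists
        "stderr_tail": [],
    }
    if rc_file.is_file():
        text = rc_file.read_text().strip()
        if text:
            report["return_code"] = int(text)
    if stderr_file.is_file():
        # Keep only the last few lines; they usually carry the error message.
        report["stderr_tail"] = stderr_file.read_text().splitlines()[-5:]
    return report
```

If `return_code` stays `None` and `stderr_tail` is empty, the submitted job has probably not run the script yet, which points at the scheduler side (for example a job stuck pending) rather than at the pipeline task itself.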

OS/Platform

Caper configuration file

backend=slurm

# define one of the followings (or both) according to your
# cluster's SLURM configuration.
slurm-partition=long
slurm-account=science

# Hashing strategy for call-caching (3 choices)
# This parameter is for local (local/slurm/sge/pbs) backend only.
# This is important for call-caching,
# which means re-using outputs from previous/failed workflows.
# Cache will miss if different strategy is used.
# "file" method has been default for all old versions of Caper<1.0.
# "path+modtime" is a new default for Caper>=1.0,
#   file: use md5sum hash (slow).
#   path: use path.
#   path+modtime: use path and modification time.
local-hash-strat=path+modtime

# Local directory for localized files and Cromwell's intermediate files
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper store all localized data files
# on this directory (e.g. input FASTQs defined as URLs in input JSON).
local-loc-dir=/home/hpc/batchus1/caper_tmp/

cromwell=/home/hpc/batchus1/.caper/cromwell_jar/cromwell-59.jar
womtool=/home/hpc/batchus1/.caper/womtool_jar/womtool-59.jar

Input JSON file

{
    "chip.title" : "RT4",
    "chip.description" : "RT4",

    "chip.pipeline_type" : "histone",
    "chip.aligner" : "bwa",
    "chip.align_only" : false,
    "chip.true_rep_only" : false,

    "chip.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v3/hg38.tsv",

    "chip.paired_end" : true,
    "chip.ctl_paired_end" : true,

    "chip.always_use_pooled_ctl" : true,

    "chip.fastqs_rep1_R1" : [ "/home/hpc/batchus1/GSE148079/SRR11478947_1.fastq" ],
    "chip.fastqs_rep1_R2" : [ "/home/hpc/batchus1/GSE148079/SRR11478947_2.fastq" ],
    "chip.fastqs_rep2_R1" : [ "/home/hpc/batchus1/GSE148079/SRR11478948_1.fastq" ],
    "chip.fastqs_rep2_R2" : [ "/home/hpc/batchus1/GSE148079/SRR11478948_2.fastq" ],

    "chip.ctl_fastqs_rep1_R1" : [ "/home/hpc/batchus1/GSE148079/SRR11478957_1.fastq" ],
    "chip.ctl_fastqs_rep1_R2" : [ "/home/hpc/batchus1/GSE148079/SRR11478957_2.fastq" ],
    "chip.ctl_fastqs_rep2_R1" : [ "/home/hpc/batchus1/GSE148079/SRR11478958_1.fastq" ],
    "chip.ctl_fastqs_rep2_R2" : [ "/home/hpc/batchus1/GSE148079/SRR11478958_2.fastq" ]
}

Sbatch Submission Script

#!/bin/bash

#SBATCH --ntasks-per-node=1
#SBATCH --partition=long 
#SBATCH --export=ALL 
#SBATCH --time=4-00:00:00 
#SBATCH --cpus-per-task=12
#SBATCH --account=science

source ~/miniconda3/etc/profile.d/conda.sh
conda activate encode-chip-seq-pipeline
caper run /home/hpc/batchus1/chip-seq-pipeline2/chip.wdl -i /home/hpc/batchus1/param_short.json

Troubleshooting result

SLURM output file

2021-05-13 12:39:33,587|caper.caper_base|INFO| Creating a timestamped temporary directory. /home/hpc/batchus1/caper_tmp/chip/20210513_123933_586540
2021-05-13 12:39:33,587|caper.caper_runner|INFO| Localizing files on work_dir. /home/hpc/batchus1/caper_tmp/chip/20210513_123933_586540
2021-05-13 12:39:35,039|autouri.autouri|INFO| cp: skipped due to name_size_match, size=872949833, mt=1549739698.0, src=https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.gz, dest=/home/hpc/batchus1/caper_tmp/caf534ed3cf684406e731d19be272b4a/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.gz
2021-05-13 12:39:35,796|autouri.autouri|INFO| cp: skipped due to md5_match, md5=05297d96dd1f7cfb45a7b637d6dd7036, src=https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only.fasta.gz, dest=/home/hpc/batchus1/caper_tmp/f43b63a83784d3ec8055f1a22168ed89/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only.fasta.gz
2021-05-13 12:39:36,631|autouri.autouri|INFO| cp: skipped due to md5_match, md5=393688b4f06c9ce26165d47433dd8c37, src=https://www.encodeproject.org/files/ENCFF356LFX/@@download/ENCFF356LFX.bed.gz, dest=/home/hpc/batchus1/caper_tmp/f183dcba5d34f959d8b55ed438ee2e22/ENCFF356LFX.bed.gz
2021-05-13 12:39:38,246|autouri.autouri|INFO| cp: skipped due to md5_match, md5=c95303fb77cc3e11d50e3c3a4b93b3fb, src=https://www.encodeproject.org/files/GRCh38_EBV.chrom.sizes/@@download/GRCh38_EBV.chrom.sizes.tsv, dest=/home/hpc/batchus1/caper_tmp/c52f52c7bfa357f55a39b1de7e4d0b0c/GRCh38_EBV.chrom.sizes.tsv
2021-05-13 12:39:39,532|autouri.autouri|INFO| cp: skipped due to name_size_match, size=3749246230, mt=1571469011.0, src=https://www.encodeproject.org/files/ENCFF110MCL/@@download/ENCFF110MCL.tar.gz, dest=/home/hpc/batchus1/caper_tmp/3ff4ac4c3f59d096b1a3842a182072ae/ENCFF110MCL.tar.gz
2021-05-13 12:39:40,386|autouri.autouri|INFO| cp: skipped due to md5_match, md5=80b263f6ea6ff65d547eef07102535db, src=https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only_bowtie2_index/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only_bowtie2_index.tar.gz, dest=/home/hpc/batchus1/caper_tmp/df5193e07055d13c48be59bacd0f56b8/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only_bowtie2_index.tar.gz
2021-05-13 12:39:41,713|autouri.autouri|INFO| cp: skipped due to name_size_match, size=4318261891, mt=1549723866.0, src=https://www.encodeproject.org/files/ENCFF643CGH/@@download/ENCFF643CGH.tar.gz, dest=/home/hpc/batchus1/caper_tmp/8c692fba4640609720272154ab0faa30/ENCFF643CGH.tar.gz
2021-05-13 12:39:42,490|autouri.autouri|INFO| cp: skipped due to md5_match, md5=7e088c24a017a43b1db5e8f50060eec1, src=https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only_bwa_index/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only_bwa_index.tar.gz, dest=/home/hpc/batchus1/caper_tmp/d3dff25534e93d893902540d81e4f475/GRCh38_no_alt_analysis_set_GCA_000001405.15_mito_only_bwa_index.tar.gz
2021-05-13 12:39:43,365|autouri.autouri|INFO| cp: skipped due to md5_match, md5=aca8cf959206aa3ad257fc46dc783266, src=https://www.encodeproject.org/files/ENCFF493CCB/@@download/ENCFF493CCB.bed.gz, dest=/home/hpc/batchus1/caper_tmp/0fa7d04b32e66fa02fb2c1ae39e41447/ENCFF493CCB.bed.gz
2021-05-13 12:39:44,640|autouri.autouri|INFO| cp: skipped due to name_size_match, size=14377496, mt=1592463730.0, src=https://www.encodeproject.org/files/ENCFF304XEX/@@download/ENCFF304XEX.bed.gz, dest=/home/hpc/batchus1/caper_tmp/805e179275a9c0fb7a37def40c4312d1/ENCFF304XEX.bed.gz
2021-05-13 12:39:45,485|autouri.autouri|INFO| cp: skipped due to md5_match, md5=91047588129069ff91ec1b0664179f8e, src=https://www.encodeproject.org/files/ENCFF140XLU/@@download/ENCFF140XLU.bed.gz, dest=/home/hpc/batchus1/caper_tmp/0cbd2c602ddad252bc39729fc8a29286/ENCFF140XLU.bed.gz
2021-05-13 12:39:46,764|autouri.autouri|INFO| cp: skipped due to name_size_match, size=18381891, mt=1592463727.0, src=https://www.encodeproject.org/files/ENCFF212UAV/@@download/ENCFF212UAV.bed.gz, dest=/home/hpc/batchus1/caper_tmp/1d3aa436b05f16a509edb94789c061d3/ENCFF212UAV.bed.gz
2021-05-13 12:39:46,923|autouri.autouri|INFO| cp: skipped due to md5_match, md5=df624401f76fbd4d651e736068c43a1a, src=https://storage.googleapis.com/encode-pipeline-genome-data/hg38/ataqc/hg38_dnase_avg_fseq_signal_formatted.txt.gz, dest=/home/hpc/batchus1/caper_tmp/3b39284516e676ea52238f0636c0bbbf/hg38_dnase_avg_fseq_signal_formatted.txt.gz
2021-05-13 12:39:46,996|autouri.autouri|INFO| cp: skipped due to md5_match, md5=ced0c653d28628654288f7a8ab052590, src=https://storage.googleapis.com/encode-pipeline-genome-data/hg38/ataqc/hg38_celltype_compare_subsample.bed.gz, dest=/home/hpc/batchus1/caper_tmp/c73f434c3fa4f3f54bc2ecad09c065c2/hg38_celltype_compare_subsample.bed.gz
2021-05-13 12:39:47,086|autouri.autouri|INFO| cp: skipped due to md5_match, md5=3f7fd85ab9a4c6274f28c3e82a79c10d, src=https://storage.googleapis.com/encode-pipeline-genome-data/hg38/ataqc/hg38_dnase_avg_fseq_signal_metadata.txt, dest=/home/hpc/batchus1/caper_tmp/a9745b33b4ffdd83d7d2c5a7d3c8036a/hg38_dnase_avg_fseq_signal_metadata.txt
2021-05-13 12:39:47,850|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
2021-05-13 12:39:51,199|caper.cromwell|INFO| Womtool validation passed.
2021-05-13 12:39:51,200|caper.caper_runner|INFO| launching run: wdl=/home/hpc/batchus1/chip-seq-pipeline2/chip.wdl, inputs=/home/hpc/batchus1/caper_tmp/home/hpc/batchus1/param_short.local.json, backend_conf=/home/hpc/batchus1/caper_tmp/chip/20210513_123933_586540/backend.conf
2021-05-13 12:40:00,685|caper.cromwell_workflow_monitor|INFO| Workflow: id=3b6d19ac-dd11-4e6f-9246-af6ffb9af467, status=Submitted
2021-05-13 12:40:00,729|caper.cromwell_workflow_monitor|INFO| Workflow: id=3b6d19ac-dd11-4e6f-9246-af6ffb9af467, status=Running
2021-05-13 12:40:10,133|caper.cromwell_workflow_monitor|INFO| Task: id=3b6d19ac-dd11-4e6f-9246-af6ffb9af467, task=chip.read_genome_tsv:-1, retry=0, status=Started, job_id=44018
2021-05-13 12:40:10,139|caper.cromwell_workflow_monitor|INFO| Task: id=3b6d19ac-dd11-4e6f-9246-af6ffb9af467, task=chip.read_genome_tsv:-1, retry=0, status=WaitingForReturnCode

cromwell.out file

2021-05-13 12:39:52,796  INFO  - Running with database db.url = jdbc:hsqldb:mem:4bb55e12-96c4-4bba-a1f8-78ea07e02915;shutdown=false;hsqldb.tx=mvcc
2021-05-13 12:39:59,327  INFO  - Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
2021-05-13 12:39:59,338  INFO  - [RenameWorkflowOptionsInMetadata] 100%
2021-05-13 12:39:59,417  INFO  - Running with database db.url = jdbc:hsqldb:mem:cac1db66-f6bc-4aaa-a0e5-adb74c67f90b;shutdown=false;hsqldb.tx=mvcc
2021-05-13 12:39:59,754  INFO  - Slf4jLogger started
2021-05-13 12:39:59,952 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-ee56600",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "failureShutdownDuration" : "5 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
2021-05-13 12:40:00,005 cromwell-system-akka.dispatchers.service-dispatcher-13 INFO  - Metadata summary refreshing every 1 second.
2021-05-13 12:40:00,096 cromwell-system-akka.dispatchers.service-dispatcher-12 INFO  - WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
2021-05-13 12:40:00,100 cromwell-system-akka.actor.default-dispatcher-4 INFO  - KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
2021-05-13 12:40:00,123 cromwell-system-akka.dispatchers.engine-dispatcher-42 INFO  - CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
2021-05-13 12:40:00,124  WARN  - 'docker.hash-lookup.gcr-api-queries-per-100-seconds' is being deprecated, use 'docker.hash-lookup.gcr.throttle' instead (see reference.conf)
2021-05-13 12:40:00,610 cromwell-system-akka.dispatchers.engine-dispatcher-42 INFO  - JobExecutionTokenDispenser - Distribution rate: 1 per 2 seconds.
2021-05-13 12:40:00,637 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - SingleWorkflowRunnerActor: Version 59
2021-05-13 12:40:00,643 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO  - SingleWorkflowRunnerActor: Submitting workflow
2021-05-13 12:40:00,685 cromwell-system-akka.dispatchers.api-dispatcher-47 INFO  - Unspecified type (Unspecified version) workflow 3b6d19ac-dd11-4e6f-9246-af6ffb9af467 submitted
2021-05-13 12:40:00,711 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - SingleWorkflowRunnerActor: Workflow submitted UUID(3b6d19ac-dd11-4e6f-9246-af6ffb9af467)
2021-05-13 12:40:00,714 cromwell-system-akka.dispatchers.engine-dispatcher-42 INFO  - 1 new workflows fetched by cromid-ee56600: 3b6d19ac-dd11-4e6f-9246-af6ffb9af467
2021-05-13 12:40:00,722 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - WorkflowManagerActor: Starting workflow UUID(3b6d19ac-dd11-4e6f-9246-af6ffb9af467)
2021-05-13 12:40:00,728 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - WorkflowManagerActor: Successfully started WorkflowActor-3b6d19ac-dd11-4e6f-9246-af6ffb9af467
2021-05-13 12:40:00,729 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - Retrieved 1 workflows from the WorkflowStoreActor
2021-05-13 12:40:00,743 cromwell-system-akka.dispatchers.engine-dispatcher-42 INFO  - WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
2021-05-13 12:40:00,804 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - MaterializeWorkflowDescriptorActor [UUID(3b6d19ac)]: Parsing workflow as WDL 1.0
2021-05-13 12:40:03,540 cromwell-system-akka.dispatchers.engine-dispatcher-41 INFO  - MaterializeWorkflowDescriptorActor [UUID(3b6d19ac)]: Call-to-Backend assignments: chip.align_R1 -> slurm, chip.bam2ta_ctl -> slurm, chip.idr_pr -> slurm, chip.filter_no_dedup -> slurm, chip.pool_ta_ctl -> slurm, chip.error_subsample_pooled_control_with_mixed_endedness -> slurm, chip.overlap_ppr -> slurm, chip.filter -> slurm, chip.qc_report -> slurm, chip.error_wrong_aligner -> slurm, chip.call_peak_pooled -> slurm, chip.filter_R1 -> slurm, chip.pool_ta_pr2 -> slurm, chip.spr -> slurm, chip.filter_ctl -> slurm, chip.call_peak_pr1 -> slurm, chip.error_custom_aligner -> slurm, chip.count_signal_track_pooled -> slurm, chip.error_ctl_fastq_input_required_for_control_mode -> slurm, chip.call_peak_ppr1 -> slurm, chip.idr_ppr -> slurm, chip.call_peak -> slurm, chip.error_control_required -> slurm, chip.align -> slurm, chip.jsd -> slurm, chip.error_input_data -> slurm, chip.reproducibility_overlap -> slurm, chip.idr -> slurm, chip.call_peak_ppr2 -> slurm, chip.bam2ta -> slurm, chip.pool_ta -> slurm, chip.xcor -> slurm, chip.macs2_signal_track -> slurm, chip.overlap -> slurm, chip.subsample_ctl -> slurm, chip.macs2_signal_track_pooled -> slurm, chip.call_peak_pr2 -> slurm, chip.gc_bias -> slurm, chip.read_genome_tsv -> slurm, chip.pool_blacklist -> slurm, chip.error_use_bowtie2_local_mode_for_non_bowtie2 -> slurm, chip.error_use_bwa_mem_for_non_bwa -> slurm, chip.fraglen_mean -> slurm, chip.bam2ta_no_dedup_R1 -> slurm, chip.bam2ta_no_dedup -> slurm, chip.pool_ta_pr1 -> slurm, chip.subsample_ctl_pooled -> slurm, chip.count_signal_track -> slurm, chip.align_ctl -> slurm, chip.overlap_pr -> slurm, chip.error_ctl_input_defined_in_control_mode -> slurm, chip.choose_ctl -> slurm, chip.reproducibility_idr -> slurm
2021-05-13 12:40:03,813 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,814 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,815 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,815 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,816 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,816 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,816 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,817 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,817 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,817 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,817 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,818 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,818 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,818 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,818 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,819 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,819 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,819 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,820 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,820 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,820 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,820 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,821 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,821 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,821 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,821 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,822 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,822 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,822 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,822 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,823 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,823 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,824 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,824 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,824 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,825 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,825 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,825 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,826 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,826 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,826 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,826 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,826 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,827 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,827 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,827 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,827 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,827 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,828 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,830 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,831 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,831 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:03,831 cromwell-system-akka.dispatchers.backend-dispatcher-101 WARN  - slurm [UUID(3b6d19ac)]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
2021-05-13 12:40:05,631 cromwell-system-akka.dispatchers.engine-dispatcher-152 INFO  - Not triggering log of token queue status. Effective log interval = None
2021-05-13 12:40:08,202 cromwell-system-akka.dispatchers.engine-dispatcher-152 INFO  - WorkflowExecutionActor-3b6d19ac-dd11-4e6f-9246-af6ffb9af467 [UUID(3b6d19ac)]: Starting chip.read_genome_tsv
2021-05-13 12:40:08,645 cromwell-system-akka.dispatchers.engine-dispatcher-152 INFO  - Assigned new job execution tokens to the following groups: 3b6d19ac: 1
2021-05-13 12:40:08,808 cromwell-system-akka.dispatchers.engine-dispatcher-112 INFO  - 3b6d19ac-dd11-4e6f-9246-af6ffb9af467-EngineJobExecutionActor-chip.read_genome_tsv:NA:1 [UUID(3b6d19ac)]: Could not copy a suitable cache hit for 3b6d19ac:chip.read_genome_tsv:-1:1. No copy attempts were made.
2021-05-13 12:40:08,836 cromwell-system-akka.dispatchers.backend-dispatcher-156 WARN  - BackgroundConfigAsyncJobExecutionActor [UUID(3b6d19ac)chip.read_genome_tsv:NA:1]: Unrecognized runtime attribute keys: disks
2021-05-13 12:40:08,914 cromwell-system-akka.dispatchers.backend-dispatcher-156 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(3b6d19ac)chip.read_genome_tsv:NA:1]: `echo "$(basename /home/hpc/batchus1/RT4_output/chip/3b6d19ac-dd11-4e6f-9246-af6ffb9af467/call-read_genome_tsv/inputs/955696674/hg38.local.tsv)" > genome_name
# create empty files for all entries
touch ref_fa bowtie2_idx_tar bwa_idx_tar chrsz gensz blacklist blacklist2
touch mito_chr_name
touch regex_bfilt_peak_chr_name

python <<CODE
import os
with open('/home/hpc/batchus1/RT4_output/chip/3b6d19ac-dd11-4e6f-9246-af6ffb9af467/call-read_genome_tsv/inputs/955696674/hg38.local.tsv','r') as fp:
    for line in fp:
        arr = line.strip('\n').split('\t')
        if arr:
            key, val = arr
            with open(key,'w') as fp2:
                fp2.write(val)
CODE`
2021-05-13 12:40:09,169 cromwell-system-akka.dispatchers.backend-dispatcher-156 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(3b6d19ac)chip.read_genome_tsv:NA:1]: executing: if [ -z \"$SINGULARITY_BINDPATH\" ]; then export SINGULARITY_BINDPATH=; fi; \
if [ -z \"$SINGULARITY_CACHEDIR\" ]; then export SINGULARITY_CACHEDIR=; fi;

ITER=0
until [ $ITER -ge 3 ]; do
    sbatch \
        --export=ALL \
        -J cromwell_3b6d19ac_read_genome_tsv \
        -D /home/hpc/batchus1/RT4_output/chip/3b6d19ac-dd11-4e6f-9246-af6ffb9af467/call-read_genome_tsv \
        -o /home/hpc/batchus1/RT4_output/chip/3b6d19ac-dd11-4e6f-9246-af6ffb9af467/call-read_genome_tsv/execution/stdout \
        -e /home/hpc/batchus1/RT4_output/chip/3b6d19ac-dd11-4e6f-9246-af6ffb9af467/call-read_genome_tsv/execution/stderr \
        -t 60 \
        -n 1 \
        --ntasks-per-node=1 \
        --cpus-per-task=1 \
        --mem=2048 \
        -p long \
        --account science \
         \
         \
        --wrap "/bin/bash /home/hpc/batchus1/RT4_output/chip/3b6d19ac-dd11-4e6f-9246-af6ffb9af467/call-read_genome_tsv/execution/script" \
        && break
    ITER=$[$ITER+1]
    sleep 30
done
2021-05-13 12:40:10,132 cromwell-system-akka.dispatchers.backend-dispatcher-156 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(3b6d19ac)chip.read_genome_tsv:NA:1]: job id: 44018
2021-05-13 12:40:10,139 cromwell-system-akka.dispatchers.backend-dispatcher-157 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(3b6d19ac)chip.read_genome_tsv:NA:1]: Status change from - to WaitingForReturnCode

Batchu-Sai commented 3 years ago

So I found the problem:

The HPC cluster I am using is not yet configured to enforce memory limits, so I had to override Caper's built-in backends with a custom Caper configuration file. The only change was removing the memory variable. After this change, the pipeline completed successfully with no errors.

Commenting in case anyone else runs into this problem: check whether Caper's built-in backends work as expected on your cluster.
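
The fix described above amounts to dropping the memory option (`--mem=2048` in the generated sbatch command shown in cromwell.out) from the job submission. The sketch below only illustrates which line goes away. It assumes the submit command is available as plain text, and `drop_mem_flag` is a hypothetical helper, not part of Caper. In practice the edit is made in the custom configuration file rather than by post-processing text.

```python
import re


def drop_mem_flag(submit_command):
    """Return the sbatch command text without any --mem / --mem-per-cpu option line."""
    kept = []
    for line in submit_command.splitlines():
        # Skip lines such as "        --mem=2048 \" from the generated command.
        if re.match(r"\s*--mem(-per-cpu)?(=|\s|$)", line):
            continue
        kept.append(line)
    return "\n".join(kept)
```

Because each option sits on its own backslash-continued line, removing the whole line leaves the rest of the command intact.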