ENCODE-DCC / atac-seq-pipeline

ENCODE ATAC-seq pipeline

Failed to run example ENCSR356KRQ_subsampled_caper.json #213

Closed fangpingmu closed 4 years ago

fangpingmu commented 4 years ago

Describe the bug
I tried to run the atac-seq-pipeline using the example JSON file at https://raw.githubusercontent.com/ENCODE-DCC/atac-seq-pipeline/master/example_input_json/caper/ENCSR356KRQ_subsampled_caper.json

The log file is long, and the first error appears as:

2020-01-13 17:13:51,325 cromwell-system-akka.dispatchers.engine-dispatcher-43 INFO - WorkflowManagerActor Workflow 667052e2-f822-4a44-86dd-2ad86bf348c7 failed (during ExecutingWorkflowState): cromwell.backend.standard.StandardAsyncExecutionActor$$anon$2: Failed to evaluate job outputs: Bad output 'align_mito.bam': Failed to find index Success(WomInteger(0)) on array: Success([]) 0 Bad output 'align_mito.bai': Failed to find index Success(WomInteger(0)) on array: Success([]) 0 Bad output 'align_mito.samstat_qc': Failed to find index Success(WomInteger(0)) on array: Success([]) 0 Bad output 'align_mito.non_mito_samstat_qc': Failed to find index Success(WomInteger(0)) on array: Success([]) 0 Bad output 'align_mito.read_len_log': Failed to find index Success(WomInteger(0)) on array: Success([]) 0 Bad output 'align_mito.read_len': key not found: read_len_log at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$handleExecutionSuccess$1(StandardAsyncExecutionActor.scala:916) at scala.util.Success.$anonfun$map$1(Try.scala:255) at scala.util.Success.map(Try.scala:213) at scala.concurrent.Future.$anonfun$map$1(Future.scala:292) at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33) at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64) at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55) at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85) at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49) at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) cromwell.backend.standard.StandardAsyncExecutionActor$$anon$2: Failed to evaluate job outputs: Bad output 'align_mito.bam': Failed to find index Success(WomInteger(0)) on array: at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$handleExecutionSuccess$1(StandardAsyncExecutionActor.scala:916) at scala.util.Success.$anonfun$map$1(Try.scala:255) at scala.util.Success.map(Try.scala:213) at scala.concurrent.Future.$anonfun$map$1(Future.scala:292) at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33) at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64) at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55) at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85) at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49) at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

OS/Platform

Caper configuration file
[defaults]

Input JSON file
https://raw.githubusercontent.com/ENCODE-DCC/atac-seq-pipeline/master/example_input_json/caper/ENCSR356KRQ_subsampled_caper.json

Error log
Caper automatically runs a troubleshooter for failed workflows. If it doesn't, get the WORKFLOW_ID of your failed workflow with caper list, or directly use a metadata.json file in Caper's output directory.

$ caper debug [WORKFLOW_ID_OR_METADATA_JSON_FILE]
leepc12 commented 4 years ago

I am sorry about the late response. Did you install/activate the pipeline's Conda env before running the pipeline?

$ conda activate encode-atac-seq-pipeline
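
A minimal sketch of the intended order of operations, assuming the Conda env was already installed per the pipeline's installation docs and using the same input JSON as in the report above:

$ conda activate encode-atac-seq-pipeline
$ caper run atac.wdl -i https://raw.githubusercontent.com/ENCODE-DCC/atac-seq-pipeline/master/example_input_json/caper/ENCSR356KRQ_subsampled_caper.json
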
apblair commented 4 years ago

Hello @leepc12 ,

Thank you for building a robust debugger and powerful tool! I'm running into a similar error as @fangpingmu. Please note, I am working on the UCSF Wynton HPC, which uses the Sun Grid Engine (SGE) scheduler.

Here are my steps after cloning atac-seq-pipeline v1.7 and Caper v0.6.5 and downloading ENCSR356KRQ_subsampled_caper.json:

1. Initialize Caper and set default.conf. For the purpose of debugging I've initialized this to a local environment.

$ caper init local
$ mkdir /wynton/home/bruneau/ablair/test
$ vim ~/.caper/default.conf

backend=local
tmp-dir=/wynton/home/bruneau/ablair/test

2. Execute the build using Singularity.

$ cd /wynton/home/bruneau/ablair/test
$ caper run /wynton/home/bruneau/ablair/atac-seq-pipeline/atac.wdl -i /wynton/home/bruneau/ablair/ENCSR356KRQ_subsampled_caper.json --singularity --no-build-singularity

3. Use caper to assess why build failed $ cd /wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78 $ ls call-align call-align_mito call-read_genome_tsv metadata.json $ caper debug metadata.json Found failures: [ { "message": "Workflow failed", "causedBy": [ { "causedBy": [ { "causedBy": [], "message": "Bad output 'align_mito.bam': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.bai': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.non_mito_samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.read_len_log': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.read_len': key not found: read_len_log" } ], "message": "Failed to evaluate job outputs" }, { "message": "Failed to evaluate job outputs", "causedBy": [ { "causedBy": [], "message": "Bad output 'align_mito.bam': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.bai': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.non_mito_samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, "causedBy": [], "message": "Bad output 'align_mito.read_len_log': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align_mito.read_len': key not found: read_len_log" } ] }, { "message": "Failed to evaluate job outputs", "causedBy": [ { "message": "Bad output 'align.bam': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0", "causedBy": [] }, { "causedBy": [], "message": "Bad output 'align.bai': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.non_mito_samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.read_len_log': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.read_len': key not found: read_len_log" } ] }, { "causedBy": [ { "message": "Bad output 'align.bam': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0", "causedBy": [] }, { "causedBy": [], "message": "Bad output 'align.bai': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.non_mito_samstat_qc': Failed to find index Success(WomInteger(0)) on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.read_len_log': Failed to find index Success(WomInteger(0)) 
on array:\n\nSuccess([])\n\n0" }, { "causedBy": [], "message": "Bad output 'align.read_len': key not found: read_len_log" } ], "message": "Failed to evaluate job outputs" } ] } ]

atac.align_mito Failed. SHARD_IDX=0, RC=None, JOB_ID=32732, RUN_START=2020-02-24T20:19:13.754Z, RUN_END=2020-02-24T20:20:30.780Z, STDOUT=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align_mito/shard-0/execution/stdout, STDERR=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align_mito/shard-0/execution/stderr STDERR_CONTENTS=

atac.align_mito Failed. SHARD_IDX=1, RC=None, JOB_ID=398, RUN_START=2020-02-24T20:19:15.751Z, RUN_END=2020-02-24T20:21:03.867Z, STDOUT=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align_mito/shard-1/execution/stdout, STDERR=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align_mito/shard-1/execution/stderr STDERR_CONTENTS=

atac.align Failed. SHARD_IDX=0, RC=None, JOB_ID=32595, RUN_START=2020-02-24T20:19:09.782Z, RUN_END=2020-02-24T20:52:39.729Z, STDOUT=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-0/execution/stdout, STDERR=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-0/execution/stderr STDERR_CONTENTS=

atac.align Failed. SHARD_IDX=1, RC=None, JOB_ID=32698, RUN_START=2020-02-24T20:19:11.755Z, RUN_END=2020-02-24T20:52:43.579Z, STDOUT=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/stdout, STDERR=/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/stderr STDERR_CONTENTS=

4. Print out of the last atac.align failure's stderr.background report at: /wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/

ln: failed to create hard link '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/glob-3bcbe4e7489c90f75e0523ac6f3a9385/ENCFF641SFZ.subsampled.400.trim.merged.bam' => '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/ENCFF641SFZ.subsampled.400.trim.merged.bam': Operation not permitted
ln: failed to create hard link '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/glob-6efbc60cb1e0959bab4e467327a9416c/ENCFF641SFZ.subsampled.400.trim.merged.bam.bai' => '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/ENCFF641SFZ.subsampled.400.trim.merged.bam.bai': Operation not permitted
ln: failed to create hard link '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/glob-7b38d9959cf6f3deb83ac2bd156d8317/ENCFF641SFZ.subsampled.400.trim.merged.samstats.qc' => '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/ENCFF641SFZ.subsampled.400.trim.merged.samstats.qc': Operation not permitted
ln: failed to create hard link '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/glob-bc1afa799665df5c7d6afd70d2ae2cb4/ENCFF641SFZ.subsampled.400.trim.merged.no_chrM.samstats.qc' => '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/non_mito/ENCFF641SFZ.subsampled.400.trim.merged.no_chrM.samstats.qc': Operation not permitted
ln: failed to create hard link '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/glob-773fb92850749a2b4a829cf3c8c4de27/ENCFF641SFZ.subsampled.400.trim.merged.read_length.txt' => '/wynton/home/bruneau/ablair/test/atac/2269c24e-2237-47be-a05b-20213ef0fc78/call-align/shard-1/execution/ENCFF641SFZ.subsampled.400.trim.merged.read_length.txt': Operation not permitted

I'll try running this again with the default.conf backend configured for SGE and let you know if I receive a similar report. If the hard-link error persists, is there a way for us to set soft links?

Many thanks, Andrew

apblair commented 4 years ago

Hi @leepc12,

Here is my update after configuring caper to run on SGE.

1. Update Caper's default.conf.

$ caper init sge
$ qconf -spl
mpi
mpi-8
mpi_onehost
smp
$ vim ~/.caper/default.conf

backend=sge
sge-pe=mpi
tmp-dir=/wynton/home/bruneau/ablair/test

2. Execute the build using Singularity.

$ caper run /wynton/home/bruneau/ablair/atac-seq-pipeline/atac.wdl -i /wynton/home/bruneau/ablair/ENCSR356KRQ_subsampled_caper.json --singularity --no-build-singularity

3. Print out of caper's debug

Found failures: [ { "message": "Workflow failed", "causedBy": [ { "message": "java.lang.RuntimeException: Could not find job ID from stdout file.Check the stderr file for possible errors: /wynton/home/bruneau/ablair/test/atac/7ba75f43-9abb-4202-a8dc-56396251d317/call-read_genome_tsv/execution/stderr.submit", "causedBy": [ { "message": "Could not find job ID from stdout file.Check the stderr file for possible errors: /wynton/home/bruneau/ablair/test/atac/7ba75f43-9abb-4202-a8dc-56396251d317/call-read_genome_tsv/execution/stderr.submit", "causedBy": [] } ] } ] } ]

atac.read_genome_tsv Failed. SHARD_IDX=-1, RC=None, JOB_ID=None, RUN_START=2020-02-24T22:51:43.015Z, RUN_END=2020-02-24T22:51:45.800Z, STDOUT=/wynton/home/bruneau/ablair/test/atac/7ba75f43-9abb-4202-a8dc-56396251d317/call-read_genome_tsv/execution/stdout, STDERR=/wynton/home/bruneau/ablair/test/atac/7ba75f43-9abb-4202-a8dc-56396251d317/call-read_genome_tsv/execution/stderr STDERR_CONTENTS= time="2020-02-24T14:52:12-08:00" level=warning msg="\"/run/user/35073\" directory set by $XDG_RUNTIME_DIR does not exist. Either create the directory or unset $XDG_RUNTIME_DIR.: stat /run/user/35073: no such file or directory: Trying to pull image in the event that it is a public image." FATAL: Unable to handle docker://quay.io/encode-dcc/atac-seq-pipeline:v1.7.0 uri: failed to get SHA of docker://quay.io/encode-dcc/atac-seq-pipeline:v1.7.0: pinging docker registry returned: Get https://quay.io/v2/: proxyconnect tcp: dial tcp 172.19.0.250:80: i/o timeout

4. Print out of the atac.read_genome_tsv failure's stderr report at /wynton/home/bruneau/ablair/test/atac/7ba75f43-9abb-4202-a8dc-56396251d317/call-read_genome_tsv/execution

time="2020-02-24T14:52:12-08:00" level=warning msg="\"/run/user/35073\" directory set by $XDG_RUNTIME_DIR does not exist. Either create the directory or unset $XDG_RUNTIME_DIR.: stat /run/user/35073: no such file or directory: Trying to pull image in the event that it is a public image." FATAL: Unable to handle docker://quay.io/encode-dcc/atac-seq-pipeline:v1.7.0 uri: failed to get SHA of docker://quay.io/encode-dcc/atac-seq-pipeline:v1.7.0: pinging docker registry returned: Get https://quay.io/v2/: proxyconnect tcp: dial tcp 172.19.0.250:80: i/o timeout


I also tried submitting to one of our compute nodes. Here are the steps and error messages:

1. Submit the job, specifying the build to use Singularity.

echo "caper run /wynton/home/bruneau/ablair/atac-seq-pipeline/atac.wdl -i /wynton/home/bruneau/ablair/ENCSR356KRQ_subsampled_caper.json --singularity --no-build-singularity" | qsub -V -N test -l h_rt=01:00:00 -l mem_free=2G -l eth_speed=20

2. View the error report.

Traceback (most recent call last):
  File "/wynton/home/bruneau/ablair/.local/bin/caper", line 13, in main()
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper.py", line 1267, in main c.run()
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper.py", line 197, in run input_file = self.create_input_json_file(tmp_dir)
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper.py", line 681, in create_input_json_file uri_type=uri_type, uri_exts=self._deepcopy_ext)
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 811, in deepcopy uri_type=uri_type, uri_exts=uri_exts)
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 780, in deepcopy_json updated = recurse_dict(new_d, uri_type)
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 753, in recurse_dict d_parent_key=k, updated=updated)
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 762, in recurse_dict uri_type=uri_type, uri_exts=uri_exts)
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 814, in deepcopy uri_type=uri_type, uri_exts=uri_exts, delim='\t')
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 709, in deepcopy_tsv contents = self.get_file_contents()
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 485, in get_file_contents ['curl', '-L', '-f', self._uri])
  File "/wynton/home/bruneau/ablair/.local/lib/python3.6/site-packages/caper/caper_uri.py", line 988, in __curl_auto_auth rc, http_err, stderr))

Exception: cURL RC: 7, HTTP_ERR: 0, STDERR: % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- 0:02:06 --:--:-- 0curl: (7) Failed connect to 172.19.0.250:80; Connection timed out

Please let me know if you require any more information.

Thanks, Andrew

leepc12 commented 4 years ago

I think there are two failures.

1) Caper failed to fetch a docker image (to build a singularity image from it) from quay.io.

Unable to handle docker://quay.io/encode-dcc/atac-seq-pipeline:v1.7.0

Define a Singularity image on Docker Hub instead: --singularity docker://encodedcc/atac-seq-pipeline:v1.7.0 (see the example command below). BTW, does your cluster allow internet connections on compute/login nodes?

If this fails, then try the Conda method.
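
For example, applying this to the run command used earlier (re-using the WDL and input JSON paths from your report) would look something like:

$ caper run /wynton/home/bruneau/ablair/atac-seq-pipeline/atac.wdl -i /wynton/home/bruneau/ablair/ENCSR356KRQ_subsampled_caper.json --singularity docker://encodedcc/atac-seq-pipeline:v1.7.0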

2) "/run/user/35073" directory set by $XDG_RUNTIME_DIR does not exist. Either create the directory or unset $XDG_RUNTIME_DIR. What does this error message mean?
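
(The warning itself suggests a possible workaround, untested here: point $XDG_RUNTIME_DIR at a directory that exists, or unset it, before launching Caper.)

$ export XDG_RUNTIME_DIR=$(mktemp -d)
$ # or simply: unset XDG_RUNTIME_DIR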

fangpingmu commented 4 years ago

This is an issue with hard links in Cromwell's glob outputs.

A solution is provided in https://github.com/ENCODE-DCC/chip-seq-pipeline2/issues/91

I was able to resolve the problem by downloading the Cromwell source code and modifying the globLinkCommand to use soft links instead of hard links:

.getOrElse("( ln -sL GLOB_PATTERN GLOB_DIRECTORY 2> /dev/null ) || ( ln -s GLOB_PATTERN GLOB_DIRECTORY )")

then building a new Cromwell jar file as per their instructions. After updating ~/.caper/default.conf to use this newly built jar, the workflow proceeded as expected.
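
For reference, a minimal sketch of that default.conf change, assuming the rebuilt jar lives at a path of your choosing (the path below is hypothetical) and that Caper's cromwell= key (its --cromwell option) is used to point at it:

cromwell=/path/to/your/rebuilt-cromwell.jar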

I do recommend that Caper add an option to change hard links to soft links.

We have lost interest in Cromwell pipelines. We know that AWS and GCP require hard links. However, hard links do not work on many HPC POSIX file systems. Many people have reported this problem to the Cromwell developers, but the Cromwell community has never addressed these complaints.

leepc12 commented 4 years ago

@fangpingmu: I think Cromwell has a parameter glob-link-command to control it. https://github.com/broadinstitute/cromwell/pull/5250

I can add it to Caper's next release, or you can make a backend file to override Caper's built-in backends.

caper run/server --backend-file your.backend.conf

your.backend.conf should look like the following. Define it for any backend you want.

backend {
  providers {
    Local {
      config {
        glob-link-command = "ln -sL GLOB_PATTERN GLOB_DIRECTORY"
      }
    }
    sge {
      config {
        glob-link-command = "ln -sL GLOB_PATTERN GLOB_DIRECTORY"
      }
    }
    slurm {
      config {
        glob-link-command = "ln -sL GLOB_PATTERN GLOB_DIRECTORY"
      }
    }
  }
}
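
For example (a sketch re-using the run command from your earlier steps; your.backend.conf is whatever path you give the override file), the run would then be launched as something like:

$ caper run /wynton/home/bruneau/ablair/atac-seq-pipeline/atac.wdl -i /wynton/home/bruneau/ablair/ENCSR356KRQ_subsampled_caper.json --singularity --no-build-singularity --backend-file your.backend.conf
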
apblair commented 4 years ago

Thanks for the fast responses. Regarding each reply:

  1. I don't believe our compute node has an internet connection, but our dev node does, which is where I ran the local build. I'll reach out to our admin to confirm the compute node's internet access.

  2. I will also have to ask our admin regarding the $XDG_RUNTIME_DIR error. I can let you know what they say but I suspect this is a user group permission setting.

  3. How long does caper run/server --backend-file your.backend.conf usually take to run? I defined my backend conf for SGE and it's been running since ~4pm yesterday.

Thanks, Andrew

leepc12 commented 4 years ago
  1. Running caper init [YOUR_PLATFORM] on dev nodes will init Caper's conf file and also download Cromwell and Womtool locally so your pipeline can work offline later on.

  2. Okay.

  3. It depends on the size of your data and your cluster's node resource/availability. For big samples, it can take > 1 day on my cluster (Stanford Sherlock).