When merging and scattering arrays from the same step (like in the example workflow below) we got the error The given job is already a predecessor of this job
This error is specific to toil and does not happen when using cwltool
Below How The error and How to reproduce it
Version used
jfouret@compute-dedicated:/home/users/jfouret/toil_test$ toil --version
5.2.0a1
jfouret@compute-dedicated:/home/users/jfouret/toil_test$ which cwltool
/exports/./../home/users/jfouret/toil/venv/bin/cwltool
jfouret@compute-dedicated:/home/users/jfouret/toil_test$ which toil-cwl-runner
/exports/./../home/users/jfouret/toil/venv/bin/toil-cwl-runner
built from 4e7d12a65793c4a4c1c4e2b61109c19ad2756f02
jfouret@compute-dedicated:/home/users/jfouret/toil_test$ toil-cwl-runner --jobStore file:$PWD/jobstore --logLevel DEBUG --maxCores 16 --maxMemory 50G --strict-memory-limit --force-docker-pull --outdir outdir echo-string.cwl
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.statsAndLogging] Suppressing the following loggers: {'boto3', 'google', 'cachecontrol', 'websocket', 'humanfriendly', 'botocore', 'oauthlib', 'urllib3', 'requests_oauthlib', 'boto', 'prov', 'rdflib', 'kubernetes', 'bcdocs', 'dill', 'salad', 'docker', 'requests', 'galaxy'}
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.statsAndLogging] Root logger is at level 'DEBUG', 'toil' logger at level 'DEBUG'.
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] Total machine size: 16 cores
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] CPU quota: -1
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] Total machine size: 16 cores
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] CPU quota: -1
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.jobStores.fileJobStore] Path to job store directory is '/home/users/jfouret/toil_test/jobstore'.
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.jobStores.abstractJobStore] The workflow ID is: '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd'
[2021-01-06T13:00:04+0100] [MainThread] [I] [cwltool] Resolved 'echo-string.cwl' to 'file:///home/users/jfouret/toil_test/echo-string.cwl'
[2021-01-06T13:00:05+0100] [MainThread] [D] [cwltool] Parsed job order from command line: {
"id": "echo-string.cwl",
"arrayS": [
"hello",
"world"
]
}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Using the single machine batch system
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Created the workflow directory for this machine at /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Started daddy thread.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Started in normal mode.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] User script ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False) belongs to Toil. No need to auto-deploy it.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] No user script to auto-deploy.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Written the environment for the jobs to the environment file
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Caching all jobs in job store
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] 0 jobs downloaded.
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.job] Saving graph of 2 jobs, 2 new
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.job] Processing job 'ResolveIndirect' kind-ResolveIndirect/instance-k4qbud0u
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.job] Processing job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.realtimeLogger] Real-time logging disabled
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.toilState] Found job to run: kind-CWLWorkflow/instance-kefe7e_w, with command: True, with checkpoint: False, with services: False, with no next successors: False
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Found 1 jobs to start and 0 jobs with successors to run
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Checked batch system has no running jobs and no updated jobs
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.serviceManager] Initializing service manager
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Built the jobs list, currently have 1 jobs to update and 0 jobs issued
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Updating status of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with result status: 0
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Issuing the command: _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w with memory: 2147483648, cores: 1.0, disk: 2147483648
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Issued job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with job batch system ID: 0 and cores: 1, disk: 2.0 G, and memory: 2.0 G
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Launched job 0 as child 17706
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.statsAndLogging] Suppressing the following loggers: {'cachecontrol', 'prov', 'humanfriendly', 'bcdocs', 'urllib3', 'salad', 'websocket', 'dill', 'google', 'requests_oauthlib', 'requests', 'rdflib', 'boto3', 'galaxy', 'botocore', 'kubernetes', 'oauthlib', 'docker', 'boto'}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] Redirecting logging to /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/tmp6lq7py0f/worker_log.txt
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Deleting /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funchqj9mn3v
[2021-01-06T13:00:05+0100] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w.
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Child 17706 for job 0 succeeded
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Ran jobID: 0 with exit value: 1
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w continues to exist (i.e. has more to do)
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.leader] Log from job "kind-CWLWorkflow/instance-kefe7e_w" follows:
=========>
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funchqj9mn3v
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.worker] Parsed job description
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] Working on job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.worker] Got a command to run: _toil files/for-job/kind-CWLWorkflow/instance-kefe7e_w/cleanup/file-c5d21eb9af93442f86d7064cf9608268/stream /home/users/jfouret/toil/src toil.cwl.cwltoil False
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.job] Loading user module ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False).
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) from description 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Starting job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w).
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running job
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Actually running job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w) which wants 2147483648 of our 646134366208 bytes.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Max space: 646134366208
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total file size: 0
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total job disk requirement size: 2147483648
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total immutable reference size: 0
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Available space with job: 646134366208 bytes
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w used 0.00% (4.0 KB [4096B] used, 2.0 GB [2147483648B] requested) at the end of its run.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running own deferred functions
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Out of deferred functions!
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
Traceback (most recent call last):
File "/home/users/jfouret/toil/src/toil/worker.py", line 394, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/users/jfouret/toil/src/toil/job.py", line 2359, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/users/jfouret/toil/src/toil/job.py", line 2280, in _run
return self.run(fileStore)
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 1660, in run
promises[s].addFollowOn(wfjob)
File "/home/users/jfouret/toil/src/toil/job.py", line 1211, in addFollowOn
followOnJob._addPredecessor(self)
File "/home/users/jfouret/toil/src/toil/job.py", line 1806, in _addPredecessor
raise RuntimeError("The given job is already a predecessor of this job")
RuntimeError: The given job is already a predecessor of this job
[2021-01-06T13:00:05+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-dedicated
<=========
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with ID kind-CWLWorkflow/instance-kefe7e_w to 1
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Added job: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w to active jobs
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.leader] 0 jobs are running, 0 jobs are issued and waiting to run
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Built the jobs list, currently have 1 jobs to update and 0 jobs issued
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Updating status of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with result status: 1
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Issuing the command: _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w with memory: 2147483648, cores: 1.0, disk: 2147483648
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Issued job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with job batch system ID: 1 and cores: 1, disk: 2.0 G, and memory: 2.0 G
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Launched job 1 as child 17712
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.statsAndLogging] Suppressing the following loggers: {'rdflib', 'kubernetes', 'galaxy', 'urllib3', 'salad', 'websocket', 'oauthlib', 'boto', 'cachecontrol', 'bcdocs', 'google', 'humanfriendly', 'dill', 'boto3', 'docker', 'requests', 'botocore', 'prov', 'requests_oauthlib'}
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Redirecting logging to /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/tmp5rtu1skc/worker_log.txt
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Deleting /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcrd9m6h8u
[2021-01-06T13:00:06+0100] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w.
[2021-01-06T13:00:06+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Child 17712 for job 1 succeeded
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Ran jobID: 1 with exit value: 1
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w continues to exist (i.e. has more to do)
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] Log from job "kind-CWLWorkflow/instance-kefe7e_w" follows:
=========>
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcrd9m6h8u
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Parsed job description
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Working on job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Got a command to run: _toil files/for-job/kind-CWLWorkflow/instance-kefe7e_w/cleanup/file-c5d21eb9af93442f86d7064cf9608268/stream /home/users/jfouret/toil/src toil.cwl.cwltoil False
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.job] Loading user module ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False).
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) from description 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Starting job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w).
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running job
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Actually running job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w) which wants 2147483648 of our 646134366208 bytes.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Max space: 646134366208
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total file size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total job disk requirement size: 2147483648
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total immutable reference size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Available space with job: 646134366208 bytes
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w used 0.00% (4.0 KB [4096B] used, 2.0 GB [2147483648B] requested) at the end of its run.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running own deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Out of deferred functions!
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
Traceback (most recent call last):
File "/home/users/jfouret/toil/src/toil/worker.py", line 394, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/users/jfouret/toil/src/toil/job.py", line 2359, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/users/jfouret/toil/src/toil/job.py", line 2280, in _run
return self.run(fileStore)
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 1660, in run
promises[s].addFollowOn(wfjob)
File "/home/users/jfouret/toil/src/toil/job.py", line 1211, in addFollowOn
followOnJob._addPredecessor(self)
File "/home/users/jfouret/toil/src/toil/job.py", line 1806, in _addPredecessor
raise RuntimeError("The given job is already a predecessor of this job")
RuntimeError: The given job is already a predecessor of this job
[2021-01-06T13:00:06+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-dedicated
<=========
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with ID kind-CWLWorkflow/instance-kefe7e_w to 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Added job: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w to active jobs
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Built the jobs list, currently have 1 jobs to update and 0 jobs issued
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Updating status of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with result status: 1
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Found new failed successors: kind-ResolveIndirect/instance-k4qbud0u of job: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w is completely failed
[2021-01-06T13:00:08+0100] [MainThread] [D] [toil.leader] Finished the main loop: no jobs left to run.
[2021-01-06T13:00:08+0100] [MainThread] [D] [toil.serviceManager] Waiting for service manager thread to finish ...
[2021-01-06T13:00:09+0100] [Thread-2 ] [D] [toil.serviceManager] Received signal to quit starting services.
[2021-01-06T13:00:10+0100] [MainThread] [D] [toil.serviceManager] ... finished shutting down the service manager. Took 1.7502992153167725 seconds
[2021-01-06T13:00:10+0100] [MainThread] [D] [toil.statsAndLogging] Waiting for stats and logging collator thread to finish ...
[2021-01-06T13:00:10+0100] [MainThread] [D] [toil.statsAndLogging] ... finished collating stats and logs. Took 0.0006322860717773438 seconds
[2021-01-06T13:00:14+0100] [MainThread] [I] [toil.leader] Finished toil run with 1 failed jobs.
[2021-01-06T13:00:14+0100] [MainThread] [I] [toil.leader] Failed jobs at end of the run: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.common] Shutting down batch system ...
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Cleaning up deferred functions system
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcuzd9xmsm
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Deleting /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcuzd9xmsm
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Connecting to latest caching database cache-0.db for cleanup
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.common] ... finished shutting down the batch system in 0.003482818603515625 seconds.
Traceback (most recent call last):
File "/exports/./../home/users/jfouret/toil/venv/bin/toil-cwl-runner", line 33, in <module>
sys.exit(load_entry_point('toil', 'console_scripts', 'toil-cwl-runner')())
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 2250, in main
outobj = toil.start(wf1)
File "/home/users/jfouret/toil/src/toil/common.py", line 842, in start
return self._runMainLoop(rootJobDescription)
File "/home/users/jfouret/toil/src/toil/common.py", line 1135, in _runMainLoop
jobCache=self._jobCache).run()
File "/home/users/jfouret/toil/src/toil/leader.py", line 262, in run
raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException: The job store 'file:/home/users/jfouret/toil_test/jobstore' contains 1 failed jobs: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
Log from job "'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w" follows:
=========>
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcrd9m6h8u
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Parsed job description
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Working on job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Got a command to run: _toil files/for-job/kind-CWLWorkflow/instance-kefe7e_w/cleanup/file-c5d21eb9af93442f86d7064cf9608268/stream /home/users/jfouret/toil/src toil.cwl.cwltoil False
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.job] Loading user module ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False).
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) from description 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Starting job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w).
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running job
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Actually running job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w) which wants 2147483648 of our 646134366208 bytes.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Max space: 646134366208
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total file size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total job disk requirement size: 2147483648
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total immutable reference size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Available space with job: 646134366208 bytes
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w used 0.00% (4.0 KB [4096B] used, 2.0 GB [2147483648B] requested) at the end of its run.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running own deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Out of deferred functions!
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
Traceback (most recent call last):
File "/home/users/jfouret/toil/src/toil/worker.py", line 394, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/users/jfouret/toil/src/toil/job.py", line 2359, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/users/jfouret/toil/src/toil/job.py", line 2280, in _run
return self.run(fileStore)
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 1660, in run
promises[s].addFollowOn(wfjob)
File "/home/users/jfouret/toil/src/toil/job.py", line 1211, in addFollowOn
followOnJob._addPredecessor(self)
File "/home/users/jfouret/toil/src/toil/job.py", line 1806, in _addPredecessor
raise RuntimeError("The given job is already a predecessor of this job")
RuntimeError: The given job is already a predecessor of this job
[2021-01-06T13:00:06+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-dedicated
<=========
jfouret@compute-dedicated:/home/users/jfouret/toil_test$ cwltool echo-string.cwl
INFO /exports/./../home/users/jfouret/toil/venv/bin/cwltool 3.0.20201203173111
INFO Resolved 'echo-string.cwl' to 'file:///home/users/jfouret/toil_test/echo-string.cwl'
INFO [workflow ] start
INFO [workflow ] starting step hello
INFO [step hello] start
INFO [job hello] /tmp/vqzk_5ue$ echo \
hello > /tmp/vqzk_5ue/6ab573038c4ba4db9bd833536c9b2265f08a272e 2> /tmp/vqzk_5ue/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job hello] completed success
INFO [step hello] start
INFO [job hello_2] /tmp/lf9i0f73$ echo \
world > /tmp/lf9i0f73/6ab573038c4ba4db9bd833536c9b2265f08a272e 2> /tmp/lf9i0f73/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job hello_2] completed success
INFO [step hello] completed success
INFO [workflow ] starting step list
INFO [step list] start
INFO [job list] /tmp/7ngnbjl2$ ls \
-lh \
/tmp/qfcmenae/stgdf1b8826-cfe3-49c8-8131-5bcea19d8b8c/6ab573038c4ba4db9bd833536c9b2265f08a272e
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/qfcmenae/stgdf1b8826-cfe3-49c8-8131-5bcea19d8b8c/6ab573038c4ba4db9bd833536c9b2265f08a272e -> /tmp/vqzk_5ue/6ab573038c4ba4db9bd833536c9b2265f08a272e
INFO [job list] completed success
INFO [step list] start
INFO [job list_2] /tmp/hlv_pk4g$ ls \
-lh \
/tmp/rpxpvp3o/stg4ffd99b2-612d-4e50-b284-55382a2b45ff/6ab573038c4ba4db9bd833536c9b2265f08a272e
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/rpxpvp3o/stg4ffd99b2-612d-4e50-b284-55382a2b45ff/6ab573038c4ba4db9bd833536c9b2265f08a272e -> /tmp/lf9i0f73/6ab573038c4ba4db9bd833536c9b2265f08a272e
INFO [job list_2] completed success
INFO [step list] start
INFO [job list_3] /tmp/gg1rl9l_$ ls \
-lh \jfouret@compute-dedicated:/home/users/jfouret/toil_test$ toil-cwl-runner --jobStore file:$PWD/jobstore --logLevel DEBUG --maxCores 16 --maxMemory 50G --strict-memory-limit --force-docker-pull --outdir outdir echo-string.cwl
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.statsAndLogging] Suppressing the following loggers: {'boto3', 'google', 'cachecontrol', 'websocket', 'humanfriendly', 'botocore', 'oauthlib', 'urllib3', 'requests_oauthlib', 'boto', 'prov', 'rdflib', 'kubernetes', 'bcdocs', 'dill', 'salad', 'docker', 'requests', 'galaxy'}
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.statsAndLogging] Root logger is at level 'DEBUG', 'toil' logger at level 'DEBUG'.
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] Total machine size: 16 cores
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] CPU quota: -1
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] Total machine size: 16 cores
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.lib.threading] CPU quota: -1
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.jobStores.fileJobStore] Path to job store directory is '/home/users/jfouret/toil_test/jobstore'.
[2021-01-06T13:00:04+0100] [MainThread] [D] [toil.jobStores.abstractJobStore] The workflow ID is: '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd'
[2021-01-06T13:00:04+0100] [MainThread] [I] [cwltool] Resolved 'echo-string.cwl' to 'file:///home/users/jfouret/toil_test/echo-string.cwl'
[2021-01-06T13:00:05+0100] [MainThread] [D] [cwltool] Parsed job order from command line: {
"id": "echo-string.cwl",
"arrayS": [
"hello",
"world"
]
}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Using the single machine batch system
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Created the workflow directory for this machine at /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Started daddy thread.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Started in normal mode.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] User script ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False) belongs to Toil. No need to auto-deploy it.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] No user script to auto-deploy.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Written the environment for the jobs to the environment file
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Caching all jobs in job store
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] 0 jobs downloaded.
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.job] Saving graph of 2 jobs, 2 new
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.job] Processing job 'ResolveIndirect' kind-ResolveIndirect/instance-k4qbud0u
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.job] Processing job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.realtimeLogger] Real-time logging disabled
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.toilState] Found job to run: kind-CWLWorkflow/instance-kefe7e_w, with command: True, with checkpoint: False, with services: False, with no next successors: False
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Found 1 jobs to start and 0 jobs with successors to run
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Checked batch system has no running jobs and no updated jobs
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.serviceManager] Initializing service manager
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Built the jobs list, currently have 1 jobs to update and 0 jobs issued
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Updating status of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with result status: 0
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Issuing the command: _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w with memory: 2147483648, cores: 1.0, disk: 2147483648
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Issued job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with job batch system ID: 0 and cores: 1, disk: 2.0 G, and memory: 2.0 G
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Launched job 0 as child 17706
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.statsAndLogging] Suppressing the following loggers: {'cachecontrol', 'prov', 'humanfriendly', 'bcdocs', 'urllib3', 'salad', 'websocket', 'dill', 'google', 'requests_oauthlib', 'requests', 'rdflib', 'boto3', 'galaxy', 'botocore', 'kubernetes', 'oauthlib', 'docker', 'boto'}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] Redirecting logging to /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/tmp6lq7py0f/worker_log.txt
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Deleting /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funchqj9mn3v
[2021-01-06T13:00:05+0100] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w.
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Child 17706 for job 0 succeeded
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Ran jobID: 0 with exit value: 1
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w continues to exist (i.e. has more to do)
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.leader] Log from job "kind-CWLWorkflow/instance-kefe7e_w" follows:
=========>
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funchqj9mn3v
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.worker] Parsed job description
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] Working on job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.worker] Got a command to run: _toil files/for-job/kind-CWLWorkflow/instance-kefe7e_w/cleanup/file-c5d21eb9af93442f86d7064cf9608268/stream /home/users/jfouret/toil/src toil.cwl.cwltoil False
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.job] Loading user module ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False).
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) from description 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Starting job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w).
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running job
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Actually running job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w) which wants 2147483648 of our 646134366208 bytes.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Max space: 646134366208
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total file size: 0
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total job disk requirement size: 2147483648
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total immutable reference size: 0
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Available space with job: 646134366208 bytes
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w used 0.00% (4.0 KB [4096B] used, 2.0 GB [2147483648B] requested) at the end of its run.
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running own deferred functions
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Out of deferred functions!
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
Traceback (most recent call last):
File "/home/users/jfouret/toil/src/toil/worker.py", line 394, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/users/jfouret/toil/src/toil/job.py", line 2359, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/users/jfouret/toil/src/toil/job.py", line 2280, in _run
return self.run(fileStore)
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 1660, in run
promises[s].addFollowOn(wfjob)
File "/home/users/jfouret/toil/src/toil/job.py", line 1211, in addFollowOn
followOnJob._addPredecessor(self)
File "/home/users/jfouret/toil/src/toil/job.py", line 1806, in _addPredecessor
raise RuntimeError("The given job is already a predecessor of this job")
RuntimeError: The given job is already a predecessor of this job
[2021-01-06T13:00:05+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-dedicated
<=========
[2021-01-06T13:00:05+0100] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with ID kind-CWLWorkflow/instance-kefe7e_w to 1
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Added job: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w to active jobs
[2021-01-06T13:00:05+0100] [MainThread] [I] [toil.leader] 0 jobs are running, 0 jobs are issued and waiting to run
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Built the jobs list, currently have 1 jobs to update and 0 jobs issued
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Updating status of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with result status: 1
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Issuing the command: _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w with memory: 2147483648, cores: 1.0, disk: 2147483648
[2021-01-06T13:00:05+0100] [MainThread] [D] [toil.leader] Issued job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with job batch system ID: 1 and cores: 1, disk: 2.0 G, and memory: 2.0 G
[2021-01-06T13:00:05+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Launched job 1 as child 17712
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.statsAndLogging] Suppressing the following loggers: {'rdflib', 'kubernetes', 'galaxy', 'urllib3', 'salad', 'websocket', 'oauthlib', 'boto', 'cachecontrol', 'bcdocs', 'google', 'humanfriendly', 'dill', 'boto3', 'docker', 'requests', 'botocore', 'prov', 'requests_oauthlib'}
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Redirecting logging to /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/tmp5rtu1skc/worker_log.txt
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Deleting /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcrd9m6h8u
[2021-01-06T13:00:06+0100] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker CWLWorkflow file:/home/users/jfouret/toil_test/jobstore kind-CWLWorkflow/instance-kefe7e_w.
[2021-01-06T13:00:06+0100] [Thread-1 ] [D] [toil.batchSystems.singleMachine] Child 17712 for job 1 succeeded
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.batchSystems.singleMachine] Ran jobID: 1 with exit value: 1
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w continues to exist (i.e. has more to do)
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] Log from job "kind-CWLWorkflow/instance-kefe7e_w" follows:
=========>
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcrd9m6h8u
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Parsed job description
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Working on job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Got a command to run: _toil files/for-job/kind-CWLWorkflow/instance-kefe7e_w/cleanup/file-c5d21eb9af93442f86d7064cf9608268/stream /home/users/jfouret/toil/src toil.cwl.cwltoil False
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.job] Loading user module ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False).
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) from description 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Starting job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w).
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running job
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Actually running job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w) which wants 2147483648 of our 646134366208 bytes.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Max space: 646134366208
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total file size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total job disk requirement size: 2147483648
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total immutable reference size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Available space with job: 646134366208 bytes
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w used 0.00% (4.0 KB [4096B] used, 2.0 GB [2147483648B] requested) at the end of its run.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running own deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Out of deferred functions!
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
Traceback (most recent call last):
File "/home/users/jfouret/toil/src/toil/worker.py", line 394, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/users/jfouret/toil/src/toil/job.py", line 2359, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/users/jfouret/toil/src/toil/job.py", line 2280, in _run
return self.run(fileStore)
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 1660, in run
promises[s].addFollowOn(wfjob)
File "/home/users/jfouret/toil/src/toil/job.py", line 1211, in addFollowOn
followOnJob._addPredecessor(self)
File "/home/users/jfouret/toil/src/toil/job.py", line 1806, in _addPredecessor
raise RuntimeError("The given job is already a predecessor of this job")
RuntimeError: The given job is already a predecessor of this job
[2021-01-06T13:00:06+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-dedicated
<=========
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with ID kind-CWLWorkflow/instance-kefe7e_w to 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Added job: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w to active jobs
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Built the jobs list, currently have 1 jobs to update and 0 jobs issued
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Updating status of job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w with result status: 1
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.leader] Found new failed successors: kind-ResolveIndirect/instance-k4qbud0u of job: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [W] [toil.leader] Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w is completely failed
[2021-01-06T13:00:08+0100] [MainThread] [D] [toil.leader] Finished the main loop: no jobs left to run.
[2021-01-06T13:00:08+0100] [MainThread] [D] [toil.serviceManager] Waiting for service manager thread to finish ...
[2021-01-06T13:00:09+0100] [Thread-2 ] [D] [toil.serviceManager] Received signal to quit starting services.
[2021-01-06T13:00:10+0100] [MainThread] [D] [toil.serviceManager] ... finished shutting down the service manager. Took 1.7502992153167725 seconds
[2021-01-06T13:00:10+0100] [MainThread] [D] [toil.statsAndLogging] Waiting for stats and logging collator thread to finish ...
[2021-01-06T13:00:10+0100] [MainThread] [D] [toil.statsAndLogging] ... finished collating stats and logs. Took 0.0006322860717773438 seconds
[2021-01-06T13:00:14+0100] [MainThread] [I] [toil.leader] Finished toil run with 1 failed jobs.
[2021-01-06T13:00:14+0100] [MainThread] [I] [toil.leader] Failed jobs at end of the run: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.common] Shutting down batch system ...
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.common] Obtained node ID af5ec22510434ef4a5723e75099c2291 from file /var/lib/dbus/machine-id
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Cleaning up deferred functions system
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcuzd9xmsm
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.deferred] Deleting /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcuzd9xmsm
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Connecting to latest caching database cache-0.db for cleanup
[2021-01-06T13:00:14+0100] [MainThread] [D] [toil.common] ... finished shutting down the batch system in 0.003482818603515625 seconds.
Traceback (most recent call last):
File "/exports/./../home/users/jfouret/toil/venv/bin/toil-cwl-runner", line 33, in <module>
sys.exit(load_entry_point('toil', 'console_scripts', 'toil-cwl-runner')())
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 2250, in main
outobj = toil.start(wf1)
File "/home/users/jfouret/toil/src/toil/common.py", line 842, in start
return self._runMainLoop(rootJobDescription)
File "/home/users/jfouret/toil/src/toil/common.py", line 1135, in _runMainLoop
jobCache=self._jobCache).run()
File "/home/users/jfouret/toil/src/toil/leader.py", line 262, in run
raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException: The job store 'file:/home/users/jfouret/toil_test/jobstore' contains 1 failed jobs: 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
Log from job "'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w" follows:
=========>
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil] Running Toil version 5.2.0a1-4e7d12a65793c4a4c1c4e2b61109c19ad2756f02 on host compute-dedicated.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil] Configuration: {'workflowID': '41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd', 'workflowAttemptNumber': 0, 'jobStore': 'file:/home/users/jfouret/toil_test/jobstore', 'logLevel': 'DEBUG', 'workDir': None, 'noStdOutErr': False, 'stats': False, 'clean': 'onSuccess', 'cleanWorkDir': 'always', 'clusterStats': None, 'restart': False, 'batchSystem': 'single_machine', 'disableAutoDeployment': False, 'environment': {}, 'statePollingWait': 1, 'maxLocalJobs': 16, 'manualMemArgs': False, 'parasolCommand': 'parasol', 'parasolMaxBatches': 10000, 'scale': 1, 'linkImports': True, 'moveExports': True, 'mesosMasterAddress': '10.0.4.50:5050', 'kubernetesHostPath': None, 'provisioner': None, 'nodeTypes': [], 'minNodes': None, 'maxNodes': [10], 'targetTime': 1800, 'betaInertia': 0.1, 'scaleInterval': 60, 'preemptableCompensation': 0.0, 'nodeStorage': 50, 'nodeStorageOverrides': [], 'metrics': False, 'maxPreemptableServiceJobs': 9223372036854775807, 'maxServiceJobs': 9223372036854775807, 'deadlockWait': 60, 'deadlockCheckInterval': 30, 'defaultMemory': 2147483648, 'defaultCores': 1, 'defaultDisk': 2147483648, 'readGlobalFileMutableByDefault': False, 'defaultPreemptable': False, 'maxCores': 16, 'maxMemory': 53687091200, 'maxDisk': 9223372036854775807, 'retryCount': 1, 'enableUnlimitedPreemptableRetries': False, 'doubleMem': False, 'maxJobDuration': 9223372036854775807, 'rescueJobsFrequency': 3600, 'disableCaching': False, 'disableChaining': False, 'disableJobStoreChecksumVerification': False, 'maxLogFileSize': 64000, 'writeLogs': None, 'writeLogsGzip': None, 'writeLogsFromAllJobs': False, 'sseKey': None, 'servicePollingInterval': 60, 'useAsync': True, 'forceDockerAppliance': False, 'runCwlInternalJobsOnWorkers': False, 'statusWait': 3600, 'disableProgress': False, 'debugWorker': False, 'disableWorkerOutputCapture': False, 'badWorker': 0.0, 'badWorkerFailInterval': 0.01, 'cwl': False}
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running for file /tmp/node-41ac8c0b-29cc-420e-a7f6-7f1cdb8a2afd-af5ec22510434ef4a5723e75099c2291/deferred/funcrd9m6h8u
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Parsed job description
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Working on job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.worker] Got a command to run: _toil files/for-job/kind-CWLWorkflow/instance-kefe7e_w/cleanup/file-c5d21eb9af93442f86d7064cf9608268/stream /home/users/jfouret/toil/src toil.cwl.cwltoil False
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.job] Loading user module ModuleDescriptor(dirPath='/home/users/jfouret/toil/src', name='toil.cwl.cwltoil', fromVirtualEnv=False).
[2021-01-06T13:00:06+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) from description 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Starting job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w).
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running job
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Actually running job ('CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w) with ID (kind-CWLWorkflow/instance-kefe7e_w) which wants 2147483648 of our 646134366208 bytes.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Max space: 646134366208
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total file size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total job disk requirement size: 2147483648
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Total immutable reference size: 0
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.cachingFileStore] Available space with job: 646134366208 bytes
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.resource] Module dir is /home/users/jfouret/toil/src, our prefix is /home/users/jfouret/toil/venv, virtualenv: True
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job 'CWLWorkflow' kind-CWLWorkflow/instance-kefe7e_w used 0.00% (4.0 KB [4096B] used, 2.0 GB [2147483648B] requested) at the end of its run.
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running own deferred functions
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Out of deferred functions!
[2021-01-06T13:00:06+0100] [MainThread] [D] [toil.deferred] Running orphaned deferred functions
Traceback (most recent call last):
File "/home/users/jfouret/toil/src/toil/worker.py", line 394, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/users/jfouret/toil/src/toil/job.py", line 2359, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/users/jfouret/toil/src/toil/job.py", line 2280, in _run
return self.run(fileStore)
File "/home/users/jfouret/toil/src/toil/cwl/cwltoil.py", line 1660, in run
promises[s].addFollowOn(wfjob)
File "/home/users/jfouret/toil/src/toil/job.py", line 1211, in addFollowOn
followOnJob._addPredecessor(self)
File "/home/users/jfouret/toil/src/toil/job.py", line 1806, in _addPredecessor
raise RuntimeError("The given job is already a predecessor of this job")
RuntimeError: The given job is already a predecessor of this job
[2021-01-06T13:00:06+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-dedicated
<=========
cwltool output
jfouret@compute-dedicated:/home/users/jfouret/toil_test$ cwltool echo-string.cwl
INFO /exports/./../home/users/jfouret/toil/venv/bin/cwltool 3.0.20201203173111
INFO Resolved 'echo-string.cwl' to 'file:///home/users/jfouret/toil_test/echo-string.cwl'
INFO [workflow ] start
INFO [workflow ] starting step hello
INFO [step hello] start
INFO [job hello] /tmp/vqzk_5ue$ echo \
hello > /tmp/vqzk_5ue/6ab573038c4ba4db9bd833536c9b2265f08a272e 2> /tmp/vqzk_5ue/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job hello] completed success
INFO [step hello] start
INFO [job hello_2] /tmp/lf9i0f73$ echo \
world > /tmp/lf9i0f73/6ab573038c4ba4db9bd833536c9b2265f08a272e 2> /tmp/lf9i0f73/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job hello_2] completed success
INFO [step hello] completed success
INFO [workflow ] starting step list
INFO [step list] start
INFO [job list] /tmp/7ngnbjl2$ ls \
-lh \
/tmp/qfcmenae/stgdf1b8826-cfe3-49c8-8131-5bcea19d8b8c/6ab573038c4ba4db9bd833536c9b2265f08a272e
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/qfcmenae/stgdf1b8826-cfe3-49c8-8131-5bcea19d8b8c/6ab573038c4ba4db9bd833536c9b2265f08a272e -> /tmp/vqzk_5ue/6ab573038c4ba4db9bd833536c9b2265f08a272e
INFO [job list] completed success
INFO [step list] start
INFO [job list_2] /tmp/hlv_pk4g$ ls \
-lh \
/tmp/rpxpvp3o/stg4ffd99b2-612d-4e50-b284-55382a2b45ff/6ab573038c4ba4db9bd833536c9b2265f08a272e
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/rpxpvp3o/stg4ffd99b2-612d-4e50-b284-55382a2b45ff/6ab573038c4ba4db9bd833536c9b2265f08a272e -> /tmp/lf9i0f73/6ab573038c4ba4db9bd833536c9b2265f08a272e
INFO [job list_2] completed success
INFO [step list] start
INFO [job list_3] /tmp/gg1rl9l_$ ls \
-lh \
/tmp/7e1g8ri7/stgef503449-599e-4250-80f6-618e9e5acb78/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/7e1g8ri7/stgef503449-599e-4250-80f6-618e9e5acb78/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7 -> /tmp/vqzk_5ue/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job list_3] completed success
INFO [step list] start
INFO [job list_4] /tmp/cp5azbdt$ ls \
-lh \
/tmp/18e66dp4/stgff198105-01d6-48c5-8f28-1d477c9643c8/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/18e66dp4/stgff198105-01d6-48c5-8f28-1d477c9643c8/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7 -> /tmp/lf9i0f73/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job list_4] completed success
INFO [step list] completed success
INFO [workflow ] completed success
{}
INFO Final process status is success
/tmp/7e1g8ri7/stgef503449-599e-4250-80f6-618e9e5acb78/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/7e1g8ri7/stgef503449-599e-4250-80f6-618e9e5acb78/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7 -> /tmp/vqzk_5ue/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job list_3] completed success
INFO [step list] start
INFO [job list_4] /tmp/cp5azbdt$ ls \
-lh \
/tmp/18e66dp4/stgff198105-01d6-48c5-8f28-1d477c9643c8/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
lrwxrwxrwx 1 jfouret employee 54 Jan 6 13:01 /tmp/18e66dp4/stgff198105-01d6-48c5-8f28-1d477c9643c8/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7 -> /tmp/lf9i0f73/cc7e4becc0c2d09944d5ac1fb18c4ecf874ac2d7
INFO [job list_4] completed success
INFO [step list] completed success
INFO [workflow ] completed success
{}
INFO Final process status is success
┆Issue is synchronized with this Jira Task
┆Issue Number: TOIL-774
Hi,
When merging and scattering arrays from the same step (like in the example workflow below) we got the error
The given job is already a predecessor of this job
This error is specific to toil and does not happen when using cwltool
Below How The error and How to reproduce it
Version used
built from 4e7d12a65793c4a4c1c4e2b61109c19ad2756f02
Test workflow:
Error log
cwltool output
┆Issue is synchronized with this Jira Task ┆Issue Number: TOIL-774