ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

DbServerService and CactusCafPhase failed on AWS #419

Open hehuiying1125 opened 3 years ago

hehuiying1125 commented 3 years ago

Hi, I ran cactus on AWS using c4.8xlarge and r3.8xlarge instances:

    cactus --nodeTypes c4.8xlarge:0.7,r3.8xlarge --minNodes 0,0 --maxNodes 10,50 --provisioner aws --batchSystem mesos --metrics aws:us-east-1:agis-2021-jobstore seqFile.txt result.hal --root ac --realTimeLogging --logFile cactus.log --targetTime 1 --restart

At the beginning it worked well, but some DbServerService and CactusCafPhase jobs always failed, and then the whole workflow failed. The log was as follows:

Job failed with exit value 1: 'DbServerService' 559c71c6-bb47-44ff-98e9-9bff2b07a11e
The job seems to have left a log file, indicating failure: 'DbServerService' 559c71c6-bb47-44ff-98e9-9bff2b07a11e
Log from job 559c71c6-bb47-44ff-98e9-9bff2b07a11e follows:
=========>
[2021-01-30T13:23:45+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2021-01-30T13:23:45+0000] [MainThread] [I] [toil] Running Toil version 4.2.0-3aa1da130141039cb357efe36d7df9b9f6ae9b5b on host ip-172-31-38-137.us-east-2.compute.internal.
[2021-01-30T13:23:46+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set()
[2021-01-30T13:23:46+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: .
[2021-01-30T13:23:46+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'be8f8674-fd5a-41b3-91b8-d8e1b1ddb901', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8', 'netstat', '-tuplen'] [2021-01-30T13:23:46+0000] [MainThread] [I] [toil-rt] 2021-01-30 13:23:46.361832: Running the command: "docker run --interactive --net=host --log-driver=none -u 0:0 -v /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data --entrypoint /opt/cactus/wrapper.sh --name be8f8674-fd5a-41b3-91b8-d8e1b1ddb901 --rm quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8 netstat -tuplen" Running command catchsegv 'netstat' '-tuplen' [2021-01-30T13:23:47+0000] [MainThread] [I] [toil-rt] 2021-01-30 13:23:47.069766: Successfully ran: "docker run --interactive --net=host --log-driver=none -u 0:0 -v /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data --entrypoint /opt/cactus/wrapper.sh --name be8f8674-fd5a-41b3-91b8-d8e1b1ddb901 --rm quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8 netstat -tuplen" in 0.6987 seconds [2021-01-30T13:23:47+0000] [MainThread] [I] [cactus.shared.common] Work dirs: {'/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb/tadup9nat'} [2021-01-30T13:23:47+0000] 
[MainThread] [I] [toil-rt] 2021-01-30 13:23:47.086108: Running the command: "docker run --interactive --net=host --log-driver=none -u 0:0 -v /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data --entrypoint /opt/cactus/wrapper.sh -p 17459:17459 --name 0075c694-01b2-4799-a1e8-c229a0e9211b --rm quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8 ktserver -port 17459 -ls -tout 200000 -th 64 -bgs tadup9nat/snapshot -bgsc lzo -bgsi 1000000 -log tmps2xgk65x.tmp :#opts=ls#bnum=30m#msiz=50g#ktopts=p" WARNING: Published ports are discarded when using host network mode Running command catchsegv 'ktserver' '-port' '17459' '-ls' '-tout' '200000' '-th' '64' '-bgs' 'tadup9nat/snapshot' '-bgsc' 'lzo' '-bgsi' '1000000' '-log' 'tmps2xgk65x.tmp' ':#opts=ls#bnum=30m#msiz=50g#ktopts=p' [2021-01-30T13:23:48+0000] [MainThread] [I] [toil.lib.bioio] Ktserver running. [2021-01-30T13:23:48+0000] [MainThread] [I] [toil.lib.bioio] Ktserver running. [2021-01-30T13:23:48+0000] [MainThread] [I] [toil.lib.bioio] Ktserver running. [2021-01-30T13:23:48+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-01-30T13:23:48+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-01-30T13:23:48+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'caada4b1-65a3-4813-b0fa-44f596091dea', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8', 'ktremotemgr', 'get', '-port', '17459', '-host', '172.31.38.137', 'TERMINATE'] [2021-01-30T13:24:48+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-01-30T13:45:22+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-01-30T13:45:22+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '54b43861-0c6a-45ea-b267-4a4465027711', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8', 'ktremotemgr', 'get', '-port', '17459', '-host', '172.31.38.137', 'TERMINATE'] [2021-01-30T13:45:51+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-01-30T13:45:51+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-01-30T13:45:51+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '5b53b80c-c41e-4f85-8678-226b2f47fc79', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8', 'ktremotemgr', 'set', '-port', '17459', '-host', '172.31.38.137', 'TERMINATE', '1'] Running command catchsegv 'ktremotemgr' 'set' '-port' '17459' '-host' '172.31.38.137' 'TERMINATE' '1' [2021-01-30T13:46:24+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-01-30T13:46:24+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-01-30T13:46:24+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpj6e68m44/0ae72da4-21a2-4405-97b2-4dce169090cb:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '9989296f-4982-4e79-864d-9d2e5e3ebf7b', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:b50cdcdc102582359d5319857436a3ad6f46a9a8', 'ktremotemgr', 'get', '-port', '17459', '-host', '172.31.38.137', 'TERMINATE'] 1 [1]+ Interrupt eval "${options}" 0<&0 Traceback (most recent call last): File "/usr/local/lib/python3.6/dist-packages/toil/worker.py", line 368, in workerScript job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer) File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1424, in _runner returnValues = self._run(jobGraph, fileStore) File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1780, in _run returnValues = self.run(fileStore) File 
"/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1754, in run raise RuntimeError("Detected the error jobStoreID has been removed so exiting with an error") RuntimeError: Detected the error jobStoreID has been removed so exiting with an error [2021-01-30T13:46:42+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host ip-172-31-38-137.us-east-2.compute.internal <========= Due to failure we are reducing the remaining retry count of job 'DbServerService' 559c71c6-bb47-44ff-98e9-9bff2b07a11e with ID 559c71c6-bb47-44ff-98e9-9bff2b07a11e to 5

I restarted the workflow, but these jobs still failed. Over 8 days, cactus only ever used 2 r3.8xlarge instances and 1 c4.8xlarge instance, and the log always read "14 jobs are running, 0 jobs are issued and waiting to run".
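For a long log like this, it can help to tally which job types are reported as failed. The following is a minimal Python sketch, assuming the log contains lines of the form "Job failed with exit value N: 'JobName' <job id>" as seen above. The function name `count_failed_jobs` is hypothetical and is not part of Toil or cactus.

```python
import re
from collections import Counter

# Matches failure lines such as:
#   Job failed with exit value 1: 'DbServerService' 559c71c6-bb47-44ff-98e9-9bff2b07a11e
FAILED_JOB = re.compile(
    r"Job failed with exit value (\d+): '([^']+)' ([0-9a-f-]{36})"
)


def count_failed_jobs(log_text):
    """Return a Counter mapping job name -> number of reported failures."""
    return Counter(match.group(2) for match in FAILED_JOB.finditer(log_text))


# Sample text copied from the failure lines quoted in this thread.
sample = (
    "Job failed with exit value 1: 'DbServerService' "
    "559c71c6-bb47-44ff-98e9-9bff2b07a11e "
    "Job failed with exit value 1: 'CactusCafWrapper' "
    "80c275b3-4a02-4df2-bc01-20981d186351"
)

for job_name, failures in count_failed_jobs(sample).items():
    print(job_name, failures)
```

Running this over the full cactus.log would show whether the failures are concentrated in one job type or spread across several.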

I thought this might be due to low memory, so I restarted using both x1.32xlarge and r5ad.24xlarge instances:

    cactus --nodeTypes x1.32xlarge:0.7,r5ad.24xlarge --minNodes 0,0 --maxNodes 5,2 --provisioner aws --batchSystem mesos --metrics aws:us-east-1:agis-2021-jobstore seqFile.txt result.hal --root ac --realTimeLogging --logFile cactus.log --targetTime 1 --restart

The DbServerService and CactusCafPhase jobs started running, but soon I got the log message "Potentially deadlocked for 436 seconds. Waiting at most 3164 more seconds for any of 17 issued non-service jobs to schedule and start. Cluster may be too small".
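The numbers in that message can be pulled out to see how long the workflow will wait before giving up. This is a small Python sketch that assumes the message keeps exactly the wording quoted above; `parse_deadlock` is a hypothetical helper name, not a Toil function.

```python
import re

# Pattern for the deadlock warning quoted above.
DEADLOCK = re.compile(
    r"Potentially deadlocked for (\d+) seconds\. "
    r"Waiting at most (\d+) more seconds for any of (\d+) issued non-service jobs"
)


def parse_deadlock(line):
    """Extract elapsed/remaining seconds and the number of waiting jobs."""
    match = DEADLOCK.search(line)
    if match is None:
        return None
    elapsed, remaining, waiting_jobs = (int(value) for value in match.groups())
    return {
        "elapsed": elapsed,
        "remaining": remaining,
        "total_wait": elapsed + remaining,
        "waiting_jobs": waiting_jobs,
    }


message = (
    "Potentially deadlocked for 436 seconds. Waiting at most 3164 more seconds "
    "for any of 17 issued non-service jobs to schedule and start. "
    "Cluster may be too small"
)
info = parse_deadlock(message)
print(info["total_wait"])  # 436 + 3164 = 3600
```

Here the elapsed and remaining times add up to 3600 seconds, which suggests a one-hour window in which at least one of the 17 waiting non-service jobs has to start.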

What should I do to avoid the deadlock? Any suggestions would be appreciated.

glennhickey commented 3 years ago

It could be that you don't have enough memory on an r3.8xlarge. I can imagine this being the case for fairly diverse, 3G-sized genomes. If you have a huge tree and are seeing deadlocks, you can try using --maxServiceJobs to reduce the number of ktservers launched at any one time.
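As a rough illustration of where the option would go, here is a Python sketch that assembles the restart command from the first post with --maxServiceJobs appended. All flags other than --maxServiceJobs are copied from the original command. The limit of 2 is an arbitrary example value, and `build_cactus_command` is a hypothetical helper.

```python
import shlex


def build_cactus_command(max_service_jobs):
    """Assemble the restart command with a cap on concurrent service jobs."""
    argv = [
        "cactus",
        "--nodeTypes", "c4.8xlarge:0.7,r3.8xlarge",
        "--minNodes", "0,0",
        "--maxNodes", "10,50",
        "--provisioner", "aws",
        "--batchSystem", "mesos",
        "--metrics",
        "aws:us-east-1:agis-2021-jobstore",
        "seqFile.txt",
        "result.hal",
        "--root", "ac",
        "--realTimeLogging",
        "--logFile", "cactus.log",
        "--restart",
        # Assumption: 2 is only an example; pick a value that fits the cluster.
        "--maxServiceJobs", str(max_service_jobs),
    ]
    # Quote each argument so the result can be pasted into a shell.
    return " ".join(shlex.quote(arg) for arg in argv)


command = build_cactus_command(2)
print(command)
```

A lower limit means fewer ktservers hold memory at the same time, at the cost of less parallelism across the tree.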

hehuiying1125 commented 3 years ago

Hi, thank you for your suggestion. I tried replacing r3.8xlarge with x1e.32xlarge, which has 3904G of memory, to run cactus:

    cactus --nodeTypes c4.8xlarge:0.7,x1e.32xlarge --minNodes 0,0 --maxNodes 10,5 --provisioner aws --batchSystem mesos --metrics aws:us-east-1:agis-2021-jobstore seqFile.txt result.hal --root ac --realTimeLogging --logFile cactus.log --restart --maxServiceJobs 2

The "CactusCafPhase" and "DbServerService" jobs still failed. The logs were as follows:

Job 70 failed with exit status 1 and message '1' due to reason '{}' on executor '{'value': 'toil-3920'}' on agent '{'value': 'dcc38922-4aa1-4ded-a81e-208698236e14-S13'}'. Job failed with exit value 1: 'CactusCafWrapper' 80c275b3-4a02-4df2-bc01-20981d186351 The job seems to have left a log file, indicating failure: 'CactusCafWrapper' 80c275b3-4a02-4df2-bc01-20981d186351 Log from job 80c275b3-4a02-4df2-bc01-20981d186351 follows: =========> [2021-02-19T12:56:13+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG--- [2021-02-19T12:56:13+0000] [MainThread] [I] [toil] Running Toil version 4.2.0-3aa1da130141039cb357efe36d7df9b9f6ae9b5b on host ip-172-31-7-79.us-east-2.compute.internal. [2021-02-19T12:56:15+0000] [MainThread] [I] [toil.lib.bioio] Alignments file: /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpr89qu2a/74f2b280-e5a0-458c-ac1d-0bc3746b6509/tmpf91nnzrc.tmp [2021-02-19T12:56:18+0000] [MainThread] [I] [cactus.shared.common] Work dirs: {'/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpr89qu2a/74f2b280-e5a0-458c-ac1d-0bc3746b6509'} [2021-02-19T12:56:18+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpr89qu2a/74f2b280-e5a0-458c-ac1d-0bc3746b6509 [2021-02-19T12:56:18+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8df2e-424c815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpr89qu2a/74f2b280-e5a0-458c-ac1d-0bc3746b6509:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '56f961cc-ecf7-4819-b815-8d7709e28e31', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'cactus_caf', '--logLevel', 'INFO', '--alignments', 'tmpf91nnzrc.tmp', '--cactusDisk', '\n\t\t\t\n\t\t\n\t', 
'--secondaryAlignments', 'tmp7ys9akus.tmp', '--annealingRounds', '128', '--deannealingRounds', '2 8', '--trim', '0 0', '--lastzArguments', '--step=1 --ambiguous=iupac,100,100 --ydrop=3000', '--minimumTreeCoverage', '0.0', '--blockTrim', '5.0', '--minimumDegree', '2', '--minimumSequenceLengthForBlast', '30', '-minimumIngroupDegree', '1', '--minimumOutgroupDegree', '0', '--alignmentFilter', 'filterSecondariesByMultipleSpecies', '-maxAdjacencyComponentSizeRatio', '50.0', '--realign', '--realignArguments', '--gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99', '--phylogenyNumTrees', '30', '--phylogenyRootingMethod', 'bestRecon', '--phylogenyScoringMethod', 'reconCost', '--phylogenyBreakpointScalingFactor', '1.0', '- phylogenySkipSingleCopyBlocks', '--phylogenyMaxBaseDistance', '100', '--phylogenyMaxBlockDistance', '50', '--phylogenyTreeBuildingMethod', 'guidedNeighborJoining,splitDecomposition', '--phylogenyCostPerDupPerBase', '0.00', '--phylogenyCostPerLossPerBase', '0.02', '--referenceEventHeader', 'Anc249', '--phylogenyDoSplitsWithSupportHigherThanThisAllAtOnce', '0.44', '--numTreeBuildingThreads', '2', '--minimumBlockDegreeToCheckSupport', '10', '--minimumBlockHomologySupport', '0.05', '--removeRecoverableChains', 'unequalNumberOfIngroupCopies', '- minimumNumberOfSpecies', '1', '--maxRecoverableChainsIterations', '5', '--maxRecoverableChainLength', '500000', '--phylogenyHomologyUnitType', 'chain', '--phylogenyDistanceCorrectionMethod', 'jukesCantor', '--minLengthForChromosome', '1000000', '--proportionOfUnalignedBasesForNewChromosome', '0.8', '--maximumMedianSequenceLengthBetweenLinkedEnds', '1000'] [2021-02-19T12:56:18+0000] [MainThread] [I] [toil-rt] 2021-02-19 12:56:18.790998: Running the command: "docker run --interactive --net=host --log-driver=none -u 0:0 -v 
/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpr89qu2a_/74f2b280-e5a0-458c-ac1d-0bc3746b6509:/data --entrypoint /opt/cactus/wrapper.sh --name 56f961cc-ecf7-4819-b815-8d7709e28e31 --rm quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73 cactus_caf --logLevel INFO --alignments tmpf91nnzrc.tmp --cactusDisk

                    </st_kv_database_conf>
             --secondaryAlignments tmp7ys9akus.tmp --annealingRounds 128 --deannealingRounds 2 8 --trim 0 0 --lastzArguments --step=1 --ambiguous=iupac,100,100 --ydrop=3000 --minimumTreeCoverage 0.0 --blockTrim 5.0 --minimumDegree 2 --minimumSequenceLengthForBlast 30 --minimumIngroupDegree 1 --minimumOutgroupDegree 0 --alignmentFilter filterSecondariesByMultipleSpecies --maxAdjacencyComponentSizeRatio 50.0 --realign --realignArguments --gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99 --phylogenyNumTrees 30 --phylogenyRootingMethod bestRecon --phylogenyScoringMethod  <...> rectionMethod jukesCantor --minLengthForChromosome 1000000 --proportionOfUnalignedBasesForNewChromosome 0.8 --maximumMedianSequenceLengthBetweenLinkedEnds 1000" (features={'alignmentsSize': 79661390})
    Running command catchsegv 'cactus_caf' '--logLevel' 'INFO' '--alignments' 'tmpf91nnzrc.tmp' '--cactusDisk' '<st_kv_database_conf 

type="kyoto_tycoon">

                    </st_kv_database_conf>
            ' '--secondaryAlignments' 'tmp7ys9akus.tmp' '--annealingRounds' '128' '--deannealingRounds' '2 8' '--trim' '0 0' '--lastzArguments' '--step=1 --ambiguous=iupac,100,100 --ydrop=3000' '--minimumTreeCoverage' '0.0' '--blockTrim' '5.0' '--minimumDegree' '2' '--minimumSequenceLengthForBlast' '30' '--minimumIngroupDegree' '1' '--minimumOutgroupDegree' '0' '--alignmentFilter' 'filterSecondariesByMultipleSpecies' '--maxAdjacencyComponentSizeRatio' '50.0' '--realign' '--realignArguments' '--gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99' '--phylogenyNumTrees' '30' '--phylogenyRootingMethod' 'bestRecon' '--phylogenyScoringMethod' 'reconCost' '--phylogenyBreakpointScalingFactor' '1.0' '--phylogenySkipSingleCopyBlocks' '--phylogenyMaxBaseDistance' '100' '--phylogenyMaxBlockDistance' '50' '--phylogenyTreeBuildingMethod' 'guidedNeighborJoining,splitDecomposition' '--phylogenyCostPerDupPerBase' '0.00' '--phylogenyCostPerLossPerBase' '0.02' '--referenceEventHeader' 'Anc249' '--phylogenyDoSplitsWithSupportHigherThanThisAllAtOnce' '0.44' '--numTreeBuildingThreads' '2' '--minimumBlockDegreeToCheckSupport' '10' '--minimumBlockHomologySupport' '0.05' '--removeRecoverableChains' 'unequalNumberOfIngroupCopies' '--minimumNumberOfSpecies''1' '--maxRecoverableChainsIterations' '5' '--maxRecoverableChainLength' '500000' '--phylogenyHomologyUnitType' 'chain' '--phylogenyDistanceCorrectionMethod' 'jukesCantor' '--minLengthForChromosome' '1000000' '- proportionOfUnalignedBasesForNewChromosome' '0.8' '--maximumMedianSequenceLengthBetweenLinkedEnds' '1000'
    Flower disk name : <st_kv_database_conf type="kyoto_tycoon">
                            <kyoto_tycoon database_dir="fakepath" host="172.31.7.79" port="19215" />
                    </st_kv_database_conf>

    Set up the flower disk
    **cactus_caf: impl/pinchIterator.c:103: pairwiseAlignmentToPinch_getNext: Assertion `pA->xCoordinate == pA->pairwiseAlignment->end1' failed.**
    Aborted (core dumped)
    [2021-02-19T12:56:35+0000] [MainThread] [I] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Max memory used for job CactusCafWrapper (tool cactus_caf) on JSON features {"alignmentsSize": 79661390}: 0
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/toil/worker.py", line 368, in workerScript
        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/shared/common.py", line 1424, in _runner
        fileStore=fileStore, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1424, in _runner
        returnValues = self._run(jobGraph, fileStore)
      File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1361, in _run
        return self.run(fileStore)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/pipeline/cactus_workflow.py", line 856, in run
        constraints=constraints)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/pipeline/cactus_workflow.py", line 838, in runCactusCafInWorkflow
        maxRecoverableChainLength=self.getOptionalPhaseAttrib("maxRecoverableChainLength", int))
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/shared/common.py", line 435, in runCactusCaf
        features=features, job_name=jobName, fileStore=fileStore)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/shared/common.py", line 1357, in cactus_call
        raise RuntimeError("Command {} exited {}: {}".format(call, process.returncode, out))
    RuntimeError: Command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpr89qu2a/74f2b280-e5a0-458c-ac1d-0bc3746b6509:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '56f961cc-ecf7-4819-b815-8d7709e28e31', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'cactus_caf', '--logLevel', 'INFO', '--alignments', 'tmpf91nnzrc.tmp', '--cactusDisk', '\n\t\t\t\n\t\t\n\t', '--secondaryAlignments', 'tmp7ys9akus.tmp', '--annealingRounds', '128',
'--deannealingRounds', '2 8', '--trim', '0 0', '--lastzArguments', '--step=1 --ambiguous=iupac,100,100 --ydrop=3000', '--minimumTreeCoverage', '0.0', '--blockTrim', '5.0', '--minimumDegree', '2', '--minimumSequenceLengthForBlast', '30', '--minimumIngroupDegree', '1', '--minimumOutgroupDegree', '0', '--alignmentFilter', 'filterSecondariesByMultipleSpecies', '--maxAdjacencyComponentSizeRatio', '50.0', '--realign', '--realignArguments', '--gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99', '--phylogenyNumTrees', '30', '--phylogenyRootingMethod', 'bestRecon', '--phylogenyScoringMethod', 'reconCost', '--phylogenyBreakpointScalingFactor', '1.0', '--phylogenySkipSingleCopyBlocks', '--phylogenyMaxBaseDistance', '100', '--phylogenyMaxBlockDistance', '50', '--phylogenyTreeBuildingMethod', 'guidedNeighborJoining,splitDecomposition', '--phylogenyCostPerDupPerBase', '0.00', '--phylogenyCostPerLossPerBase', '0.02', '--referenceEventHeader', 'Anc249','-phylogenyDoSplitsWithSupportHigherThanThisAllAtOnce', '0.44', '--numTreeBuildingThreads', '2', '--minimumBlockDegreeToCheckSupport', '10', '--minimumBlockHomologySupport', '0.05', '--removeRecoverableChains', 'unequalNumberOfIngroupCopies', '--minimumNumberOfSpecies', '1', '--maxRecoverableChainsIterations', '5', '--maxRecoverableChainLength', '500000', '--phylogenyHomologyUnitType', 'chain', '--phylogenyDistanceCorrectionMethod', 'jukesCantor', '--minLengthForChromosome', '1000000', '--proportionOfUnalignedBasesForNewChromosome', '0.8', '--maximumMedianSequenceLengthBetweenLinkedEnds', '1000'] exited 1: stdout= [2021-02-19T12:56:35+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host ip-172-31-7-79.us-east-2.compute.internal <========= Due to failure we are reducing the remaining retry count of job 'CactusCafWrapper' 80c275b3-4a02-4df2-bc01-20981d186351 with ID 
80c275b3-4a02-4df2-bc01-20981d186351 to 5

Another failure log from CactusCafPhase was as follows:

Job failed with exit value 1: 'CactusCafWrapper' 08edb678-c797-428d-b914-3f3b440c2e3f The job seems to have left a log file, indicating failure: 'CactusCafWrapper' 08edb678-c797-428d-b914-3f3b440c2e3f Log from job 08edb678-c797-428d-b914-3f3b440c2e3f follows: =========> [2021-02-19T12:56:43+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG--- [2021-02-19T12:56:43+0000] [MainThread] [I] [toil] Running Toil version 4.2.0-3aa1da130141039cb357efe36d7df9b9f6ae9b5b on host ip-172-31-7-79.us-east-2.compute.internal. [2021-02-19T12:56:44+0000] [MainThread] [I] [toil.lib.bioio] Alignments file: /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmp1jzoqoxy/affe06c0-e81d-49fb-adbe-d6e35fd82787/tmpxrvd2tc7.tmp [2021-02-19T12:56:45+0000] [MainThread] [I] [cactus.shared.common] Work dirs: {'/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmp1jzoqoxy/affe06c0-e81d-49fb-adbe-d6e35fd82787'} [2021-02-19T12:56:45+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmp1jzoqoxy/affe06c0-e81d-49fb-adbe-d6e35fd82787 [2021-02-19T12:56:45+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmp1jzoqoxy/affe06c0-e81d-49fb-adbe-d6e35fd82787:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'a54d7df3-b0f9-4781-b08d-9731fa24fce6', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'cactus_caf', '--logLevel', ' INFO', '--alignments', 'tmpxrvd2tc7.tmp', '--cactusDisk', '\n\t\t\t\n\t\t\n\t', '--secondaryAlignments', 'tmp7ewd34u0.tmp', '--annealingRounds', '128', '--deannealingRounds', '2 8', '--trim', '0 0', '--lastzArguments', '--step=1 --ambiguous=iupac,100,100 
--ydrop=3000', '--minimumTreeCoverage', '0.0', '--blockTrim', '5.0', '--minimumDegree', '2', '--minimumSequenceLengthForBlast', '30', '--minimumIngroupDegree', '1', '--minimumOutgroupDegree', '0', '--alignmentFilter', 'filterSecondariesByMultipleSpecies', '-maxAdjacencyComponentSizeRatio', '50. 0', '--realign', '--realignArguments', '--gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99', '--phylogenyNumTrees', '30', '--phylogenyRootingMethod', 'bestRecon', '--phylogenyScoringMethod', 'reconCost', '--phylogenyBreakpointScalingFactor', '1.0', '--phylogenySkipSingleCopyBlocks', '--phylogenyMaxBaseDistance', '100', '--phylogenyMaxBlockDistance', '50', '--phylogenyTreeBuildingMethod', 'guidedNeighborJoining,splitDecomposition', '--phylogenyCostPerDupPerBase', '0.00', '--phylogenyCostPerLossPerBase', '0.02', '--referenceEventHeader', 'Anc202', '--phylogenyDoSplitsWithSupportHigherThanThisAllAtOnce', '0.44', '--numTreeBuildingThreads', '2', '--minimumBlockDegreeToCheckSupport', '10', '--minimumBlockHomologySupport', '0.05', '--removeRecoverableChains', 'unequalNumberOfIngroupCopies', '--minimumNumberOfSpecies', '1', '--maxRecoverableChainsIterations', '5', '--maxRecoverableChainLength', '500000', '--phylogenyHomologyUnitType', 'chain', '--phylogenyDistanceCorrec tionMethod', 'jukesCantor', '--minLengthForChromosome', '1000000', '--proportionOfUnalignedBasesForNewChromosome', '0.8', '--maximumMedianSequenceLengthBetweenLinkedEnds', '1000'] [2021-02-19T12:56:45+0000] [MainThread] [I] [toil-rt] 2021-02-19 12:56:45.532929: Running the command: "docker run --interactive --net=host --log-driver=none -u 0:0 -v /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmp1jzoqoxy/affe06c0-e81d-49fb-adbe-d6e35fd82787:/data --entrypoint /opt/cactus/wrapper.sh --name a54d7df3-b0f9-4781-b08d-9731fa24fce6 --rm 
quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73 cactus_caf --logLevel INFO --alignments tmpxrvd2tc7.tmp -- cactusDisk

                    </st_kv_database_conf>
             --secondaryAlignments tmp7ewd34u0.tmp --annealingRounds 128 --deannealingRounds 2 8 --trim 0 0 --lastzArguments --step=1 --ambiguous=iupac,100,100 --ydrop=3000 --minimumTreeCoverage 0.0 --blockTrim 5.0 --minimumDegree 2 --minimumSequenceLengthForBlast 30 --minimumIngroupDegree 1 --minimumOutgroupDegree 0 --alignmentFilter filterSecondariesByMultipleSpecies --maxAdjacencyComponentSizeRatio 50.0 --realign --realignArguments --gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99 --phylogenyNumTrees 30 --phylogenyRootingMethod bestRecon --phylogenyScoringMethod r <...> rectionMethod jukesCantor --minLengthForChromosome 1000000 --proportionOfUnalignedBasesForNewChromosome 0.8 --maximumMedianSequenceLengthBetweenLinkedEnds 1000" (features={'alignmentsSize': 59192340})
    Running command catchsegv 'cactus_caf' '--logLevel' 'INFO' '--alignments' 'tmpxrvd2tc7.tmp' '--cactusDisk' '

                    </st_kv_database_conf>
            ' '--secondaryAlignments' 'tmp7ewd34u0.tmp' '--annealingRounds' '128' '--deannealingRounds' '2 8' '--trim' '0 0' '--lastzArguments' '--step=1 --ambiguous=iupac,100,100 --ydrop=3000' '--minimumTreeCoverage' '0.0' '--blockTrim' '5.0' '--minimumDegree' '2' '--minimumSequenceLengthForBlast' '30' '--minimumIngroupDegree' '1' '--minimumOutgroupDegree' '0' '--alignmentFilter' 'filterSecondariesByMultipleSpecies' '--maxAdjacencyComponentSizeRatio' '50.0' '--realign' '--realignArguments' '--gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99' '--phylogenyNumTrees' '30' '--phylogenyRootingMethod' 'bestRecon' '--phylogenyScoringMethod' 'reconCost' '--phylogenyBreakpointScalingFactor' '1.0' '--phylogenySkipSingleCopyBlocks' '--phylogenyMaxBaseDistance' '100' '--phylogenyMaxBlockDistance' '50' '--phylogenyTreeBuildingMethod' 'guidedNeighborJoining,splitDecomposition' '--phylogenyCostPerDupPerBase' '0.00' '--phylogenyCostPerLossPerBase' '0.02' '--referenceEventHeader' 'Anc202' '--phylogenyDoSplitsWithSupportHigherThanThisAllAtOnce' '0.44' '--numTreeBuildingThreads' '2' '--minimumBlockDegreeToCheckSupport' '10' '--minimumBlockHomologySupport' '0.05' '--removeRecoverableChains' 'unequalNumberOfIngroupCopies' '--minimumNumberOfSpecies' '1' '--maxRecoverableChainsIterations' '5' '--maxRecoverableChainLength' '500000' '--phylogenyHomologyUnitType' 'chain' '--phylogenyDistanceCorrectionMethod' 'jukesCantor' '--minLengthForChromosome' '1000000' '--proportionOfUnalignedBasesForNewChromosome' '0.8' '--maximumMedianSequenceLengthBetweenLinkedEnds' '1000'
    Flower disk name :

                    </st_kv_database_conf>

    Set up the flower disk
    Segmentation fault (core dumped)
    Error: No such object: a54d7df3-b0f9-4781-b08d-9731fa24fce6
    [2021-02-19T12:56:55+0000] [MainThread] [I] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Max memory used for job CactusCafWrapper (tool cactus_caf) on JSON features {"alignmentsSize": 59192340}: 0
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/toil/worker.py", line 368, in workerScript
        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/shared/common.py", line 1424, in _runner
        fileStore=fileStore, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1424, in _runner
        returnValues = self._run(jobGraph, fileStore)
      File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1361, in _run
        return self.run(fileStore)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/pipeline/cactus_workflow.py", line 856, in run
        constraints=constraints)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/pipeline/cactus_workflow.py", line 838, in runCactusCafInWorkflow
        maxRecoverableChainLength=self.getOptionalPhaseAttrib("maxRecoverableChainLength", int))
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/shared/common.py", line 435, in runCactusCaf
        features=features, job_name=jobName, fileStore=fileStore)
      File "/tmp/tmpzxn0iz7z/15581c9638e8049f6add80795593f0f8/cactus/shared/common.py", line 1357, in cactus_call
        raise RuntimeError("Command {} exited {}: {}".format(call, process.returncode, out))
    RuntimeError: Command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmp1jzoqoxy/affe06c0-e81d-49fb-adbe-d6e35fd82787:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'a54d7df3-b0f9-4781-b08d-9731fa24fce6', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'cactus_caf', '--logLevel', 'INFO', '--alignments', 'tmpxrvd2tc7.tmp', '--cactusDisk', '\n\t\t\t\n\t\t\n\t', '--secondaryAlignments', 'tmp7ewd34u0.tmp', '--annealingRounds', '128', '--deannealingRounds', '2 8', '--trim', '0 0', '--lastzArguments', '--step=1 --ambiguous=iupac,100,100 --ydrop=3000', '--minimumTreeCoverage', '0.0', '--blockTrim', '5.0', '--minimumDegree', '2', '--minimumSequenceLengthForBlast', '30', '--minimumIngroupDegree', '1', '--minimumOutgroupDegree', '0', '--alignmentFilter', 'filterSecondariesByMultipleSpecies', '--maxAdjacencyComponentSizeRatio', '50.0', '--realign', '--realignArguments', '--gapGamma 0.0 --matchGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99', '--phylogenyNumTrees', '30', '--phylogenyRootingMethod', 'bestRecon', '--phylogenyScoringMethod', 'reconCost', '--phylogenyBreakpointScalingFactor', '1.0', '--phylogenySkipSingleCopyBlocks', '--phylogenyMaxBaseDistance', '100', '--phylogenyMaxBlockDistance', '50', '--phylogenyTreeBuildingMethod', 'guidedNeighborJoining,splitDecomposition', '--phylogenyCostPerDupPerBase', '0.00', '--phylogenyCostPerLossPerBase', '0.02', '--referenceEventHeader', 'Anc202', '--phylogenyDoSplitsWithSupportHigherThanThisAllAtOnce', '0.44', '--numTreeBuildingThreads', '2', '--minimumBlockDegreeToCheckSupport', '10', '--minimumBlockHomologySupport', '0.05', '--removeRecoverableChains', 'unequalNumberOfIngroupCopies', '--minimumNumberOfSpecies', '1', '--maxRecoverableChainsIterations', '5', '--maxRecoverableChainLength', '500000', '--phylogenyHomologyUnitType', 'chain', '--phylogenyDistanceCorrectionMethod', 'jukesCantor', '--minLengthForChromosome', '1000000', '--proportionOfUnalignedBasesForNewChromosome', '0.8', '--maximumMedianSequenceLengthBetweenLinkedEnds', '1000'] exited 1: stdout=*** Segmentation fault
    Register dump:

     RAX: 00000000005d3abe   RBX: 0000000000000000   RCX: 0000000000000001
     RDX: 0000000000000000   RSI: 00000000005d3abe   RDI: 0000000000000000
     RBP: 00000000005d3abe   R8 : 0000000000000001   R9 : 0000000000000000
     R10: 0000556e0015b010   R11: 0000000000000000   R12: 0000556e14769730
     R13: 0000556e14769770   R14: 0000556e06b6e380   R15: 0000000000000001
     RSP: 00007fff1dc74918

     RIP: 0000556dff3e8d70   EFLAGS: 00010293

     CS: 0033   FS: 0000   GS: 0000

     Trap: 0000000e   Error: 00000004   OldMask: 00001000   CR2: 00000018

     FPUCW: 0000037f   FPUSW: 00000000   TAG: 00000000
     RIP: 00000000   RDP: 00000000

     ST(0) 0000 0000000000000000   ST(1) 0000 0000000000000000
     ST(2) 0000 0000000000000000   ST(3) 0000 0000000000000000
     ST(4) 0000 0000000000000000   ST(5) 0000 0000000000000000
     ST(6) 0000 0000000000000000   ST(7) 0000 0000000000000000
     mxcsr: 1fa0
     XMM0:  000000000000000000000000ffffffd0 XMM1:  000000000000000000000000ffffffd0
     XMM2:  000000000000000000000000ffffffd0 XMM3:  000000000000000000000000ffffffd0
     XMM4:  000000000000000000000000ffffffd0 XMM5:  000000000000000000000000ffffffd0
     XMM6:  000000000000000000000000ffffffd0 XMM7:  000000000000000000000000ffffffd0
     XMM8:  000000000000000000000000ffffffd0 XMM9:  000000000000000000000000ffffffd0
     XMM10: 000000000000000000000000ffffffd0 XMM11: 000000000000000000000000ffffffd0
     XMM12: 000000000000000000000000ffffffd0 XMM13: 000000000000000000000000ffffffd0
     XMM14: 000000000000000000000000ffffffd0 XMM15: 000000000000000000000000ffffffd0

    Backtrace:
    cactus_caf(+0x1ad70)[0x556dff3e8d70]
    cactus_caf(+0x1b90f)[0x556dff3e990f]
    cactus_caf(+0x1b99a)[0x556dff3e999a]
    cactus_caf(+0x9b28)[0x556dff3d7b28]
    cactus_caf(+0x9c02)[0x556dff3d7c02]
    cactus_caf(+0x7a9e)[0x556dff3d5a9e]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f02f4704b97]
    cactus_caf(+0x8a6a)[0x556dff3d6a6a]

    Memory map:

    556dff3ce000-556dff471000 r-xp 00000000 09:00 207096248 /usr/local/bin/cactus_caf
    556dff670000-556dff671000 r--p 000a2000 09:00 207096248 /usr/local/bin/cactus_caf
    556dff671000-556dff672000 rw-p 000a3000 09:00 207096248 /usr/local/bin/cactus_caf
    556dff672000-556dff68b000 rw-p 00000000 00:00 0
    556e0015b000-556e15a94000 rw-p 00000000 00:00 0 [heap]
    7f02f3f26000-7f02f44d3000 rw-p 00000000 00:00 0
    7f02f44d3000-7f02f44e2000 r-xp 00000000 09:00 206574512 /lib/x86_64-linux-gnu/libbz2.so.1.0.4
    7f02f44e2000-7f02f46e1000 ---p 0000f000 09:00 206574512 /lib/x86_64-linux-gnu/libbz2.so.1.0.4
    7f02f46e1000-7f02f46e2000 r--p 0000e000 09:00 206574512 /lib/x86_64-linux-gnu/libbz2.so.1.0.4
    7f02f46e2000-7f02f46e3000 rw-p 0000f000 09:00 206574512 /lib/x86_64-linux-gnu/libbz2.so.1.0.4
    7f02f46e3000-7f02f48ca000 r-xp 00000000 09:00 206574513 /lib/x86_64-linux-gnu/libc-2.27.so
    7f02f48ca000-7f02f4aca000 ---p 001e7000 09:00 206574513 /lib/x86_64-linux-gnu/libc-2.27.so
    7f02f4aca000-7f02f4ace000 r--p 001e7000 09:00 206574513 /lib/x86_64-linux-gnu/libc-2.27.so
    7f02f4ace000-7f02f4ad0000 rw-p 001eb000 09:00 206574513 /lib/x86_64-linux-gnu/libc-2.27.so
    7f02f4ad0000-7f02f4ad4000 rw-p 00000000 00:00 0
    7f02f4ad4000-7f02f4aeb000 r-xp 00000000 09:00 206831959 /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f02f4aeb000-7f02f4cea000 ---p 00017000 09:00 206831959 /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f02f4cea000-7f02f4ceb000 r--p 00016000 09:00 206831959 /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f02f4ceb000-7f02f4cec000 rw-p 00017000 09:00 206831959 /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f02f4cec000-7f02f4cf8000 r-xp 00000000 09:00 206833843 /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
    7f02f4cf8000-7f02f4ef7000 ---p 0000c000 09:00 206833843 /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
    7f02f4ef7000-7f02f4ef8000 r--p 0000b000 09:00 206833843 /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
    7f02f4ef8000-7f02f4ef9000 rw-p 0000c000 09:00 206833843 /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
    7f02f4ef9000-7f02f4fff000 r-xp 00000000 09:00 206833885 /usr/lib/x86_64-linux-gnu/libtokyocabinet.so.9.11.0
    7f02f4fff000-7f02f51fe000 ---p 00106000 09:00 206833885 /usr/lib/x86_64-linux-gnu/libtokyocabinet.so.9.11.0
    7f02f51fe000-7f02f51ff000 r--p 00105000 09:00 206833885 /usr/lib/x86_64-linux-gnu/libtokyocabinet.so.9.11.0
    7f02f51ff000-7f02f5200000 rw-p 00106000 09:00 206833885 /usr/lib/x86_64-linux-gnu/libtokyocabinet.so.9.11.0
    7f02f5200000-7f02f5379000 r-xp 00000000 09:00 206833881 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
    7f02f5379000-7f02f5579000 ---p 00179000 09:00 206833881 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
    7f02f5579000-7f02f5583000 r--p 00179000 09:00 206833881 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
    7f02f5583000-7f02f5585000 rw-p 00183000 09:00 206833881 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
    7f02f5585000-7f02f5589000 rw-p 00000000 00:00 0
    7f02f5589000-7f02f5726000 r-xp 00000000 09:00 206574538 /lib/x86_64-linux-gnu/libm-2.27.so
    7f02f5726000-7f02f5925000 ---p 0019d000 09:00 206574538 /lib/x86_64-linux-gnu/libm-2.27.so
    7f02f5925000-7f02f5926000 r--p 0019c000 09:00 206574538 /lib/x86_64-linux-gnu/libm-2.27.so
    7f02f5926000-7f02f5927000 rw-p 0019d000 09:00 206574538 /lib/x86_64-linux-gnu/libm-2.27.so
    7f02f5927000-7f02f5941000 r-xp 00000000 09:00 206574574 /lib/x86_64-linux-gnu/libpthread-2.27.so
    7f02f5941000-7f02f5b40000 ---p 0001a000 09:00 206574574 /lib/x86_64-linux-gnu/libpthread-2.27.so
    7f02f5b40000-7f02f5b41000 r--p 00019000 09:00 206574574 /lib/x86_64-linux-gnu/libpthread-2.27.so
    7f02f5b41000-7f02f5b42000 rw-p 0001a000 09:00 206574574 /lib/x86_64-linux-gnu/libpthread-2.27.so
    7f02f5b42000-7f02f5b46000 rw-p 00000000 00:00 0
    7f02f5b46000-7f02f5b62000 r-xp 00000000 09:00 206574601 /lib/x86_64-linux-gnu/libz.so.1.2.11
    7f02f5b62000-7f02f5d61000 ---p 0001c000 09:00 206574601 /lib/x86_64-linux-gnu/libz.so.1.2.11
    7f02f5d61000-7f02f5d62000 r--p 0001b000 09:00 206574601 /lib/x86_64-linux-gnu/libz.so.1.2.11
    7f02f5d62000-7f02f5d63000 rw-p 0001c000 09:00 206574601 /lib/x86_64-linux-gnu/libz.so.1.2.11
    7f02f5d63000-7f02f5d67000 r-xp 00000000 09:00 206574499 /lib/x86_64-linux-gnu/libSegFault.so
    7f02f5d67000-7f02f5f66000 ---p 00004000 09:00 206574499 /lib/x86_64-linux-gnu/libSegFault.so
    7f02f5f66000-7f02f5f67000 r--p 00003000 09:00 206574499 /lib/x86_64-linux-gnu/libSegFault.so
    7f02f5f67000-7f02f5f68000 rw-p 00004000 09:00 206574499 /lib/x86_64-linux-gnu/libSegFault.so
    7f02f5f68000-7f02f5f8f000 r-xp 00000000 09:00 206574495 /lib/x86_64-linux-gnu/ld-2.27.so
    7f02f6183000-7f02f618a000 rw-p 00000000 00:00 0
    7f02f618d000-7f02f618f000 rw-p 00000000 00:00 0
    7f02f618f000-7f02f6190000 r--p 00027000 09:00 206574495 /lib/x86_64-linux-gnu/ld-2.27.so
    7f02f6190000-7f02f6191000 rw-p 00028000 09:00 206574495 /lib/x86_64-linux-gnu/ld-2.27.so
    7f02f6191000-7f02f6192000 rw-p 00000000 00:00 0
    7fff1dc55000-7fff1dc76000 rw-p 00000000 00:00 0 [stack]
    7fff1dd6e000-7fff1dd71000 r--p 00000000 00:00 0 [vvar]
    7fff1dd71000-7fff1dd72000 r-xp 00000000 00:00 0 [vdso]
    ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]

    [2021-02-19T12:56:55+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host ip-172-31-7-79.us-east-2.compute.internal
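
As a reading aid for the backtrace above: each frame is printed as `module(+offset)[address]`, and the offset should equal the absolute address minus the load base of `/usr/local/bin/cactus_caf` in the memory map (`556dff3ce000`). The Python sketch below checks that arithmetic, assuming this reading of the catchsegv output is right; the helper name `frame_offset` is made up for illustration. In the register dump, `Trap: 0000000e` is the x86 page-fault vector and `CR2: 00000018` is the faulting address, which is consistent with a near-null pointer dereference, though that alone does not identify the bug.

```python
# Load base of /usr/local/bin/cactus_caf, taken from the first line of the memory map.
CACTUS_CAF_BASE = 0x556DFF3CE000

# Absolute frame addresses copied from the backtrace (innermost frame first).
FRAME_ADDRESSES = [
    0x556DFF3E8D70,
    0x556DFF3E990F,
    0x556DFF3E999A,
    0x556DFF3D7B28,
    0x556DFF3D7C02,
    0x556DFF3D5A9E,
]


def frame_offset(address: int, base: int = CACTUS_CAF_BASE) -> int:
    """Return the offset of an absolute address inside the module loaded at base."""
    return address - base


if __name__ == "__main__":
    for address in FRAME_ADDRESSES:
        # Prints the same "+offset" form that appears in the backtrace lines.
        print(f"cactus_caf(+{frame_offset(address):#x})")
```

If an unstripped `cactus_caf` binary is available (not confirmed in this thread), these offsets could be passed to a tool such as `addr2line -e cactus_caf` to recover function names.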

The log of the failed DbServerService job was as follows:

Job ended: 'DbServerService' 5f4d5186-d5a8-4f43-b18a-6ea2a84de9d3 The job seems to have left a log file, indicating failure: 'DbServerService' 5f4d5186-d5a8-4f43-b18a-6ea2a84de9d3 Log from job 5f4d5186-d5a8-4f43-b18a-6ea2a84de9d3 follows: =========> [2021-02-19T12:48:15+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG--- [2021-02-19T12:48:15+0000] [MainThread] [I] [toil] Running Toil version 4.2.0-3aa1da130141039cb357efe36d7df9b9f6ae9b5b on host ip-172-31-7-79.us-east-2.compute.internal. [2021-02-19T12:48:16+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:48:16+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:48:16+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '56c76301-d76b-4487-859b-cec20d1649e6', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'netstat', '-tuplen'] [2021-02-19T12:48:16+0000] [MainThread] [I] [toil-rt] 2021-02-19 12:48:16.361200: Running the command: "docker run --interactive --net=host --log-driver=none -u 0:0 -v /var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data --entrypoint /opt/cactus/wrapper.sh --name 56c76301-d76b-4487-859b-cec20d1649e6 --rm quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73 netstat -tuplen" Running command catchsegv 'netstat' '-tuplen' [2021-02-19T12:48:16+0000] [MainThread] [I] [toil-rt] 2021-02-19 12:48:16.711988: Successfully ran: "docker run --interactive --net=host --log-driver=none -u 0:0 -v 
/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data --entrypoint /opt/cactus/wrapper.sh --name 56c76301-d76b-4487-859b-cec20d1649e6 --rm quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73 netstat -tuplen" in 0.3417 seconds [2021-02-19T12:48:16+0000] [MainThread] [I] [cactus.shared.common] Work dirs: {'/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f/tv8hoi7ab', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f'} [2021-02-19T12:48:16+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: /var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f [2021-02-19T12:48:16+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '-p', '8336:8336', '--name', '08887f41-94d2-442c-b3e8-94bc407fa678', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktserver', '-port', '8336', '-ls', '-tout', '200000', '-th', '64', '-bgs', 'tv8hoi7ab/snapshot', '-bgsc', 'lzo', '-bgsi', '1000000', '-log', 'tmp6p50znme.tmp', ':#opts=ls#bnum=30m#msiz=50g#ktopts=p'] [2021-02-19T12:48:16+0000] [MainThread] [I] [toil-rt] 2021-02-19 12:48:16.723893: Running the command: "docker run --interactive --net=host --log-driver=none -u 0:0 -v /var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data --entrypoint 
/opt/cactus/wrapper.sh -p 8336:8336 --name 08887f41-94d2-442c-b3e8-94bc407fa678 --rm quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73 ktserver -port 8336 -ls -tout 200000 -th 64 -bgs tv8hoi7ab/snapshot -bgsc lzo -bgsi 1000000 -log tmp6p50znme.tmp :#opts=ls#bnum=30m#msiz=50g#ktopts=p" WARNING: Published ports are discarded when using host network mode Running command catchsegv 'ktserver' '-port' '8336' '-ls' '-tout' '200000' '-th' '64' '-bgs' 'tv8hoi7ab/snapshot' '-bgsc' 'lzo' '-bgsi' '1000000' '-log' 'tmp6p50znme.tmp' ':#opts=ls#bnum=30m#msiz=50g#ktopts=p' [2021-02-19T12:48:17+0000] [MainThread] [I] [toil.lib.bioio] Ktserver running. [2021-02-19T12:48:17+0000] [MainThread] [I] [toil.lib.bioio] Ktserver running. [2021-02-19T12:48:17+0000] [MainThread] [I] [toil.lib.bioio] Ktserver running. [2021-02-19T12:48:17+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:48:17+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:48:17+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '901d177e-30e7-4db8-8cca-cd2b2f0b0dba', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:49:18+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:49:18+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-02-19T12:49:18+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'ce11b880-2c1e-4ac6-9505-c508dc366adf', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:50:18+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:50:18+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:50:18+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '1c08f8ca-ccd2-4d0b-9231-1ee1fda1db9e', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:51:19+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:51:19+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-02-19T12:51:19+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '98440a01-f352-4793-83e5-216eed5b3c75', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:52:23+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:52:23+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:52:23+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '064e45ca-95a6-4c07-865b-eacb812fb1ba', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:53:24+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:53:24+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-02-19T12:53:24+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net= host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'c17f2e5c-b953-49ed-a26f-741bb8e3a35e', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:54:26+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:54:26+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:54:26+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '34c68a86-9c02-4abb-9926-333ce4b80660', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:55:27+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:55:27+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-02-19T12:55:27+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name','1fdcd083-7a4a-4bfa-9dbf-646d375f8a71', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr','get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:56:28+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:56:28+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:56:28+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815bdb37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', 'ea3a008b-4d4b-439b-8f65-6328f9f325ef', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:57:28+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:57:28+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-02-19T12:57:28+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/ tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '9e9d4ceb-5a04-42e6-83b4-55470326544c', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE'] [2021-02-19T12:58:20+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:58:20+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . [2021-02-19T12:58:20+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '738344be-cf92-48b1-8137-7cc62f522247', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'set', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE', '1'] Running command catchsegv 'ktremotemgr' 'set' '-port' '8336' '-host' '172.31.7.79' 'TERMINATE' '1' [2021-02-19T12:58:29+0000] [MainThread] [I] [cactus.shared.common] Work dirs: set() [2021-02-19T12:58:29+0000] [MainThread] [I] [cactus.shared.common] Docker work dir: . 
[2021-02-19T12:58:29+0000] [MainThread] [I] [cactus.shared.common] Running the command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/node-2f6f2de8-df2e-424c-815b-db37a6a59679-4ad9c8e3b64c0936af8834905f342da0/tmpb19qp20n/e8fab894-943b-4a76-8fd7-79785044449f:/data', '--entrypoint', '/opt/cactus/wrapper.sh', '--name', '17bdc9be-59fe-4899-b72c-d55c79842edc', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:8c3a823684f45d44b9fa98420302fe724c652f73', 'ktremotemgr', 'get', '-port', '8336', '-host', '172.31.7.79', 'TERMINATE']
1
[1]+ Interrupt eval "${options}" 0<&0
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/toil/worker.py", line 368, in workerScript
    job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
  File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1424, in _runner
    returnValues = self._run(jobGraph, fileStore)
  File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1780, in _run
    returnValues = self.run(fileStore)
  File "/usr/local/lib/python3.6/dist-packages/toil/job.py", line 1754, in run
    raise RuntimeError("Detected the error jobStoreID has been removed so exiting with an error")
RuntimeError: Detected the error jobStoreID has been removed so exiting with an error
[2021-02-19T12:58:40+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host ip-172-31-7-79.us-east-2.compute.internal
<=========
Job 'DbServerService' 5f4d5186-d5a8-4f43-b18a-6ea2a84de9d3 with ID 5f4d5186-d5a8-4f43-b18a-6ea2a84de9d3 is completely failed

What else can I do to make the run complete successfully? Any suggestions would be appreciated.

glennhickey commented 3 years ago

I think the relevant part of the log is

Set up the flower disk
    **cactus_caf: impl/pinchIterator.c:103: pairwiseAlignmentToPinch_getNext: Assertion `pA->xCoordinate == pA->pairwiseAlignment->end1' failed.**
    Aborted (core dumped)
    [2021-02-19T12:56:35+0000] [MainThread] [I] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Max memory used for job CactusCafW

Which, unfortunately, looks to me like a bug, either in cactus_caf or in the lastz phase that produced its input. I guess the question is how you salvage the alignment you have so far, and I'm not sure of the answer. Will ping @joelarmstrong and @diekhans, who have more experience than I do with wrangling massive jobs.
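
For long Toil logs like the ones in this issue, lines such as the assertion failure can be located mechanically. The Python sketch below is only an illustration: the patterns are taken from messages that appear in this thread (`Assertion ... failed`, `Segmentation fault`, `Aborted (core dumped)`), and the function name `find_failure_lines` is hypothetical, not part of Toil or Cactus.

```python
import re
from typing import Iterable, List, Tuple

# Patterns taken from messages seen in this thread; extend as needed.
FAILURE_PATTERN = re.compile(
    r"Assertion `.*' failed|Segmentation fault|Aborted \(core dumped\)"
)


def find_failure_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, stripped text) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if FAILURE_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "Set up the flower disk",
        "cactus_caf: impl/pinchIterator.c:103: pairwiseAlignmentToPinch_getNext: "
        "Assertion `pA->xCoordinate == pA->pairwiseAlignment->end1' failed.",
        "Aborted (core dumped)",
    ]
    for number, text in find_failure_lines(sample):
        print(number, text)
```

To scan a real log, pass an open file handle instead of the sample list, for example the file given to `--logFile` in the original command.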

hehuiying1125 commented 3 years ago

Thank you very much for pointing out the relevant part of the log, Glenn. Yes, I would like to salvage the alignment I have so far. I look forward to suggestions from @joelarmstrong and @diekhans.

diekhans commented 3 years ago

Unless you specified intermediateResultsUrl, it is extremely difficult to restart a failed run. It requires a lot of knowledge of how Toil and Cactus work to do it.
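
For future runs, here is a rough sketch of how that option might be added to the command from the top of this issue. It assumes the option is spelled `--intermediateResultsUrl` on the `cactus` command line and takes a URL prefix; the bucket path `s3://my-bucket/cactus-intermediate` is a placeholder and the helper `build_cactus_command` is invented for illustration. Check `cactus --help` for the installed version before relying on it.

```python
import shlex
from typing import List


def build_cactus_command(job_store: str, seq_file: str, out_hal: str,
                         intermediate_url: str) -> List[str]:
    """Assemble a cactus argument list that also saves intermediate results."""
    return [
        "cactus",
        job_store,
        seq_file,
        out_hal,
        "--provisioner", "aws",
        "--batchSystem", "mesos",
        # Assumed spelling of the option named in the comment above.
        "--intermediateResultsUrl", intermediate_url,
    ]


if __name__ == "__main__":
    command = build_cactus_command(
        "aws:us-east-1:agis-2021-jobstore",
        "seqFile.txt",
        "result.hal",
        "s3://my-bucket/cactus-intermediate",  # placeholder, not a real bucket
    )
    # shlex.join (Python 3.8+) quotes each argument for safe pasting into a shell.
    print(shlex.join(command))
```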

hehuiying1125 commented 3 years ago

Hi, thank you for your reply. I am willing to specify intermediateResultsUrl. Would you mind sharing your email address with me? If you don't mind, I will email you the intermediateResultsUrl.

diekhans commented 3 years ago

markd@ucsc.edu;
