BD2KGenomics / toil-scripts

Toil workflows for common genomic pipelines
Apache License 2.0
33 stars, 18 forks

SE fastq quick fix version of rnaseq-cgl-pipeline fails #257

Closed. dyollluap closed this issue 8 years ago

dyollluap commented 8 years ago

The quick fix for single-end RNA-seq fails at the first job, before any processing starts.

This is the complete stacktrace:

`INFO:toil.lib.bioio:Logging set at level: INFO
INFO:toil.common:Using the mesos batch system
I0430 16:51:45.571316 3396 sched.cpp:164] Version: 0.25.0
I0430 16:51:45.573550 3407 sched.cpp:262] New master detected at master@172.31.10.74:5050
I0430 16:51:45.573676 3407 sched.cpp:272] No credentials provided. Attempting to register without authentication
I0430 16:51:45.574911 3406 sched.cpp:641] Framework registered with 34869b9a-b316-4edd-b12a-437cdb1b4b11-0001
INFO:toil.common:Written the environment for the jobs to the environment file
INFO:toil.job:Downloading entire JobStore
INFO:toil.job:0 jobs downloaded.
INFO:toil.leader:(Re)building internal scheduler state
INFO:toil.leader:Checked batch system has no running jobs and no updated jobs
INFO:toil.leader:Found 1 jobs to start and 0 jobs with successors to run
INFO:toil.leader:Starting the main loop
INFO:toil.batchSystems.mesos.batchSystem:Preparing to launch Mesos task 0 using offer 34869b9a-b316-4edd-b12a-437cdb1b4b11-O26...
INFO:toil.batchSystems.mesos.batchSystem:...launching Mesos task 0
WARNING:toil.leader:The jobWrapper seems to have left a log file, indicating failure: 25ca8025-786b-4bde-b47f-55012f76192f
WARNING:toil.leader:Reporting file: 25ca8025-786b-4bde-b47f-55012f76192f
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: ---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: INFO:toil.job:LOG-TO-MASTER: Parsing input Samples
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: Traceback (most recent call last):
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/worker.py", line 273, in main
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: fileStore=fileStore)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1294, in execute
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: returnValues = self.run(fileStore)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1400, in run
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/home/mesosbox/shared/toil-scripts/src/toil_scripts/rnaseq_cgl/rnaseq_cgl_pipeline.py", line 73, in parse_input_samples
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: if inputs.config:
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: AttributeError: 'dict' object has no attribute 'config'
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: Exiting the worker because of a failed jobWrapper on host ip-172-31-22-146
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: ERROR:toil.worker:Exiting the worker because of a failed jobWrapper on host ip-172-31-22-146
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: WARNING:toil.jobWrapper:Due to failure we are reducing the remaining retry count of job 25ca8025-786b-4bde-b47f-55012f76192f to 2
INFO:toil.batchSystems.mesos.batchSystem:Preparing to launch Mesos task 1 using offer 34869b9a-b316-4edd-b12a-437cdb1b4b11-O32...
INFO:toil.batchSystems.mesos.batchSystem:...launching Mesos task 1
WARNING:toil.leader:The jobWrapper seems to have left a log file, indicating failure: 25ca8025-786b-4bde-b47f-55012f76192f
WARNING:toil.leader:Reporting file: 25ca8025-786b-4bde-b47f-55012f76192f
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: ---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: INFO:toil.job:LOG-TO-MASTER: Parsing input Samples
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: Traceback (most recent call last):
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/worker.py", line 273, in main
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: fileStore=fileStore)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1294, in execute
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: returnValues = self.run(fileStore)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1400, in run
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/home/mesosbox/shared/toil-scripts/src/toil_scripts/rnaseq_cgl/rnaseq_cgl_pipeline.py", line 73, in parse_input_samples
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: if inputs.config:
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: AttributeError: 'dict' object has no attribute 'config'
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: Exiting the worker because of a failed jobWrapper on host ip-172-31-23-135
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: ERROR:toil.worker:Exiting the worker because of a failed jobWrapper on host ip-172-31-23-135
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: WARNING:toil.jobWrapper:Due to failure we are reducing the remaining retry count of job 25ca8025-786b-4bde-b47f-55012f76192f to 1
INFO:toil.batchSystems.mesos.batchSystem:Preparing to launch Mesos task 2 using offer 34869b9a-b316-4edd-b12a-437cdb1b4b11-O37...
INFO:toil.batchSystems.mesos.batchSystem:...launching Mesos task 2
WARNING:toil.leader:The jobWrapper seems to have left a log file, indicating failure: 25ca8025-786b-4bde-b47f-55012f76192f
WARNING:toil.leader:Reporting file: 25ca8025-786b-4bde-b47f-55012f76192f
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: ---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: INFO:toil.job:LOG-TO-MASTER: Parsing input Samples
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: Traceback (most recent call last):
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/worker.py", line 273, in main
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: fileStore=fileStore)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1294, in execute
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: returnValues = self.run(fileStore)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1400, in run
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: File "/home/mesosbox/shared/toil-scripts/src/toil_scripts/rnaseq_cgl/rnaseq_cgl_pipeline.py", line 73, in parse_input_samples
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: if inputs.config:
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: AttributeError: 'dict' object has no attribute 'config'
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: Exiting the worker because of a failed jobWrapper on host ip-172-31-23-135
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: ERROR:toil.worker:Exiting the worker because of a failed jobWrapper on host ip-172-31-23-135
WARNING:toil.leader:25ca8025-786b-4bde-b47f-55012f76192f: WARNING:toil.jobWrapper:Due to failure we are reducing the remaining retry count of job 25ca8025-786b-4bde-b47f-55012f76192f to 0
WARNING:toil.leader:Job: 25ca8025-786b-4bde-b47f-55012f76192f is completely failed
INFO:toil.leader:Only failed jobs and their dependents (1 total) are remaining, so exiting.
INFO:toil.leader:Finished the main loop
INFO:toil.leader:Waiting for stats and logging collator process to finish ...
INFO:toil.leader:... finished collating stats and logs. Took 0.104547023773 seconds
ERROR:toil.leader:Failed to unpickle root job return value
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/toil/leader.py", line 525, in mainLoop
    rootJobReturnValue = cPickle.load(fH)
EOFError
INFO:toil.batchSystems.mesos.batchSystem:Stopping Mesos driver
I0430 16:52:04.470610 3396 sched.cpp:1771] Asked to stop the driver
I0430 16:52:04.470681 3405 sched.cpp:1040] Stopping framework '34869b9a-b316-4edd-b12a-437cdb1b4b11-0001'
INFO:toil.batchSystems.mesos.batchSystem:Joining Mesos driver
INFO:toil.batchSystems.mesos.batchSystem:Joined Mesos driver
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/mesosbox/shared/toil-scripts/src/toil_scripts/rnaseq_cgl/rnaseq_cgl_pipeline.py", line 989, in <module>
    main()
  File "/home/mesosbox/shared/toil-scripts/src/toil_scripts/rnaseq_cgl/rnaseq_cgl_pipeline.py", line 985, in main
    Job.Runner.startToil(Job.wrapJobFn(parse_input_samples, inputs), args)
  File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 445, in startToil
    return mainLoop(config, batchSystem, jobStore, rootJob, jobCache=jobCache)
  File "/usr/local/lib/python2.7/dist-packages/toil/leader.py", line 528, in mainLoop
    raise FailedJobsException(jobStoreFileID, totalFailedJobs)
toil.leader.FailedJobsException: The job store 'e0fd2228-520f-4119-b463-331abc8cf5db' contains 1 failed jobs
Exception in thread Thread-14:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.7/dist-packages/bd2k/util/threading.py", line 38, in run
    super( ExceptionalThread, self ).run( )
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 758, in writer
    assert False
AssertionError`

Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored

hannes-ucsc commented 8 years ago

A conversation between @jvivian and @dyollluap on Slack brought to light that the cause of the exception was an illegally named file in S3. For future reference, please keep communication in the GitHub ticket, or summarize any out-of-band communication there.

In all new code, we need to make sure that asserts aren't used to validate user input. If the user made a mistake, an explicit exception needs to be raised.
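
To illustrate the point, here is a minimal sketch of validating a user-supplied file name with an explicit exception instead of an `assert`. The names `validate_file_name` and `InvalidFileNameError`, as well as the character rule itself, are hypothetical and made up for this example; they are not taken from Toil or toil-scripts, and the real naming constraints may differ.

```python
import re

# Hypothetical rule, for illustration only: letters, digits, '.', '_', '-' and '/'.
_ALLOWED_NAME = re.compile(r"^[A-Za-z0-9._/-]+$")


class InvalidFileNameError(ValueError):
    """Raised when a user-supplied file name breaks the (assumed) naming rule."""


def validate_file_name(name):
    """Return `name` unchanged if it is acceptable, otherwise raise.

    Unlike `assert`, this check still runs under `python -O` and produces
    a message that tells the user what to fix.
    """
    if not isinstance(name, str) or not _ALLOWED_NAME.match(name):
        raise InvalidFileNameError(
            "Invalid file name %r: only letters, digits, '.', '_', '-' "
            "and '/' are allowed" % (name,)
        )
    return name
```

A caller can then catch `InvalidFileNameError` (or plain `ValueError`) at the point where user input enters the pipeline and report the problem, rather than surfacing a bare `AssertionError` from a background thread.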