galaxyproject / ephemeris

Library for managing Galaxy plugins - tools, index data, and workflows.
https://ephemeris.readthedocs.org/

Difficult to debug run_data_managers #107

Closed carrieganote closed 6 years ago

carrieganote commented 6 years ago

When running data managers through ephemeris, there is no way to see what went wrong unless you tell Galaxy to turn off job cleanup and go digging in the job working directory. This is very awkward to debug when it runs during a Docker build, say, right after the tool installs. Here is all one gets in the case of any error:

```
galaxy.jobs.output_checker DEBUG 2018-08-15 19:49:06,367 Tool produced standard error failing job - [Traceback (most recent call last):
  File "/shed_tools/toolshed.g2.bx.psu.edu/repos/trinity_ctat/ctat_genome_resource_libs_data_manager/ea7bc21cbb7a/ctat_genome_resource_libs_data_manager/data_manager/add_ctat_resource_lib.py", line 879, in <module> # this is the line that says main(), not helpful in this case..
]
galaxy.jobs DEBUG 2018-08-15 19:49:06,526 (2) setting dataset 2 state to ERROR
galaxy.jobs INFO 2018-08-15 19:49:06,636 Collecting metrics for Job 2
galaxy.jobs DEBUG 2018-08-15 19:49:06,664 job 2 ended (finish() executed in (162.449 ms))
galaxy.tools.error_reports DEBUG 2018-08-15 19:49:06,674 Bug report plugin <galaxy.tools.error_reports.plugins.sentry.SentryPlugin object at 0x7f1a1e525190> generated response None

==> /home/galaxy/logs/slurmd.log <==
[2018-08-15T19:49:06.299] [3] sending REQUEST_COMPLETE_BATCH_SCRIPT, error:0
[2018-08-15T19:49:06.306] [3] done with job
Job 2 finished with state error.
Not all jobs successful! aborting...
Traceback (most recent call last):
  File "/usr/local/bin/run-data-managers", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/ephemeris/run_data_managers.py", line 294, in main
    data_managers.run(log, args.ignore_errors, args.overwrite)
  File "/usr/local/lib/python2.7/dist-packages/ephemeris/run_data_managers.py", line 255, in run
    run_jobs(self.fetch_jobs, self.skipped_fetch_jobs)
  File "/usr/local/lib/python2.7/dist-packages/ephemeris/run_data_managers.py", line 248, in run_jobs
    raise Exception('Not all jobs successful! aborting...')
Exception: Not all jobs successful! aborting...
```

Is there a way to display the contents of the galaxy_#.e file, perhaps?

rhpvorderman commented 6 years ago

This is a great suggestion! Furthermore, I checked the BioBlend API, and this functionality is already present. I have run into the same problem myself, but I just reran jobs while watching the job working directory. Fetching the stderr through the API is much more elegant. I will start working on this.
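For reference, a minimal sketch of what this could look like with BioBlend: `JobsClient.show_job(job_id, full_details=True)` returns a job dict that includes `stderr` and `stdout` fields. The function names `summarize_failed_job` and `print_failed_job_errors`, the placeholder URL, and the API key are my own illustrative assumptions, not ephemeris code:

```python
def summarize_failed_job(job_details):
    """Build a readable error summary from a BioBlend job-details dict.

    Pure helper: only inspects the dict, so it needs no Galaxy connection.
    """
    lines = [
        "Job %s (tool %s) ended in state '%s'."
        % (
            job_details.get("id"),
            job_details.get("tool_id"),
            job_details.get("state"),
        )
    ]
    stderr = (job_details.get("stderr") or "").strip()
    if stderr:
        lines.append("--- stderr ---")
        lines.append(stderr)
    return "\n".join(lines)


def print_failed_job_errors(galaxy_url, api_key):
    """Fetch every errored job and print its stderr (requires bioblend).

    Hypothetical usage: print_failed_job_errors("http://localhost:8080", "KEY")
    """
    from bioblend.galaxy import GalaxyInstance  # lazy import; needs bioblend

    gi = GalaxyInstance(galaxy_url, key=api_key)
    for job in gi.jobs.get_jobs(state="error"):
        # full_details=True makes Galaxy include stdout/stderr in the response
        details = gi.jobs.show_job(job["id"], full_details=True)
        print(summarize_failed_job(details))
```

Something along these lines inside `run_jobs` would let ephemeris print the offending tool's stderr right before raising "Not all jobs successful!", instead of forcing the user to disable cleanup and dig through the job working directory.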