bartongroup / slivka

http://bartongroup.github.io/slivka/
Apache License 2.0

Slow jobs submission via GridEngineRunner #56

Closed warownia1 closed 5 years ago

warownia1 commented 5 years ago

When a job is submitted to Univa Grid Engine, the submit function spawns a new qsub process and waits for its completion to fetch the job id. process.communicate() blocks the thread for a significant amount of time.
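For illustration, the blocking pattern described above might look roughly like the sketch below. It is not the code from slivka: `blocking_submit`, `submit_many`, `FAKE_QSUB` and `JOB_ID_PATTERN` are hypothetical names, a small Python one-liner stands in for the real qsub call, and the output is assumed to follow the usual Grid Engine form `Your job <id> ("<name>") has been submitted`.

```python
import re
import subprocess
import sys

# Hypothetical stand-in for qsub: prints a Grid Engine style confirmation line.
FAKE_QSUB = (
    sys.executable, "-c",
    "print('Your job 4242 (\"example.sh\") has been submitted')",
)

# Assumed output format; adjust if the cluster prints something different.
JOB_ID_PATTERN = re.compile(r"Your job (\d+)")


def blocking_submit(command=FAKE_QSUB):
    """Run the submit command, block until it exits, then parse the job id."""
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    # communicate() waits for the child process to finish; the calling
    # thread can do nothing else in the meantime.
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"Submission failed: {stderr.strip()}")
    match = JOB_ID_PATTERN.search(stdout)
    if match is None:
        raise RuntimeError(f"Could not find a job id in: {stdout!r}")
    return match.group(1)


def submit_many(num_jobs, command=FAKE_QSUB):
    # Submissions run one after another, so the total time grows
    # linearly with the number of jobs.
    return [blocking_submit(command) for _ in range(num_jobs)]
```

Because every call waits for its own child process, submitting N jobs costs about N times the latency of a single qsub call, which is the slowdown this issue is about.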

A possible solution is to build the scheduler on top of the gevent library and submit jobs asynchronously.

warownia1 commented 5 years ago

http://www.gevent.org/api/gevent.subprocess.html

warownia1 commented 5 years ago

A simple runner that submits jobs asynchronously:

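import asyncio

class Runner:
  def __init__(self):
    self.loop = asyncio.new_event_loop()

  async def _async_submit(self, semaphore):
    # The semaphore caps the number of submissions running at the same time.
    async with semaphore:
      # 'sleep 1' stands in for the actual qsub command.
      proc = await asyncio.create_subprocess_shell(
        'sleep 1',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
      )
      stdout, stderr = await proc.communicate()
      # process output
      # Check the exit code, not the pid. The error is returned rather than
      # raised, so a single failure does not abort the whole batch.
      return None if proc.returncode == 0 else RuntimeError("Failed to submit")

  async def _gather_submissions(self, num_jobs):
    # Created inside the running loop; the loop= argument was removed in Python 3.10.
    semaphore = asyncio.BoundedSemaphore(100)
    return await asyncio.gather(
      *[self._async_submit(semaphore) for _ in range(num_jobs)],
      return_exceptions=False
    )

  def batch_submit(self, num_jobs):
    # Blocks until every submission has finished; returns one result per job.
    return self.loop.run_until_complete(self._gather_submissions(num_jobs))

r = Runner()
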
warownia1 commented 5 years ago

Fixed by 10104c5cae5dbe80522040ae6a47ba977048c850