paris-saclay-cds / ramp-board

RAMP packages: database, backend, frontend, utilities
https://paris-saclay-cds.github.io/ramp-docs/
BSD 3-Clause "New" or "Revised" License

Add a remote worker #444

Open rth opened 4 years ago

rth commented 4 years ago

Currently there is a local CondaEnvWorker and a remote AWSWorker. It would be useful to have a generic RemoteWorker that's not tied to the EC2 API. The idea would be to make submissions to a remote server that's already running.

This could likely be done either:

  1. using dask distributed with 1 remote dask worker
  2. or possibly something like celery + rabbitmq for communication (though this would likely be more difficult to set up than the first option).
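
For illustration, here is a minimal sketch of what such a worker could look like. It is not the ramp-board BaseWorker API: the class `RemoteWorkerSketch`, its method names, the status strings and the `run_submission` helper are all hypothetical. The only thing it relies on is an executor with a `submit()` method that returns a future supporting `done()` and `result()`. A `dask.distributed.Client` connected to an already-running scheduler offers that surface; the demo uses a standard-library `ThreadPoolExecutor` as a local stand-in so the sketch runs without a cluster.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Optional


def run_submission(cmd: List[str]) -> Dict[str, Any]:
    """Executed on the worker side: run the command and capture its output."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


class RemoteWorkerSketch:
    """Hypothetical worker that delegates a submission to an executor.

    `executor` is anything with `submit(fn, *args)` returning a future
    that has `done()` and `result()` (assumption: a dask.distributed
    Client pointing at a remote scheduler would be passed here).
    """

    def __init__(self, executor: Any, cmd: List[str]) -> None:
        self.executor = executor
        self.cmd = cmd
        self.status = "initialized"  # illustrative status names
        self._future: Optional[Any] = None

    def launch_submission(self) -> None:
        # Hand the command to the remote side and return immediately.
        self._future = self.executor.submit(run_submission, self.cmd)
        self.status = "running"

    def is_finished(self) -> bool:
        # Non-blocking check that a dispatcher could poll.
        return self._future is not None and self._future.done()

    def collect_results(self) -> Dict[str, Any]:
        if self._future is None:
            raise RuntimeError("launch_submission() must be called first")
        result = self._future.result()  # blocks until the task is done
        self.status = "collected" if result["returncode"] == 0 else "error"
        return result


if __name__ == "__main__":
    # Local stand-in for a remote cluster with a single worker.
    with ThreadPoolExecutor(max_workers=1) as pool:
        worker = RemoteWorkerSketch(
            pool, [sys.executable, "-c", "print('trained')"]
        )
        worker.launch_submission()
        print(worker.collect_results()["stdout"].strip(), worker.status)
```

Keeping the executor as a constructor argument means the same worker code can be exercised against a local pool in tests and against a remote scheduler in production.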

agramfort commented 4 years ago

+1 for trying with dask

kegl commented 4 years ago

+1 also for dask. We had an early celery-based attempt that we ended up scrapping. It was hard to manage when something went wrong.

glemaitre commented 4 years ago

I think this is a great idea. We also wanted to try something with OpenStack at some point. I think it is only by implementing a dask (or other) worker that we can see how much abstraction we can have in RemoteWorker. BaseWorker already provides the protocol, but I don't know how much more RemoteWorker can do.

@rth do you have some insights?

rth commented 3 years ago

So there is an initial implementation with a dask.distributed worker in https://github.com/paris-saclay-cds/ramp-board/pull/452. It passes all worker and dispatcher tests with a local dask cluster and most tests with a remote dask cluster (assuming the file paths for the RAMP kit, data, etc. are the same on the remote server). A few issues still need to be ironed out, but it gives a general idea. The code structure is very similar to that of the local conda worker.
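
Since the remote setup assumes identical file paths on both machines, one option is to verify that assumption on the remote side before launching anything. The sketch below is not taken from the PR; `find_missing_paths` and `check_remote_paths` are hypothetical helpers. As above, `executor` is anything with a `submit()` method returning a future (for example a `dask.distributed.Client`), and the demo substitutes a local `ThreadPoolExecutor`.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def find_missing_paths(paths: Dict[str, str]) -> List[str]:
    """Executed on the worker side: names whose paths do not exist there."""
    return sorted(
        name for name, path in paths.items() if not os.path.exists(path)
    )


def check_remote_paths(executor: Any, paths: Dict[str, str]) -> None:
    """Fail early if the worker cannot see the kit or data directories."""
    missing = executor.submit(find_missing_paths, paths).result()
    if missing:
        raise FileNotFoundError(
            "Missing on the remote worker: " + ", ".join(missing)
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as kit_dir:
        # Hypothetical layout: one directory that exists, one that does not.
        paths = {
            "kit": kit_dir,
            "data": os.path.join(kit_dir, "does_not_exist"),
        }
        with ThreadPoolExecutor(max_workers=1) as pool:
            try:
                check_remote_paths(pool, paths)
            except FileNotFoundError as exc:
                print(exc)
```

Running the check through the executor, rather than locally, is the point: the paths are tested where the submission will actually run.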