ThomasThelen opened 6 years ago
@hategan PTAL, this is relevant to the task you're currently working on. It'd be great if you two could brainstorm it together.
There is that model that takes everything that is potentially long-running and makes it asynchronous, and there's a nice task list somewhere: an icon with the active/failed/completed tasks, etc.
Do we have an infrastructure for this?
If you're referring to the image below (the job watcher), it can only be interfaced with through jobs, so any code in girder-wholetale isn't compatible with it (instead it uses the notification stream, shown at the bottom of this message).
I wrote the publishing code in girder-wholetale, but I'm in the process of porting it to gwvolman so that we can show its progress in the job watcher. As I'm doing this, I'm thinking about how registration is also in girder-wholetale; I'm considering moving it into gwvolman and running it as a job. That would give the user a central place to check the status of their tasks, and tale importing will probably use it too.
This would probably need to get an okay from the PI team, @mbjones might have some input on this.
By "do we have an infrastructure for this?" I mean do we have some plugin/library endowed with some reasonable user interface that can be used from any other plugin to wrap some long-running task?
I'm assuming "jobs" is a girder plugin. Can it be used as a dependency from girder-wholetale or does it require that the relevant code be moved to another plugin? If the latter, then it probably doesn't fit my idea of proper infrastructure.
@hategan we do. jobs is already a dependency for girder_wholetale. We're using that infra for building docker images and creating/destroying instances.
To be exact: jobs is rather abstract. There's a particular implementation using celery (the worker plugin) that ties into that abstract interface and implements the actual functionality.
To piggyback off of @Xarthisius, an example is in server/models/instance.py, where we

```python
from gwvolman.tasks import create_volume, launch_container
```

and then

```python
# Create a single job by chaining the two tasks
volumeTask = create_volume.signature(args=[payload])
serviceTask = launch_container.signature(queue='manager')
(volumeTask | serviceTask).apply_async()
```
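For anyone unfamiliar with the celery primitives here: `signature()` captures a task plus its arguments without running it, and `|` builds a chain where each task's return value is fed to the next. A pure-Python analogue of that chaining behavior (illustrative only, not celery itself):

```python
from functools import reduce

def chain(*funcs):
    """Compose functions left-to-right, feeding each result to the next,
    mimicking how a celery chain pipes task results downstream."""
    def run(initial):
        return reduce(lambda value, fn: fn(value), funcs, initial)
    return run

# Hypothetical stand-ins for create_volume and launch_container:
pipeline = chain(lambda payload: {'volume': payload},
                 lambda vol: ('launched', vol))
```

The real celery chain additionally serializes arguments and routes each task to a worker queue, which is exactly why the task code has to live somewhere the workers can import it.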
Sorry for dragging this out, but the part I don't understand then is "so any code in girder-wholetale isn't compatible with it".
@ThomasThelen "jobs" in that example are hidden. I think it's better to look at
https://github.com/whole-tale/girder_wholetale/blob/master/server/rest/image.py#L287-L307
I should have been more clear about what I meant by that. We can spawn the job from girder-wholetale, but the code that executes within the job must reside in the job plugin (in this case gwvolman, hence my porting the publishing stuff out). Unless I'm mistaken.
Technically it can be anywhere; it's just a matter of installing that code in a place where celery can access it. Here's an example of a Python package that's actually a combination of 1) a girder server extension, 2) a girder UI extension, and 3) a celery task:
https://github.com/kotfic/gwpca
If you think that makes more sense, we could talk about incorporating gwvolman into girder_ythub.
Are we planning to use the distributed aspect of celery or should we have a local, thread pool based implementation of jobs? I'm thinking the latter would allow us to pass objects/lambdas around, make it easier to keep code where it belongs, and, last but surely not least, make debugging significantly easier.
I don't think we were planning on it, simply because we didn't have a need for that. While I agree with the advantages you've mentioned, I'd rather avoid putting more computational burden on the girder side. It already needs to deal with data management, transfers, etc. (wt_data_manager is threaded already, right?)
Yes. It's I/O bound though.
Is celery running on a different machine?
Yup, it's running everywhere, except on the machine that has i/o storage and hosts girder.
That makes sense. I'm inclined to do a local implementation for the first iteration to limit the scope and switch to jobs once that works.
That sounds good to me. I did it that way too :)
Right now we're using the notification stream while registering data. This works fine, but for large datasets (https://search.dataone.org/view/doi:10.18739/A2NK36467) it's hard to see the progress. In addition, there isn't a way to check whether a dataset is currently being registered, which we might want (should we let a user launch/publish a tale while their data is still being imported?).
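One way to keep progress visible for large registrations is to throttle the updates so the stream isn't flooded. A sketch only: `send_notification` here is a hypothetical callback standing in for whatever the real notification-stream API exposes:

```python
import time

def register_items(items, send_notification, min_interval=1.0):
    """Register items, emitting at most one progress update per interval.

    `send_notification` is a hypothetical stand-in for the notification
    stream; it receives (current, total). Throttling keeps a huge dataset
    from flooding the stream while still showing steady progress.
    """
    total = len(items)
    last_sent = time.monotonic()
    for i, item in enumerate(items, start=1):
        # ... actual registration work for `item` would happen here ...
        now = time.monotonic()
        # Always emit the final update so the task visibly completes.
        if now - last_sent >= min_interval or i == total:
            send_notification(i, total)
            last_sent = now
```

A job-based implementation could report the same (current, total) pair through the job's progress field instead, which is what the job watcher UI renders.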
In addition, I think that Tale importing should also be a job. We're going to want to parse the EML metadata when it comes in and potentially create new items (with descriptions/names taken from the metadata), and we may also need to build a new image and recipe if the image doesn't exist.