runabol / tork

A distributed workflow engine
https://tork.run
MIT License
599 stars, 40 forks

Custom Mounter for docker #370

Closed markdessain closed 6 months ago

markdessain commented 7 months ago

I was looking at creating a custom mounter for a docker worker.

I already have a volume created with docker volume:

docker volume ls
DRIVER    VOLUME NAME
local     abc_example_data
...

And I wanted to be able to attach that to a task:

...
mounts:
    - type: my_custom_mount
      source: abc_example_data
      target: /abc/data
...

But I came across this bit of code, which seems to check that the mount type is one of tork.MountTypeVolume, tork.MountTypeBind or tork.MountTypeTmpfs. Any other (custom) type raises an error: unknown mount type.

https://github.com/runabol/tork/blob/4319d73113230a79fb110c49abf1c87ff0324b17/runtime/docker/docker.go#L180-L208

Am I correct in saying that this switch logic would need to be updated to allow custom mounts for docker workers?

Thanks for Tork btw 😃, I really like the way it is possible to extend each component.

markdessain commented 7 months ago

My workaround for my current situation is to just use a bind mount and point it at the directory where docker volume has created the data on disk:

    mounts:
      - type: bind
        source: /var/lib/docker/volumes/abc_example_data/_data
        target: /abc/data
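
The same workaround, written as a small helper that derives the bind source from a volume name. This is only a sketch: volumeDataPath and defaultVolumeRoot are hypothetical names, and the path layout assumes the local volume driver on Linux with Docker's default data root. The Mountpoint field from docker volume inspect is the reliable source for the real path.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// defaultVolumeRoot is where the local driver keeps volumes when Docker
// uses its default data root on Linux (an assumption of this sketch).
const defaultVolumeRoot = "/var/lib/docker/volumes"

// volumeDataPath returns the host directory backing a named volume,
// for use as the source of a bind mount.
func volumeDataPath(name string) (string, error) {
	if name == "" || name == "." || name == ".." || strings.ContainsAny(name, "/\\") {
		return "", errors.New("invalid volume name")
	}
	return path.Join(defaultVolumeRoot, name, "_data"), nil
}

func main() {
	src, err := volumeDataPath("abc_example_data")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the mount entry in the same shape as the task definition above.
	fmt.Printf("mounts:\n  - type: bind\n    source: %s\n    target: /abc/data\n", src)
}
```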

runabol commented 7 months ago

Interesting. My initial instinct is that we would need to extend the Volume Mounter to support the use case of a pre-existing volume.

runabol commented 7 months ago

I'm guessing you don't want Tork to manage the volume's lifecycle for you in this case? I.e., create an ephemeral one for the duration of the task and then remove it (the default functionality of the volume mounter).
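
The difference between the two lifecycles can be sketched as follows. Everything here is hypothetical and self-contained: the Mount struct, the in-memory volumeStore standing in for Docker, and both mounter types are illustrations, not Tork's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// Mount is a simplified stand-in for a task mount (not Tork's type).
type Mount struct {
	Source string
	Target string
}

// volumeStore fakes the set of Docker volumes in memory.
type volumeStore struct {
	volumes map[string]bool
	nextID  int
}

// ephemeralMounter creates a fresh volume per task and removes it afterwards.
type ephemeralMounter struct{ store *volumeStore }

func (m ephemeralMounter) Mount(mnt *Mount) error {
	m.store.nextID++
	mnt.Source = fmt.Sprintf("ephemeral-%d", m.store.nextID)
	m.store.volumes[mnt.Source] = true
	return nil
}

func (m ephemeralMounter) Unmount(mnt *Mount) error {
	delete(m.store.volumes, mnt.Source)
	return nil
}

// existingMounter attaches a volume that already exists and never deletes it.
type existingMounter struct{ store *volumeStore }

func (m existingMounter) Mount(mnt *Mount) error {
	if !m.store.volumes[mnt.Source] {
		return errors.New("volume not found: " + mnt.Source)
	}
	return nil
}

// Unmount is a no-op: the volume outlives the task.
func (m existingMounter) Unmount(mnt *Mount) error { return nil }

func main() {
	store := &volumeStore{volumes: map[string]bool{"abc_example_data": true}}

	tmp := &Mount{Target: "/tmp/work"}
	e := ephemeralMounter{store}
	_ = e.Mount(tmp)
	_ = e.Unmount(tmp) // the ephemeral volume is gone again

	data := &Mount{Source: "abc_example_data", Target: "/abc/data"}
	x := existingMounter{store}
	_ = x.Mount(data)
	_ = x.Unmount(data) // the pre-existing volume is left in place

	fmt.Println(store.volumes)
}
```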

markdessain commented 7 months ago

I get your logic. Yes, in my case Tork wouldn't need to manage the lifecycle the way it does in your ephemeral examples, such as the image resizing one.

My setup could probably be more optimised, and I should probably have more permanent S3-like storage, but it's just me and I want to minimise the number of services I need to keep running. Prior to using Tork I had multiple jobs running at different intervals, which generated multiple Parquet files on local disk, plus a UI which loads the volumes in read-only mode to display a summary.


My use case aside, my main observation was that it's not possible to create a custom mounter for docker, as it will always error with:

https://github.com/runabol/tork/blob/4319d73113230a79fb110c49abf1c87ff0324b17/runtime/docker/docker.go#L198-L199

runabol commented 7 months ago

Not documented yet but definitely possible:

https://github.com/runabol/tork/blob/4319d73113230a79fb110c49abf1c87ff0324b17/engine/default.go#L36

Also see: https://www.tork.run/customize
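
For readers who land here before the docs are updated: the general pattern behind such a registration hook looks roughly like the sketch below. It uses locally defined, hypothetical types (Mount, Mounter, registry, recordingMounter) and does not reproduce Tork's actual API or signatures. The linked line in engine/default.go and the customize page are the authoritative sources.

```go
package main

import "fmt"

// Mount is a simplified stand-in for a task mount (not Tork's type).
type Mount struct {
	Type   string
	Source string
	Target string
}

// Mounter is a hypothetical interface for pluggable mount handling.
type Mounter interface {
	Mount(mnt *Mount) error
	Unmount(mnt *Mount) error
}

// registry dispatches a mount to the mounter registered for its type.
type registry struct {
	mounters map[string]Mounter
}

func newRegistry() *registry {
	return &registry{mounters: make(map[string]Mounter)}
}

// Register associates a mount type with a mounter.
func (r *registry) Register(mountType string, m Mounter) {
	r.mounters[mountType] = m
}

// Mount looks up the mounter by type and fails for unregistered types.
func (r *registry) Mount(mnt *Mount) error {
	m, ok := r.mounters[mnt.Type]
	if !ok {
		return fmt.Errorf("unknown mount type: %s", mnt.Type)
	}
	return m.Mount(mnt)
}

// recordingMounter is a toy custom mounter that remembers mounted targets.
type recordingMounter struct {
	mounted []string
}

func (m *recordingMounter) Mount(mnt *Mount) error {
	m.mounted = append(m.mounted, mnt.Target)
	return nil
}

func (m *recordingMounter) Unmount(mnt *Mount) error { return nil }

func main() {
	reg := newRegistry()
	custom := &recordingMounter{}
	reg.Register("my_custom_mount", custom)

	mnt := &Mount{Type: "my_custom_mount", Source: "abc_example_data", Target: "/abc/data"}
	fmt.Println(reg.Mount(mnt), custom.mounted)
	fmt.Println(reg.Mount(&Mount{Type: "other"}))
}
```

The point of the pattern is that the set of accepted mount types is open: a type is valid once something has registered a mounter for it, instead of being fixed by a switch.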