jlewi / flaap

Federated Learning and Analytics Protocols
Apache License 2.0

Garbage collect tasks and delete values in remote workers #24

Open jlewi opened 2 years ago

jlewi commented 2 years ago

Right now there is no garbage collection of tasks or values stored in remote executors. We should support that.

RemoteExecutor handles this by having a finalizer on RemoteValue. When there are no more references to the value, a Dispose request is issued to the remote executor.

We should probably follow the finalizer pattern as implemented in Kubernetes. When a TaskValue is no longer referenced, we should mark the task as eligible for deletion. The finalizer should then prevent the task from actually being deleted until the worker has deleted its value.
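To make the finalizer idea concrete, here is a rough sketch of how it could work against an in-memory task store. Everything in it is an assumption for illustration rather than existing flaap code: the `Task` and `TaskStore` types, the `workerFinalizer` name, and the method names are all hypothetical. Marking a task for deletion only sets a flag; the task is removed once the worker has dropped its finalizer.

```go
package main

import (
	"errors"
	"sync"
)

// workerFinalizer is a hypothetical finalizer name, not taken from flaap.
const workerFinalizer = "example/worker-value"

// ErrNotFound is returned when a task ID is unknown to the store.
var ErrNotFound = errors.New("task not found")

// Task is a hypothetical stand-in for a stored task record.
type Task struct {
	ID                string
	Finalizers        []string
	DeletionRequested bool
}

// TaskStore is a minimal in-memory store used only for illustration.
type TaskStore struct {
	mu    sync.Mutex
	tasks map[string]*Task
}

func NewTaskStore() *TaskStore {
	return &TaskStore{tasks: map[string]*Task{}}
}

// Create adds a task with the worker finalizer already attached.
func (s *TaskStore) Create(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tasks[id] = &Task{ID: id, Finalizers: []string{workerFinalizer}}
}

// MarkForDeletion flags the task as eligible for deletion, e.g. when its
// TaskValue is no longer referenced. The task stays in the store while any
// finalizer remains.
func (s *TaskStore) MarkForDeletion(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	t, ok := s.tasks[id]
	if !ok {
		return ErrNotFound
	}
	t.DeletionRequested = true
	s.deleteIfReady(t)
	return nil
}

// RemoveFinalizer is what a worker would call after deleting its value.
func (s *TaskStore) RemoveFinalizer(id, name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	t, ok := s.tasks[id]
	if !ok {
		return ErrNotFound
	}
	kept := t.Finalizers[:0]
	for _, f := range t.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	t.Finalizers = kept
	s.deleteIfReady(t)
	return nil
}

// deleteIfReady removes the task once deletion was requested and no
// finalizers remain. Callers must hold s.mu.
func (s *TaskStore) deleteIfReady(t *Task) {
	if t.DeletionRequested && len(t.Finalizers) == 0 {
		delete(s.tasks, t.ID)
	}
}

// Exists reports whether the task is still present in the store.
func (s *TaskStore) Exists(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.tasks[id]
	return ok
}
```

With this shape, the coordinator would call `MarkForDeletion` when the last reference goes away, and the worker would call `RemoveFinalizer` after it has disposed of the value. A real implementation would presumably persist this state rather than keep it in memory.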

We will also need to handle the case where a worker never calls back in. We should probably have some sort of heartbeat mechanism for workers, and fail or clean up any tasks assigned to a worker that hasn't checked in within some amount of time.