tasdomas closed 1 year ago
This code does not guarantee that the Deployment is going to be scheduled to the same node as the subsequent Job, effectively invalidating the use of access modes other than ReadWriteMany.
Major blunder: as @tasdomas pointed out in a separate conversation, the previous implementation had exactly the same issue. 🙈
Instead of separating data transfer from the main job, we should consider using a sidecar or, probably better yet, an init container as part of the main job, with the sole purpose of performing data transfer. What do you think?
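To make the init container idea concrete, here is a rough sketch of the Job manifest I have in mind, written as a plain Python dict so it can be dumped to YAML or JSON. The function name, the container names, the `/data` mount path and the `busybox` transfer image are all placeholders I made up for illustration, not what the code currently does.

```python
def job_with_transfer_init(name, image, command, claim_name, transfer_cmd):
    """Build a batch/v1 Job manifest whose pod copies data in an init container.

    All names below (data-transfer, main, /data) are hypothetical placeholders.
    """
    volume = {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
    mount = {"name": "data", "mountPath": "/data"}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    # Runs to completion before the main container starts,
                    # in the same pod and therefore on the same node.
                    "initContainers": [
                        {
                            "name": "data-transfer",
                            "image": "busybox",
                            "command": transfer_cmd,
                            "volumeMounts": [mount],
                        }
                    ],
                    "containers": [
                        {
                            "name": "main",
                            "image": image,
                            "command": command,
                            "volumeMounts": [mount],
                        }
                    ],
                    "volumes": [volume],
                }
            }
        },
    }
```

Because the init container and the main container share a pod, they always land on the same node and mount the same volume, so the scheduling problem described above should not arise for the copy-in step.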
Won't the init container be started for each job pod though?
Yes, although that's a feature rather than a bug. 😄 We can use the init containers for synchronization when parallelism is greater than 1: the first init container performs the actual data synchronization, and the others just wait until the data copy finishes.
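A minimal sketch of the "first one copies, the rest wait" logic, assuming each pod can tell which one it is (for example from the `JOB_COMPLETION_INDEX` environment variable that Kubernetes sets for Indexed Jobs) and that all pods see the same target volume. The marker file name and the `sync_data` helper are hypothetical.

```python
import shutil
import time
from pathlib import Path

# Hypothetical marker file that signals the copy has finished.
MARKER = ".transfer-complete"


def sync_data(index, source, target, poll_seconds=0.1, timeout=60.0):
    """Copy source into target if index is 0; otherwise wait for the marker."""
    marker = Path(target) / MARKER
    if index == 0:
        # The first init container performs the actual data transfer.
        shutil.copytree(source, target, dirs_exist_ok=True)
        marker.touch()
        return "copied"
    # Every other init container only polls until the copy is marked done.
    deadline = time.monotonic() + timeout
    while not marker.exists():
        if time.monotonic() > deadline:
            raise TimeoutError(f"no {MARKER} in {target} after {timeout}s")
        time.sleep(poll_seconds)
    return "waited"
```

In an init container this would be called with something like `sync_data(int(os.environ["JOB_COMPLETION_INDEX"]), src, dst)`. Note that the waiting pods only ever see the marker if they mount the same volume, which brings back the access mode question unless the volume is ReadWriteMany or the pods share a node.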
There's still an issue, though: copying the results back once the job finishes running still requires spinning up another Job or Deployment. 🤦🏼‍♂️
Instead of reusing (abusing) the original job, launch a separate deployment with a busybox pod and minimal requirements to facilitate data transfer.
This addresses #647 and #648