RobbeSneyders opened this issue 1 year ago
Is this more of an issue for the remote runners (Vertex, KFP) than for the local one?
I think that for the local runner, pulling and running images takes a minimal amount of time because of caching (once the large image has been downloaded the first time).
For the remote one:
We also saw that pulling an image locally can take a long time the first time, especially if you need to pull every image in a pipeline. Some images are >3GB.
Can we build two versions of the CUDA images: one with CUDA support and one without? We could use the non-CUDA images for local testing, and pull the image with CUDA support when a GPU is available.
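One way to produce both variants from a single Dockerfile is a build argument for the base image. This is only a sketch; the image names below (`python:3.10-slim`, `nvidia/cuda:...`) are illustrative placeholders, not the actual component bases:

```dockerfile
# Select the base at build time; defaults to a slim CPU-only image.
# Both base image names are illustrative placeholders.
ARG BASE_IMAGE=python:3.10-slim
FROM ${BASE_IMAGE}

# The shared component setup stays identical for both variants.
RUN pip install fondant
COPY src/ /app/src
```

Building twice with different `--build-arg BASE_IMAGE=...` values then yields a CPU tag and a CUDA tag from the same Dockerfile, e.g. `docker build --build-arg BASE_IMAGE=nvidia/cuda:11.8.0-runtime-ubuntu22.04 -t component:cuda .`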
@RobbeSneyders isn't it the case that all the images share the same base image (cuda/pytorch)? In that case, the large base image only has to be pulled once, and all the layers on top of it (additional dependencies) should be lightweight.
@mrchtr that could work, but it has to be decided at compile time; we could base it on the GPU config in the component Op. However, I'm not sure that's the problem we're trying to solve (lightweight testing). I think we also want fast pulling when running with a GPU.
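Sketching the compile-time selection: a minimal, hypothetical helper that picks the image variant from the GPU setting in the component's Op config. The function name, the `-cuda` tag naming scheme, and the `accelerator` field are all assumptions for illustration, not the actual Fondant API:

```python
def resolve_image(base_name, tag, accelerator=None):
    """Pick the CUDA or CPU-only image variant at compile time.

    Hypothetical helper: `accelerator` stands in for the GPU config
    on a component Op; the `-cuda` suffix convention is illustrative.
    """
    suffix = "-cuda" if accelerator == "GPU" else ""
    return f"{base_name}{suffix}:{tag}"


# Components without a GPU request get the lightweight image,
# GPU-backed components get the CUDA variant.
print(resolve_image("fondant/embed_text", "0.1.0"))         # -> fondant/embed_text:0.1.0
print(resolve_image("fondant/embed_text", "0.1.0", "GPU"))  # -> fondant/embed_text-cuda:0.1.0
```

The same lookup could run at pipeline compile time, so the manifest already references the right variant before anything is pulled.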
Building Fondant base images might make sense as well, since installing Fondant adds 377MB to the Docker image due to our dependencies.
We should investigate whether the Fondant install layer differs between images when they're built separately, even if their Dockerfile is the same (the current situation). If that layer is pulled separately for each image, a Fondant base image would help by ensuring this layer (and other shared layers) is pulled only once.
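One way to check this is to compare the layer digests that `docker image inspect --format '{{json .RootFS.Layers}}' <image>` reports for two component images. A small sketch of the comparison; the digest values here are made-up placeholders, not real output:

```python
def shared_layers(layers_a, layers_b):
    """Return the layer digests two images have in common.

    Layers are content-addressed, so a digest that appears in both
    images is only downloaded once. Layers produced by separate
    builds usually get different digests even when the Dockerfile
    line that created them is identical.
    """
    seen = set(layers_b)
    return [digest for digest in layers_a if digest in seen]


# Made-up digests standing in for `docker image inspect` output.
image_a = ["sha256:base", "sha256:fondant-a", "sha256:deps-a"]
image_b = ["sha256:base", "sha256:fondant-b", "sha256:deps-b"]

# Only the base layer is shared here: the Fondant install layer was
# rebuilt per image, so it would be pulled once per image.
print(shared_layers(image_a, image_b))  # -> ['sha256:base']
```

If the comparison shows distinct Fondant install layers across components, that's the signal that a shared Fondant base image would pay off.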
Some of our reusable component images are quite large, especially anything CUDA-related, which leads to long download and startup times. We should try to minimize the image sizes of our reusable components.