alexschroeter / apptainer-deployer


How to handle GPU flavor settings #2

Open alexschroeter opened 3 days ago

alexschroeter commented 3 days ago

I am unsure what is the best way to translate requirements into settings.

If we have a requirement of gpu.amd, this means that for a simple setup we need to add the --rocm flag to the start command. This translation could happen on an "Arkitekt level", which would allow:

  1. Updates to the Interface to easily propagate without needing to update all Apps
  2. Settings would be consistent throughout Arkitekt

But to allow for exceptions (maybe one needs more fine-grained control over the settings), an override mechanism that lets an app replace the Arkitekt default would be nice.
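
To make the default-plus-override idea concrete, here is a minimal sketch in Python. The table `ARKITEKT_DEFAULT_FLAGS`, the function `resolve_flags`, and the requirement key `gpu.nvidia` are hypothetical names for illustration and not part of Arkitekt; only `--rocm`, `--nv` and `--env` are actual Apptainer flags.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical Arkitekt-wide defaults: requirement key -> Apptainer flags.
ARKITEKT_DEFAULT_FLAGS: Dict[str, List[str]] = {
    "gpu.amd": ["--rocm"],
    "gpu.nvidia": ["--nv"],
}


def resolve_flags(
    requirements: Iterable[str],
    overrides: Optional[Dict[str, List[str]]] = None,
) -> List[str]:
    """Translate requirement keys into flags, letting an app override defaults."""
    # Per-app overrides replace the Arkitekt default for the same key.
    table = {**ARKITEKT_DEFAULT_FLAGS, **(overrides or {})}
    flags: List[str] = []
    # dict.fromkeys drops duplicate requirements while keeping their order.
    for requirement in dict.fromkeys(requirements):
        # Requirements without a known translation add no flags (assumption).
        flags.extend(table.get(requirement, []))
    return flags


# Default translation for a simple setup.
default_flags = resolve_flags(["gpu.amd"])

# An app that needs finer control replaces the default for gpu.amd.
custom_flags = resolve_flags(
    ["gpu.amd"],
    overrides={"gpu.amd": ["--rocm", "--env", "HIP_VISIBLE_DEVICES=0"]},
)
```

With this shape, an update to the defaults table propagates to every app that does not define its own override.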

jhnnsrs commented 3 hours ago

Yes, this is a bit of a tricky issue. I was hoping there was an open standard for "node selectors"/"node affinity" (like https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/), but I couldn't really find one. Maybe some preliminary selectors would be a good idea, similar to how requirements for services are now implemented. Here is a draft for this: https://github.com/jhnnsrs/arkitekt_next/blob/main/arkitekt_next/cli/types.py

Contrary to what is outlined as "build_docker_params" there, I don't believe this should be handled by the library itself. It should be handled by the engine, i.e. this app, which would inspect the selectors and choose which params to pass. This would allow us to stay backwards compatible with different versions of the Docker and Apptainer APIs (because these fuckers change all the time :D). What do you think?
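
A rough sketch of what engine-side translation could look like, in Python. The `GpuSelector` class, the translator functions and `build_params` are hypothetical and do not mirror the draft in `types.py`. The flags themselves (`--rocm`/`--nv` for Apptainer, `--gpus all` and `--device=/dev/kfd --device=/dev/dri` for Docker) are the commonly documented ones, but which of them a given deployment needs is an assumption here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GpuSelector:
    """Hypothetical selector shipped by the library; carries no engine flags."""

    vendor: str  # "amd" or "nvidia"


def apptainer_params(selector: GpuSelector) -> List[str]:
    # Apptainer exposes GPUs through --rocm (AMD) and --nv (NVIDIA).
    flags = {"amd": ["--rocm"], "nvidia": ["--nv"]}
    if selector.vendor not in flags:
        raise ValueError(f"unsupported GPU vendor: {selector.vendor}")
    return flags[selector.vendor]


def docker_params(selector: GpuSelector) -> List[str]:
    # Docker: --gpus for NVIDIA, raw device nodes for AMD ROCm.
    flags = {
        "amd": ["--device=/dev/kfd", "--device=/dev/dri"],
        "nvidia": ["--gpus", "all"],
    }
    if selector.vendor not in flags:
        raise ValueError(f"unsupported GPU vendor: {selector.vendor}")
    return flags[selector.vendor]


# Each engine owns its own translation, so flag changes stay local to it.
ENGINE_TRANSLATORS: Dict[str, Callable[[GpuSelector], List[str]]] = {
    "apptainer": apptainer_params,
    "docker": docker_params,
}


def build_params(engine: str, selectors: List[GpuSelector]) -> List[str]:
    """Inspect the selectors and collect the params for the given engine."""
    translate = ENGINE_TRANSLATORS[engine]
    params: List[str] = []
    for selector in selectors:
        params.extend(translate(selector))
    return params
```

Because the library only describes what is needed, a change in the Docker or Apptainer CLI would only touch the matching translator in the engine.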

Which could be translated to the underlying engine. I imagine