alexschroeter / apptainer-deployer


Stopping a pod is not possible from the app #1

Open alexschroeter opened 3 days ago

alexschroeter commented 3 days ago

Since an instance has no status, and there is also no way to find out whether a container instance was ever running, I don't have a good way to set old Pods to "Stopped".

jhnnsrs commented 5 hours ago

Hmm, apptainer seems to really suck? AFAIK, though, it would run the app just as a process with a process_id. Isn't it possible to inspect these process_ids periodically and check their exit codes / whether they are still running?
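A minimal sketch of the periodic PID check suggested above, assuming a POSIX host and that the deployer has recorded the PIDs it started. The helper name `pid_is_running` is hypothetical and not part of apptainer-deployer. It only answers "is this PID alive?"; exit codes are generally only available to the parent of a process, so this sketch does not try to read them.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        # PID 0 and negative values address process groups, not one process.
        return False
    try:
        # Signal 0 performs the existence/permission check without
        # delivering any signal to the target process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

One caveat with this approach: PIDs can be reused by the operating system, so a live PID is not proof that it is still the same instance.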

alexschroeter commented 1 hour ago

Maybe I was too casual in my description, or there is some functionality that is currently not working as expected (like the heartbeat; see the image, where all pods are shown as running). When I deploy multiple Pods with apptainer-deployer, then of course, as long as the deployer runs, I know which apptainer instances I started and which are running. If an instance I started no longer appears in the output of apptainer instance list (example below), I can set the pod to stopped.

image

As far as I can tell, this is the only list of apptainer instances I have. It only includes running instances, and there is no list of finished instances.

aschroeter@Watson:~$ apptainer instance list
INSTANCE NAME                                    PID     IP    IMAGE
arkitekt-a9cfc161-fd07-416c-8ce7-78d54e574a86    1350          /tmp/rootfs-3575835608/root
arkitekt-b15929d7-ebb0-406c-bfb2-9452211a79f3    1065          /tmp/rootfs-1038182467/root
arkitekt-e0b7a4ac-2efd-40db-85f0-42b9af5c873d    778           /tmp/rootfs-2906462683/root
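The comparison described above (instances the deployer started versus what apptainer instance list still reports) could look roughly like this. It is a sketch that parses the plain-text table in the format shown above; the function names are hypothetical and not taken from apptainer-deployer.

```python
from __future__ import annotations

import subprocess


def parse_instance_names(listing: str) -> set[str]:
    """Extract instance names from the text output of `apptainer instance list`."""
    names = set()
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the "INSTANCE NAME  PID  IP  IMAGE" header row.
        if not fields or line.startswith("INSTANCE NAME"):
            continue
        names.add(fields[0])
    return names


def find_stopped(started: set[str], listing: str) -> set[str]:
    """Return the started instances that no longer appear in the listing."""
    return started - parse_instance_names(listing)


def current_listing() -> str:
    """Run `apptainer instance list` and return its raw text output."""
    result = subprocess.run(
        ["apptainer", "instance", "list"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

While the deployer is running, calling `find_stopped(started, current_listing())` periodically would yield the instance names whose pods can be set to stopped.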

But I don't think I can guarantee that I set an instance to stopped before apptainer-deployer itself is stopped (and I also believe I shouldn't, since we would want to be able to reconnect again?). After that, I can only see the currently running instances from apptainer instance list. The only way to mark those pods as stopped would be if the pods provided some sort of heartbeat, or if the checker had access to the list of pods that are currently active and running on this node, to compare against apptainer instance list. Even that might not give the correct state: a pod running as a Docker container wouldn't show up in apptainer instance list and would therefore be set to stopped although it is still running under Docker.
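To make the Docker caveat concrete, here is a sketch of such a checker. It assumes the backend can return the pods it considers active on this node together with the runtime that started each one. The `PodRecord` type and its `runtime` field are hypothetical; nothing in this thread confirms that this information is available.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PodRecord:
    """Hypothetical view of a pod as the backend reports it."""
    pod_id: str
    instance_name: str
    runtime: str  # assumed values: "apptainer" or "docker"


def pods_to_mark_stopped(
    active_pods: list[PodRecord], running_instances: set[str]
) -> list[PodRecord]:
    """Select apptainer pods whose instance is no longer listed as running.

    Pods started by another runtime are skipped: their absence from
    `apptainer instance list` says nothing about whether they are alive.
    """
    return [
        pod
        for pod in active_pods
        if pod.runtime == "apptainer"
        and pod.instance_name not in running_instances
    ]
```

Without a runtime field (or some equivalent), the checker cannot tell a stopped apptainer instance apart from a running Docker container, which is the failure case described above.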