Closed zaibon closed 5 years ago
I think this approach is gonna be a bit more complex than this. One of the important responsibilities of PID 1 is to consume the exit codes of orphan processes. This means that when the core0 process restarts, PID 1 becomes the direct parent of all the system services (redis, libvirtd, etc.), so it also needs to make sure these processes are working properly, handle their logs, etc.
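To make the "consume exit codes of orphan processes" part concrete, here is a minimal sketch of a non-blocking reap loop. It is an illustration only, not core0's actual code; the function name `reap_children` is hypothetical.

```python
import os


def reap_children():
    """Collect exit statuses of all already-terminated children without blocking.

    Returns a list of (pid, code) tuples, where code is the exit status,
    or the negated signal number if the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped
```

An init process would typically call something like this whenever it receives SIGCHLD, and decide per PID whether the service needs to be restarted.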
So, a slight change to your suggestion: I suggest we support 2 modes for core0.
If an update is available, the core0 binary is gonna be redownloaded, and the core0 API process is restarted to run the new API code.
We will still have a problem with containers, though, that we need to figure out. The communication between core0 and the coreX processes is done via pipes (man 2 pipe), which only exist in the kernel, so if one end of the pipe is closed (core0), coreX will fail with a (Broken Pipe) error. We can try named pipes, although this might not be possible since coreX runs in a different mount namespace, so I will need to experiment a little bit with this.
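The failure mode is easy to reproduce in isolation. The sketch below closes the read end of an anonymous pipe (standing in for core0 going away) and then writes to it. It only shows the general kernel behavior, not the actual core0/coreX protocol; the function name is made up.

```python
import errno
import os


def write_after_reader_closed():
    """Return the errno name raised when writing to a pipe that has no reader."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the peer (core0 in this thread) going away
    try:
        os.write(write_fd, b"log line\n")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the write fails with EPIPE
        # instead of killing the process.
        return errno.errorcode[exc.errno]
    finally:
        os.close(write_fd)
    return None
```

A process that does not ignore SIGPIPE would be terminated by the signal instead of seeing the error.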
containerd runs a shim in the docker-ce versions that maintains the container sockets through which they can communicate, and as such you can restart dockerd (core0 in our POV) without losing the containers in the process
nvm... i'm full of crap... restarting docker stops the containers too
There are so many points that we need to discuss regarding this issue. To be honest, I thought the update would be a v2 feature, hence the work on this for v1.5 was discontinued.
Anyway, we need to discuss how this should really be done in v1.5. We need to take care of the following points:
Conclusion: Splitting core0 like that is not gonna be a trivial task. I think it would be better if we start on v2 immediately, since lots of parts are gonna be rewritten anyway!
One of the important responsibilities of PID 1 is to consume the exit codes of orphan processes. This means that when the core0 process restarts, PID 1 becomes the direct parent of all the system services (redis, libvirtd, etc.), so it also needs to make sure these processes are working properly, handle their logs, etc.
I don't see a problem with that. As long as PID 1 knows what to do, it could take care of management, which is already the case for core0 itself btw (ensuring core0 restarts if it crashes).
We will still have a problem with containers, though, that we need to figure out. The communication between core0 and the coreX processes is done via pipes (man 2 pipe), which only exist in the kernel, so if one end of the pipe is closed (core0), coreX will fail with a (Broken Pipe) error. We can try named pipes, although this might not be possible since coreX runs in a different mount namespace, so I will need to experiment a little bit with this.
Maybe we can use unix sockets instead of pipes for that?
Processes, instead of being attached directly to core0, can be piped to a helper (a lightweight helper for each process) that processes the data and then pipes the streams directly to redis. core0 can then be just a reader on a redis queue to process the streams and receive the messages. This way child processes or services don't even have to be children of core0; they can just be direct children of PID 1, and core0 can restart independently.
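A rough sketch of what such a helper could look like, assuming it only needs to forward stdout line by line. The `relay` function and the `push` callback are hypothetical; `push` stands in for whatever writes to the redis queue, so the sketch runs without a redis server.

```python
import subprocess


def relay(cmd, push):
    """Run cmd, forward each stdout line through push, and return the exit code.

    push is any callable taking one string; in a real setup it would
    append the line to a redis queue (an assumption, not core0's code).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        push(line.rstrip("\n"))
    proc.stdout.close()
    return proc.wait()
```

With the redis-py client, `push` could be something like `lambda line: client.rpush("core0:logs", line)`, where the key name is made up for illustration. Since the helper owns the pipe, core0 can restart without the child ever seeing a closed pipe.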
I don't think starting a helper (which I read as a process) per new process will be good from a resources point of view :/
@maxux
I don't see a problem with that. As long as PID 1 knows what to do, it could take care of management, which is already the case for core0 itself btw (ensuring core0 restarts if it crashes).
Yeah, I am not saying there is a problem. I am just clarifying that PID 1 needs to do what core0 does now to bootstrap the system, start system services, and provide monitoring for those services. It means we will still run core0 as PID 1 in an (init) mode, and then start another service to handle connections and API calls.
Maybe we can use unix sockets instead of pipes for that?
Unix sockets were used in the very early version, but I dropped them in favor of pipes because coreX (container) starts in a different mount namespace, so I had to bind-mount the unix socket inside the container mount namespace. It also meant other processes inside the container could see and connect to this socket. Pipes, on the other hand, go directly to the process and can't be intercepted.
If we're gonna use unix sockets again, we have to make sure they are secured in a way that malicious processes inside the container can't use them.
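One possible building block here, on Linux only, is checking the peer's credentials with `SO_PEERCRED` before trusting a connection. This is a sketch of that idea, not something core0 does; `peer_credentials` and `is_trusted_peer` are hypothetical names, and whether a PID/UID check is enough for the container case would still need to be verified.

```python
import socket
import struct

# Layout of struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid.
_UCRED = struct.Struct("iII")


def peer_credentials(sock):
    """Return (pid, uid, gid) of the process on the other end of a unix socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, _UCRED.size)
    return _UCRED.unpack(raw)


def is_trusted_peer(sock, allowed_pid, allowed_uid):
    """Accept the connection only if the peer matches the expected PID and UID."""
    pid, uid, _gid = peer_credentials(sock)
    return pid == allowed_pid and uid == allowed_uid
```

The server side would call `is_trusted_peer` right after `accept()` and drop any connection that does not come from the expected coreX process.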
I don't think starting a helper (which I read as a process) per new process will be good from a resources point of view :/
Yes, I totally agree on that, but we need to make sure process outputs are processed regardless of whether you are holding the process streams or not. We can put more thought into it. I am trying to avoid writing this to a file. Maybe we can use syslogd, or something similar?
In PID1 (core0)
These parts of the API don't keep in-memory state, so they can be reloaded if needed
These components keep in-memory state; it requires work (not a quick task) to drop the state or move it to disk
won't fix in this version
The goal is to be able to update 0-core without the need for a reboot. The idea for how to get there was: