Open fvoznika opened 4 years ago
IIRC, our partners at ANT were working on this, but I can't recall who. @tanjianfeng?
Jianfeng is not working on this right now; some others and I will be. The problem is that I'm new to both Go and gVisor, so any help is appreciated. From my limited understanding, I agree with items 1 and 4, i.e. save/restore has to happen for the entire sandbox (pod). I don't quite understand items 2 and 3 yet; I'll need to learn more about how gVisor works.
Hi, good news: we have more or less made multi-container restore work, but the patches are at a preliminary stage and we are still testing and cleaning things up. We will send them out once we have more confidence in them, but let me know if you are interested in an early-stage review. Thanks.
That's very cool!! I'm interested in an early-stage review if you can point me in the right direction.
Bear with me while I rebase/refactor; expect a git branch ready for early-stage review sometime next week. Thanks!
The multicontainer branch is ready at https://github.com/aaronlu/gvisor. Feel free to let me know what you think, thanks.
It passed a simple test: starting 3 bash containers (one root container + two child containers) and then restoring them. After the restore, all three containers can accept commands, runsc can send a signal to pid 1 of each child container, and the child containers are destroyed after the signal.
@fvoznika Did you ever get a chance to look at @aaronlu's branch? It's a few (...thousand) commits behind master now, but this use case is super interesting to me, so I could potentially revive it if that looks like a halfway decent approach (not sure a single PR makes sense since it's so big).
Checkpoint/Restore (aka Save/Restore) is only supported for a single container.
There are a few things required to enable multi-container support, off the top of my head: