Open sameo opened 7 years ago
From @laijs on December 4, 2016 23:46
virtio-serial is not a packet-based transport, so it seems hard to find the message header when cc-proxy reconnects to hyperstart.
Hi @dlespiau, are you doing anything related to this feature? If not then I'd like to hack on this if you don't mind.
Hi,
I'm doing the low level part of this, framing on top of the Host<->VM serial link so the proxy can recover the start of a frame when reconnecting to a running VM.
I haven't started on the task to save an on-disk state that the proxy can read from when starting again though. You could take that part.
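To make the framing idea concrete, here is a minimal sketch of how a reader could recover the start of a frame after reconnecting in the middle of a byte stream. It is not the actual cc-proxy/hyperstart wire format: the marker value, the header layout (marker, length, CRC32) and the names `MAGIC`, `MAX_PAYLOAD`, `encode_frame` and `decode_frames` are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical values, not taken from hyperstart or cc-proxy.
MAGIC = b"\xca\xfe\xf0\x0d"       # assumed 4-byte start-of-frame marker
HEADER = struct.Struct(">4sII")   # marker, payload length, CRC32 of payload
MAX_PAYLOAD = 1 << 20             # assumed bound used to reject bogus lengths


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with marker, length and checksum."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def decode_frames(buf: bytes):
    """Return (frames, remainder).

    Bytes that do not belong to a valid frame are skipped, so a reader that
    attaches mid-stream resynchronizes on the next valid marker. The
    remainder holds a possibly incomplete frame to retry once more data
    has arrived.
    """
    frames = []
    pos = 0
    while True:
        start = buf.find(MAGIC, pos)
        if start < 0:
            # Keep the tail in case a marker is split across two reads.
            keep = max(len(buf) - (len(MAGIC) - 1), pos)
            return frames, buf[keep:]
        if len(buf) - start < HEADER.size:
            return frames, buf[start:]      # header not complete yet
        _, length, crc = HEADER.unpack_from(buf, start)
        if length > MAX_PAYLOAD:
            pos = start + 1                 # marker-like bytes inside garbage
            continue
        end = start + HEADER.size + length
        if end > len(buf):
            return frames, buf[start:]      # payload not complete yet
        payload = buf[start + HEADER.size:end]
        if zlib.crc32(payload) != crc:
            pos = start + 1                 # false marker or corrupted frame
            continue
        frames.append(payload)
        pos = end
```

A proxy that reconnects would feed whatever it reads into `decode_frames` and carry the remainder over to the next read. The checksum is what lets it tell a real marker from the same four bytes appearing inside a payload.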
Hi @dlespiau,
That's awesome! I've started experimenting with exactly that, on-disk save/restore of the state, as it's the most obvious part for me. Okay. When I have something substantial to show, I'll post a WIP PR here.
Cheers.
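For the on-disk state part, one common approach is to write the state to a temporary file and atomically rename it, so a proxy that crashes mid-write never leaves a half-written file for the next instance to read. The sketch below assumes a JSON file and a made-up state layout; `save_state`, `load_state` and the example fields are hypothetical and not taken from cc-proxy.

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Write state atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".state-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())    # make sure the data reaches the disk
        os.replace(tmp, path)       # atomic replacement of the old state
    except BaseException:
        try:
            os.unlink(tmp)          # do not leave temp files behind
        except FileNotFoundError:
            pass
        raise


def load_state(path: str):
    """Return the saved state, or None if it is missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```

On restart, the proxy would call `load_state` and, if it gets something back, reconnect to the VMs listed there, for example `{"vms": {"<container-id>": {"socket": "<path>"}}}` (a hypothetical layout).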
Thanks @dvoytik! Feel free to create an issue and assign to yourself (and maybe reference this issue) so it's clear to the whole team that that is something you're working on.
@jodh-intel, done. Although I can't assign it to myself.
@dvoytik - thanks - assigned.
@dlespiau any chance you left some work in progress on re-syncing a lost frame between the proxy and the VM serial port?
Unfortunately, the work has been wiped out when I dd'ed /dev/urandom to my hard-drive :/
@dlespiau no worries, that's what I was expecting :p That's what you do when you move to something else !
@dlespiau BTW, we have a public IRC channel, #clearcontainers on freenode. Come discuss containers if you're interested ;)
@sboeuf - could you outline what you know about this problem?
@jodh-intel I'll go further, trying to cover all the cases and how our components should be modified. The case is simple: we have Clear Containers running, meaning all components, runtime/shim/proxy/VM(agent), are up and running. When the proxy crashes, the shim, runtime and agent detect the proxy disconnection when they try to communicate with it.
Here is what each component should do upon detecting this:
Shim
Agent
Runtime
Proxy
@sameo @grahamwhaley @jodh-intel I might have missed a few corner cases, but I'd like to get your input on this. This is pretty important since we need to agree before we can open the corresponding issues and start the implementation.
Hi @sboeuf - thanks for this. If you don't mind, I'll merge the above with my notes and put it into a draft design (https://github.com/clearcontainers/runtime/issues/683) doc showing (a) what we have today and (b) what we want in the future...
@sboeuf - I've now raised a doc PR including your comments above:
@jodh-intel great thanks !
But I'd like to get some feedback about it too. Does that make sense for everyone ?
From @sameo on December 2, 2016 17:36
If cc-proxy crashes:
We need to work on:
Copied from original issue: 01org/cc-oci-runtime#505