The old activator had several problems: it was listening on the same port as the container, which made things much more difficult than they needed to be. Even with the "network locking" there were cases where clients would get a connection refused, since at some point the socket had to be closed and reopened. The network lock had another side effect: it would trigger TCP retransmits, which delayed some requests by a whole second.
While the new activator is not realised fully in eBPF, it's way more reliable, since we can simply steer traffic without any interruptions using just a few maps. Essentially, activation now works like this:
1. The container is in a checkpointed state.
2. An incoming packet arrives, destined for the container.
3. An eBPF program redirects the packet to a userspace TCP proxy listening on a random free port (see the sketch after this list).
4. The proxy accepts the TCP session and triggers the restore of the container.
5. As soon as the container is running, the proxy connects to it.
6. The proxy shuffles data back and forth for this TCP session and all other connections that were established while the container was restoring.
7. The proxy writes to an eBPF map to indicate that it no longer needs to redirect traffic.
8. Traffic flows directly to the container as usual, without going through the proxy, for as long as the container is alive.
9. On checkpoint, the redirect is enabled again.
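To make the redirect step a bit more concrete, here's a rough sketch of what the TC ingress program could look like. To be clear, this is not the actual implementation: the map name and layout are made up for illustration, it assumes plain IPv4 without IP options, and the egress counterpart (rewriting the source port back so the client never notices the proxy) is left out.

```c
// Sketch only: the "redirects" map name and layout are hypothetical.
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/pkt_cls.h>
#include <stddef.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

/* container port -> proxy port; a value of 0 means "no redirect". */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 128);
    __type(key, __u16);
    __type(value, __u16);
} redirects SEC(".maps");

SEC("tc")
int steer_ingress(struct __sk_buff *skb)
{
    void *data     = (void *)(long)skb->data;
    void *data_end = (void *)(long)skb->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return TC_ACT_OK;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
        return TC_ACT_OK;
    if (ip->ihl != 5) /* keep the sketch simple: no IP options */
        return TC_ACT_OK;

    struct tcphdr *tcp = (void *)(ip + 1);
    if ((void *)(tcp + 1) > data_end)
        return TC_ACT_OK;

    __u16 dport = bpf_ntohs(tcp->dest);
    __u16 *proxy_port = bpf_map_lookup_elem(&redirects, &dport);
    if (!proxy_port || *proxy_port == 0)
        return TC_ACT_OK; /* container is up: let traffic flow directly */

    /* Rewrite the destination port to the proxy and patch the checksum. */
    __u32 l4_off = ETH_HLEN + sizeof(struct iphdr);
    __be16 new_dest = bpf_htons(*proxy_port);
    bpf_l4_csum_replace(skb, l4_off + offsetof(struct tcphdr, check),
                        tcp->dest, new_dest, sizeof(new_dest));
    bpf_skb_store_bytes(skb, l4_off + offsetof(struct tcphdr, dest),
                        &new_dest, sizeof(new_dest), 0);
    return TC_ACT_OK;
}

char LICENSE[] SEC("license") = "GPL";
```

The key point is that packets are never queued or dropped here: they are only rewritten, so the redirect can be toggled at any time just by updating the map.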
With this design, requests only need to be proxied during the restore, while the activator itself is more reliable and never drops a packet. The current implementation uses TC, as it allows modifying both ingress and egress packets. I experimented with a full eBPF solution, but the main issue is that we would need to "hold back" packets while the container is being restored, without dropping them. As soon as the initial TCP SYN is dropped, the client waits a whole second before retransmitting, which makes everything quite slow. I have not found a solution for this so far, so the userspace proxy is still required.
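For completeness, flipping the redirect on and off from userspace (steps 7 and 9 above) only takes a map update. Assuming the map from the sketch is pinned to the BPF filesystem, and with made-up ports and paths, it could look like this with libbpf:

```c
// Illustrative only: path, ports, and map layout match the hypothetical
// "redirects" map from the sketch above, not the real implementation.
#include <linux/types.h>
#include <bpf/bpf.h>
#include <stdio.h>

/* Enable or disable the proxy redirect for a container port. */
static int set_redirect(__u16 container_port, __u16 proxy_port)
{
    int fd = bpf_obj_get("/sys/fs/bpf/tc/globals/redirects");
    if (fd < 0) {
        perror("bpf_obj_get");
        return -1;
    }
    /* proxy_port == 0 disables the redirect: traffic flows directly. */
    return bpf_map_update_elem(fd, &container_port, &proxy_port, BPF_ANY);
}

int main(void)
{
    set_redirect(8080, 41234); /* on checkpoint: steer port 8080 to the proxy */
    set_redirect(8080, 0);     /* after restore: hand traffic back */
    return 0;
}
```

Presumably this flip happens in the proxy or its managing process right after the restore finishes, and again on checkpoint; that's all steps 7 and 9 amount to.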