balena-os / balena-engine

Moby-based Container Engine for Embedded, IoT, and Edge uses
https://www.balena.io
Apache License 2.0

Engine socket on container becomes unusable after Engine crash #282

Open lmbarros opened 2 years ago

lmbarros commented 2 years ago

If we start a container with the label io.balena.features.balena-socket: '1' set, this container will have access to the Engine socket. However, if the Engine crashes on the Host OS, that container will no longer be able to connect to the Engine, even after the Engine restarts on the Host OS. Attempting to run Docker in the container then fails with:

Cannot connect to the Docker daemon at unix:///host/run/balena-engine.sock. Is the docker daemon running?

This can be easily reproduced by sending SIGKILL to balenad on the Host OS and then trying to run Docker or balenaEngine in a container where it was previously working.
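
When reproducing this, it can help to probe the socket directly from inside the container rather than relying on the Docker CLI error alone. The following is a minimal sketch, not part of any balena tooling: the function name engine_socket_reachable and the 2-second timeout are illustrative assumptions, and the socket path is simply the one from the error message above.

```python
import socket

# Path taken from the error message above; adjust if your mount differs.
ENGINE_SOCKET = "/host/run/balena-engine.sock"


def engine_socket_reachable(path: str = ENGINE_SOCKET, timeout: float = 2.0) -> bool:
    """Return True if something accepts connections on the Unix socket at `path`.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a missing file, a socket file with no listener behind it
        # (connection refused), permission errors, and timeouts.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    print("reachable" if engine_socket_reachable() else "unreachable")
```

Running this in the container before and after killing the Engine on the Host OS should show whether the socket stops accepting connections. A False result only says the connection failed; it does not identify the cause.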

This is arguably on the border between the Supervisor (which sets up the mounts and shares) and the Engine (which implements the mechanisms).

jellyfish-bot commented 2 years ago

[lmbarros] This issue has attached support thread https://jel.ly.fish/41b56e32-5fae-4a2e-b5bb-05f9f5af1f0f

deanMike commented 2 years ago

I have an example of this issue here: https://github.com/machinemetrics/docker-socket

cywang117 commented 2 years ago

Another repro courtesy of @lmbarros: https://github.com/balena-io-playground/engine-on-container-socket-lost-test

lmbarros commented 2 years ago

Did a couple more quick tests:

klutchell commented 2 years ago

I suspect this would be resolved by https://github.com/balena-os/balena-supervisor/pull/1780

deanMike commented 2 years ago

> I suspect this would be resolved by balena-os/balena-supervisor#1780

@klutchell Do you know if there's still a plan to get that fix in? Is there any way my team and I could help test this out? This issue has been a real thorn in our side.

klutchell commented 2 years ago

Hey @deanMike, I have requested updates on the linked PR: https://github.com/balena-os/balena-supervisor/pull/1780