arriven / db1000n

MIT License

Main db1000n network is unreachable when ovpn container is restarted #524

Open palianycia123 opened 2 years ago

palianycia123 commented 2 years ago

I noticed that the ovpn service in the docker-compose app is sometimes restarted, which is fine because there may be connectivity issues with the VPN server. The autoheal container works perfectly here: it restarts the ovpn container. But it doesn't restart the main db1000n container, because that container has no health check endpoint. The result is a false-positive db1000n status: the container is up and running, but packets are not being transmitted.

Expected Behavior

It would be good to add a health check endpoint to the main db1000n program that runs a check such as nslookup google.com and returns 200 on success and a non-200 status on failure. Calling this endpoint periodically via docker-compose would then confirm that the network setup is working.

Actual Behavior

The network is unreachable in the main db1000n container after the ovpn container is restarted.

Steps to Reproduce the Problem

  1. Run docker-compose -f examples/docker/static-docker-compose.yml up -d (to see the "network is unreachable" error in the main db1000n container, add the LOG_LEVEL:DEBUG environment variable in static-docker-compose.yml).
  2. Turn the host network off to simulate a network connectivity issue.
  3. Observe the ovpn container logs with docker logs docker_ovpn_1 -f and wait until the container is restarted.
    2022-05-02 08:45:43 SIGTERM[hard,] received, process exiting
    Exiting.
  4. Turn host network on.
  5. Observe that the ovpn container logs show it has connected successfully.
    2022-05-02 13:45:07 Initialization Sequence Completed
  6. Observe main program logs
    error sending packet    {"error": "dial tcp 146.120.90.38:80: connect: network is unreachable", "args": {"address":"146.120.90.38:80","body":"{{ random_payload 10 }}","connection":{"args":{"address":"146.120.90.38:80","protocol":"tcp","proxy_urls":"","timeout":null},"type":"net"},"interval_ms":1,"packet":{"payload":{"data":{"payload":"{{ random_payload 10 }}"},"type":"raw"}}}}
    error sending packet    {"error": "dial tcp 146.120.90.247:443: connect: network is unreachable", "args": {"address":"146.120.90.247:443","body":"{{ random_payload 10 }}","connection":{"args":{"address":"146.120.90.247:443","protocol":"tcp","proxy_urls":"","timeout":null},"type":"net"},"interval_ms":1,"packet":{"payload":{"data":{"payload":"{{ random_payload 10 }}"},"type":"raw"}}}}
    error sending packet    {"error": "dial tcp 146.120.90.42:8080: connect: network is unreachable", "args": {"address":"146.120.90.42:8080","body":"{{ random_payload 10 }}","connection":{"args":{"address":"146.120.90.42:8080","protocol":"tcp","proxy_urls":"","timeout":null},"type":"net"},

Note: Restarting the main db1000n container manually resolves the network issue.

Specifications

arriven commented 2 years ago

I'm considering different ways to do it, but it feels like this and #525 could be implemented in the same way. Or rather, implementing that one would make it very easy to implement this one.

palianycia123 commented 2 years ago

Well, simply crashing/exiting the app on network issues might not resolve this issue. According to the autoheal docs, a health check is a dependency of the autoheal container: "Note: You must apply HEALTHCHECK to your docker images first".

arriven commented 2 years ago

I'm not completely sure, but I assume autoheal would just restart the container if the health check fails. Crashing the main process would make the container restart anyway, provided the correct restart policy is set.