jepsen-io / jepsen

A framework for distributed systems verification, with fault injection
6.78k stars 714 forks source link

Fix clock changes in latest Jepsen #569

Closed deriamis closed 1 year ago

deriamis commented 1 year ago

This partially reverts a previous commit, which added privileged: true and capabilities: ALL to Docker test nodes. It also adds mounts which should ensure the containers come up with the same local time as the host.

Closes #568

aphyr commented 1 year ago

Hmm. @dancmeyers, since you added some of these parameters specifically to fix earlier broken-ness in Docker, do you want to weigh in here? I don't want to flip back and forth between two broken states if we can avoid it.

deriamis commented 1 year ago

@aphyr By the way, this PR is just what we applied locally to prevent clock skew in the test nodes from propagating to the host. We had the clock skew nemesis disabled in our tests, so we don't know why it was happening. If someone can figure that out and prevent it, I think it would be a better fix than this.

nurturenature commented 1 year ago

The current flipped and flopped values for:

volumes:
  # no /sys/fs/cgroup
privileged: true
cap_add:
  - ALL

Evolved to that state:

The docker-compose regression has been fixed but not yet released by docker.

In general, Jepsen should be able to follow docker-debian-base:

# For a host running bullseye, or a newer cgroups and systemd, you have to use this:

docker run -td --stop-signal=SIGRTMIN+3 \
  --tmpfs /run:size=100M --tmpfs /run/lock:size=100M \
  -v /sys/fs/cgroup:/sys/fs/cgroup:rw --cgroupns=host \
  --name=name jgoerzen/debian-base-whatever

In the past, the proposed changes to:

volumes:
  - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
cap_add:
  - NET_ADMIN

Have broken some environments.

And:

I don't want to flip back and forth between two broken states if we can avoid it

Seems to have been true for some time.


As for the container clock skew you are seeing, what a good reminder that clock skew is real and can be consequential!

Would it be too simplistic to defensively?

(jepsen.os/setup!)

; Try to stop ntpd service in case it is present and running.
(try
  (c/su
    (c/exec :service :ntpd :stop))
  (catch RuntimeException e))
dancmeyers commented 1 year ago

I have to admit it's been so long since I touched this that I have nothing useful to add right now. Although we do have some more Jepsen work upcoming, so hopefully it doesn't bite me again then ;)

aphyr commented 1 year ago

So--it sounds like I should close this PR then, to avoid re-breaking Docker in other environments? This may be one of those things where a fork is preferable...

nurturenature commented 1 year ago

At the current time, I don't think it's possible to have a compose.yml that is cross os, recent docker versions, common hosting configs, for os style containers like Jepsen uses that works for all.

So, would recommend being conservative for now, leaving /sys/fs/cgroup, privileged, cap_add and then adapting to new docker versions as they are released.

deriamis commented 1 year ago

@aphyr We have an internal fork already, so it's not really a problem for us anymore. We only use the Docker tests for list-append, and we don't enable the clock skew nemesis for that. Our main worry is that users will run Jepsen tests and see erroneous failures.

@nurturenature To be clear, the clock skew we're seeing isn't causing a real failure - it's causing the test setup to fail inside the node container. There is no NTP client running inside the test node to stop. There is one running outside the test node on the host. What we're trying to show is that Jepsen will fail on setup in some environments. For us, the problem is solved via a fork.

One thing I want to note is that we were a little confused as to why the test nodes need privileged capabilities. It's possible to only change the clock inside a Docker container without affecting the host as long as the container isn't privileged and/or doesn't have the SYS_ADMIN or SYS_TIME capabilities. Is there a reason the host system's clock needs to be changed?

Adding @IamXander in case he has any other observations.

nurturenature commented 1 year ago

Docker released new versions earlier this week that better support OS/systemd containers like Jepsen uses for nodes.

It's now possible to use unprivileged containers. See Jepsen PR #570. It should address this issue too.

P.S. :+1: for a CI with Jepsen.