Closed cockroach-teamcity closed 6 years ago
These are both "silent too long" failures. In both cases, the jepsen process is waiting here in the setup phase.
Logs before this point:
WARN [2017-12-26 12:44:22,569] jepsen node 35.196.138.29 - jepsen.control.util DEPRECATED: jepsen.control.util/install-tarball! is now named jepsen.control.util/install-archive!, and the `node` argument is no longer required.
WARN [2017-12-26 12:44:22,569] jepsen node 35.185.76.85 - jepsen.control.util DEPRECATED: jepsen.control.util/install-tarball! is now named jepsen.control.util/install-archive!, and the `node` argument is no longer required.
WARN [2017-12-26 12:44:22,569] jepsen node 35.227.18.54 - jepsen.control.util DEPRECATED: jepsen.control.util/install-tarball! is now named jepsen.control.util/install-archive!, and the `node` argument is no longer required.
WARN [2017-12-26 12:44:22,569] jepsen node 35.190.129.71 - jepsen.control.util DEPRECATED: jepsen.control.util/install-tarball! is now named jepsen.control.util/install-archive!, and the `node` argument is no longer required.
WARN [2017-12-26 12:44:22,624] jepsen node 35.196.118.238 - jepsen.control.util DEPRECATED: jepsen.control.util/install-tarball! is now named jepsen.control.util/install-archive!, and the `node` argument is no longer required.
WARN [2017-12-26 12:44:24,407] jepsen node 35.227.18.54 - jepsen.control Encountered error with conn [:control "35.227.18.54"]; reopening
INFO [2017-12-26 12:44:25,637] jepsen node 35.196.138.29 - jepsen.cockroach.auto 35.196.138.29 Cockroach installed
INFO [2017-12-26 12:44:25,738] jepsen node 35.190.129.71 - jepsen.cockroach.auto 35.190.129.71 Cockroach installed
INFO [2017-12-26 12:44:25,739] jepsen node 35.185.76.85 - jepsen.cockroach.auto 35.185.76.85 Cockroach installed
INFO [2017-12-26 12:44:25,778] jepsen node 35.196.118.238 - jepsen.cockroach.auto 35.196.118.238 Cockroach installed
INFO [2017-12-26 12:44:32,449] jepsen node 35.196.138.29 - jepsen.cockroach.auto 35.196.138.29 clock reset: 26 Dec 12:44:32 ntpdate[5535]: step time server 91.189.91.157 offset 0.000099 sec
INFO [2017-12-26 12:44:32,549] jepsen node 35.185.76.85 - jepsen.cockroach.auto 35.185.76.85 clock reset: 26 Dec 12:44:32 ntpdate[4880]: step time server 91.189.91.157 offset 0.000003 sec
INFO [2017-12-26 12:44:32,549] jepsen node 35.190.129.71 - jepsen.cockroach.auto 35.190.129.71 clock reset: 26 Dec 12:44:32 ntpdate[6092]: step time server 91.189.91.157 offset 0.000004 sec
INFO [2017-12-26 12:44:32,589] jepsen node 35.196.118.238 - jepsen.cockroach.auto 35.196.118.238 clock reset: 26 Dec 12:44:32 ntpdate[6269]: step time server 91.189.91.157 offset -0.000012 sec
In the logs above, 35.227.18.54 hits a connection error and then never reports "Cockroach installed" or a clock reset, while the other four nodes do. This appears to indicate some sort of failure while installing that was neither retried nor logged as an exception. I suspect that something between the implicit parallelization over all nodes and the automatic retries is not layered properly, but it's hard to follow. auth.log on the failing node doesn't say anything interesting at this time.
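To find the stuck node without reading the log by eye, here is a minimal Python sketch that lists the nodes appearing in the log that never emit a given marker message. The sample lines are copied from the excerpt above. The regular expression is an assumption about the log layout based only on that excerpt, and the name `nodes_missing_marker` is hypothetical.

```python
import re

# Assumed layout, inferred from the excerpt:
#   LEVEL [timestamp] jepsen node <ip> - <logger> <message>
LINE_RE = re.compile(r"^(\w+) \[([^\]]+)\] jepsen node (\S+) - (\S+) (.*)$")

# A few lines copied from the log excerpt in this issue.
LOG = """\
WARN [2017-12-26 12:44:24,407] jepsen node 35.227.18.54 - jepsen.control Encountered error with conn [:control "35.227.18.54"]; reopening
INFO [2017-12-26 12:44:25,637] jepsen node 35.196.138.29 - jepsen.cockroach.auto 35.196.138.29 Cockroach installed
INFO [2017-12-26 12:44:25,738] jepsen node 35.190.129.71 - jepsen.cockroach.auto 35.190.129.71 Cockroach installed
INFO [2017-12-26 12:44:25,739] jepsen node 35.185.76.85 - jepsen.cockroach.auto 35.185.76.85 Cockroach installed
INFO [2017-12-26 12:44:25,778] jepsen node 35.196.118.238 - jepsen.cockroach.auto 35.196.118.238 Cockroach installed
"""


def nodes_missing_marker(lines, marker):
    """Return the sorted nodes that log something but never log `marker`."""
    seen = set()
    done = set()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        node, message = match.group(3), match.group(5)
        seen.add(node)
        if marker in message:
            done.add(node)
    return sorted(seen - done)


if __name__ == "__main__":
    print(nodes_missing_marker(LOG.splitlines(), "Cockroach installed"))
```

Running the same function with the marker `"clock reset"` over the full log would show whether the same node is also missing from the next setup step.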
The "silent too long" checks are no longer present in the roachtest jepsen runner, so the effective timeouts for test setup are now more generous. We'll see if we keep running into them, but it doesn't seem to have been an issue in the past week.
The following tests appear to have failed:
#457005:
Please assign, take a look and update the issue accordingly.