danpaul000 closed 5 years ago
cc: @mvines @sagar-solana
regression window:
9ea398416 Sign shreds on the GPU (#6595) (1 hour, bad, 1 hour good)
50a17fc00 Use Slot and Epoch type aliases instead of raw u64 (#6693) - (2 good, 1 bad)
f9a9b7f61 Better output layout for iftop logs (#6690) - (1 good, 1 good of 1 hour)
a57f6b70d Fix swapped repair and forwards addrs (#6691) - (1 good)
bae83ba2b Compare iftop logs using log-analyzer (#6684)
385b4ce95 Get rid of verified packets and use the Meta::discard flag (#6674)
7b6e3a23b Add new pubkey to auth keys (#6687)
1cc8956f7 Get Azure provider working again (#6659)
e6c8bfd00 Add --use-move flag to cargo-install-all.sh and net/net.sh (#6670) - (3 good)
2d67962 1 hour good
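A minimal sketch of how the notes above narrow the window. It assumes the list is ordered newest commit first, that 2d67962 is a trustworthy good baseline, and that the failure is intermittent, so a good run lowers suspicion without clearing a commit. The good/bad counts are one reading of the annotations, and the names `RUNS`, `oldest_bad`, and `suspects` are hypothetical helpers, not part of any Solana tooling.

```python
# Observations transcribed from the regression-window notes, newest commit first.
# Each entry is (short hash, good runs, bad runs). The counts are an interpretation
# of the free-form annotations; commits with no annotation are recorded as (0, 0).
RUNS = [
    ("9ea398416", 1, 1),
    ("50a17fc00", 2, 1),
    ("f9a9b7f61", 2, 0),
    ("a57f6b70d", 1, 0),
    ("bae83ba2b", 0, 0),
    ("385b4ce95", 0, 0),
    ("7b6e3a23b", 0, 0),
    ("1cc8956f7", 0, 0),
    ("e6c8bfd00", 3, 0),
    ("2d67962", 1, 0),  # assumed known-good baseline (last successful nightly)
]


def oldest_bad(runs):
    """Return the hash of the oldest commit with at least one bad run, or None."""
    bad = [commit for commit, _good, n_bad in runs if n_bad > 0]
    return bad[-1] if bad else None


def suspects(runs):
    """Return commits that could have introduced the regression, newest first.

    With an intermittent failure, the culprit is the oldest bad commit or
    any older commit after the baseline (the last entry in `runs`).
    """
    hashes = [commit for commit, _good, _bad in runs]
    first = oldest_bad(runs)
    if first is None:
        return []
    start = hashes.index(first)
    return hashes[start:-1]  # exclude the baseline itself


if __name__ == "__main__":
    print("oldest bad commit:", oldest_bad(RUNS))
    print("remaining suspects:", suspects(RUNS))
```

Under these assumptions the single bad run on 50a17fc00 rules out 9ea398416 as the sole cause, while the good runs on older commits only reduce their likelihood.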
Unable to reproduce.
Problem
The nightly CPU-only 5-node GCE performance testnets have a stability regression. Starting with the run on the night of November 2, the cluster falls over shortly after the test begins.
Buildkite job: https://buildkite.com/solana-labs/system-performance-tests/builds/461
Grafana: https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?var-testnet=gce-edge-perf-cpu-only&from=1572760888997&to=1572761835113&refresh=60s&orgId=2
Commit: https://github.com/solana-labs/solana/commit/9ea398416e5b9388d625fd5f0c5dda5312084422
The GPU-enabled 5-node testnet on the same night and the same commit appeared to run fine:
Grafana: https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?var-testnet=gce-edge-perf-gpu-enabled&from=1572757262197&to=1572758435533&refresh=60s&orgId=2
BK: https://buildkite.com/solana-labs/system-performance-tests/builds/460
The CPU-only tests had been running with consistent results (~32-35k TPS) since at least Oct 25, before 0.20.0 was released. The last successful nightly CPU-only run was on the night of Oct 31, against commit https://github.com/solana-labs/solana/commit/2d67962c2f70e6c4e70eece0053db96fb1d3a733
Grafana: https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?var-testnet=gce-edge-perf-cpu-only&from=1572588077795&to=1572589027188&refresh=60s&orgId=2
BK: https://buildkite.com/solana-labs/system-performance-tests/builds/455