NoSQL data store using the seastar framework, compatible with Apache Cassandra
GNU Affero General Public License v3.0
500k errors of paxos - CAS: mutation_write_timeout_exception accompanied by semaphore timeouts #8715
Closed
bentsi closed 3 years ago
Installation details
Kernel version: 5.4.0-1035-aws
Scylla version (or git commit hash): 4.6.dev-0.20210525.6144656b2 with build-id bd05ced7ce11303a84d0123e35079c5ebe69bf08
Cluster size: 4 nodes (i3.2xlarge)
Scylla running with shards number (live nodes):
AMI: ami-08e6582348948518a (aws: eu-north-1)
Test: longevity-lwt-3h-test
Test name: longevity_lwt_test.LWTLongevityTest.test_lwt_longevity
Test config file(s):

Issue description
During the test run we got 500k errors. The errors started to appear immediately after the stress workload began.
Semaphore timeouts
Restore Monitor Stack command: $ hydra investigate show-monitor af943a81-6174-4f9d-a093-57b959cfce8e
Show all stored logs command: $ hydra investigate show-logs af943a81-6174-4f9d-a093-57b959cfce8e
Test id: af943a81-6174-4f9d-a093-57b959cfce8e
Logs:
grafana - https://cloudius-jenkins-test.s3.amazonaws.com/af943a81-6174-4f9d-a093-57b959cfce8e/20210526_071740/grafana-screenshot-longevity-lwt-3h-test-scylla-per-server-metrics-nemesis-20210526_072149-longevity-lwt-3h-master-monitor-node-af943a81-1.png
db-cluster - https://cloudius-jenkins-test.s3.amazonaws.com/af943a81-6174-4f9d-a093-57b959cfce8e/20210526_072554/db-cluster-af943a81.zip
loader-set - https://cloudius-jenkins-test.s3.amazonaws.com/af943a81-6174-4f9d-a093-57b959cfce8e/20210526_072554/loader-set-af943a81.zip
monitor-set - https://cloudius-jenkins-test.s3.amazonaws.com/af943a81-6174-4f9d-a093-57b959cfce8e/20210526_072554/monitor-set-af943a81.zip
sct-runner - https://cloudius-jenkins-test.s3.amazonaws.com/af943a81-6174-4f9d-a093-57b959cfce8e/20210526_072554/sct-runner-af943a81.zip
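For anyone scripting the log download, here is a minimal sketch that rebuilds the four archive URLs in the Logs entry from the test id and the upload timestamp. The URL layout is only inferred from the links in this report and is an assumption, not a documented convention; `archive_urls`, `BUCKET_URL`, and `ARCHIVES` are hypothetical names introduced for illustration.

```python
# Sketch: rebuild the log archive URLs from the test id and upload timestamp.
# ASSUMPTION: the layout <bucket>/<test_id>/<timestamp>/<name>-<short_id>.zip
# is inferred from the links in this report, not from any documentation.

BUCKET_URL = "https://cloudius-jenkins-test.s3.amazonaws.com"
ARCHIVES = ("db-cluster", "loader-set", "monitor-set", "sct-runner")


def archive_urls(test_id: str, timestamp: str) -> dict[str, str]:
    """Map each archive name to the URL it would have under the assumed layout."""
    short_id = test_id.split("-")[0]  # first UUID group, e.g. "af943a81"
    return {
        name: f"{BUCKET_URL}/{test_id}/{timestamp}/{name}-{short_id}.zip"
        for name in ARCHIVES
    }


if __name__ == "__main__":
    urls = archive_urls("af943a81-6174-4f9d-a093-57b959cfce8e", "20210526_072554")
    for name, url in urls.items():
        print(f"{name}: {url}")
```

The grafana screenshot uses a different timestamp and file name, so it is deliberately left out of the helper.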
Jenkins job URL