scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

cassandra-stress failure should not always be CRITICAL #2162

Closed juliayakovlev closed 1 year ago

juliayakovlev commented 4 years ago

My test failed due to a critical failure that shouldn't be critical in my case and shouldn't fail the test.

It's a large partition test that uses scylla-bench. During the test, the disrupt_nodetool_refresh nemesis was started. Because the keyspace keyspace1 doesn't exist, a cassandra-stress command was started. It failed with a timeout, and this failure stopped my test with CRITICAL:

< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG > Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [/10.0.56.242:9042] Timed out waiting for server response
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onTimeout(RequestHandler.java:825)
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:1392)
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at com.datastax.shaded.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:663)
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at com.datastax.shaded.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:738)
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at com.datastax.shaded.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:466)
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at com.datastax.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
< t:2020-05-19 09:42:45,633 f:remote.py       l:164  c:sdcm.remote          p:DEBUG >   at java.lang.Thread.run(Thread.java:748)
< t:2020-05-19 09:42:45,647 f:sct_events.py   l:887  c:sdcm.sct_events      p:INFO  > 2020-05-19 09:42:45.633: (CassandraStressEvent Severity.CRITICAL): type=failure node=Node longevity-large-partitions-200k-pks-loader-node-3243f275-1 [54.217.5.149 | 10.0.241.251] (seed: False)
< t:2020-05-19 09:42:45,647 f:sct_events.py   l:887  c:sdcm.sct_events      p:INFO  > Stress command completed with bad status 1: java.lang.RuntimeException: Encountered exception creating schema
< t:2020-05-19 09:42:45,647 f:sct_events.py   l:887  c:sdcm.sct_events      p:INFO  >   at org.apache.cassandra.stress.se
< t:2020-05-19 09:42:45,662 f:sct_events.py   l:887  c:sdcm.sct_events      p:INFO  > 2020-05-19 09:42:45.648: (InfoEvent Severity.NORMAL): message=TEST_END

In this case the failure is not that important, and the test shouldn't fail.

fruch commented 4 years ago

Right now we don't have a way to mark which stress commands are the important ones. But we can add a filter that changes the severity and, for example, turns those events into warnings.
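To illustrate the idea of a severity-changing filter, here is a minimal, self-contained sketch. It is not the actual SCT implementation: `Severity`, `StressEvent`, `SeverityChangerFilter`, `publish`, and the node name are hypothetical stand-ins. The only assumption is that events carry a message and a mutable severity. The filter is a context manager that downgrades matching events only while it is active, such as around a nemesis.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    NORMAL = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


@dataclass
class StressEvent:
    """Hypothetical stand-in for an event such as CassandraStressEvent."""
    type: str
    node: str
    message: str
    severity: Severity = Severity.CRITICAL


class SeverityChangerFilter:
    """Hypothetical filter: rewrites the severity of matching events while active."""

    active = []  # class-level registry of filters currently in effect

    def __init__(self, new_severity, regex):
        self.new_severity = new_severity
        self.regex = re.compile(regex)

    def __enter__(self):
        SeverityChangerFilter.active.append(self)
        return self

    def __exit__(self, *exc_info):
        SeverityChangerFilter.active.remove(self)
        return False  # never swallow exceptions raised inside the block

    def matches(self, event):
        return self.regex.search(event.message) is not None


def publish(event):
    """Apply every active filter before the event reaches the test runner."""
    for flt in SeverityChangerFilter.active:
        if flt.matches(event):
            event.severity = flt.new_severity
    return event


def make_schema_failure():
    # Message text taken from the log above; the node name is made up.
    return StressEvent(
        type="failure",
        node="loader-node-1",
        message="Stress command completed with bad status 1: "
                "java.lang.RuntimeException: Encountered exception creating schema",
    )


# While the filter is active, the schema-creation failure is only a warning.
with SeverityChangerFilter(Severity.WARNING, r"Encountered exception creating schema"):
    inside = publish(make_schema_failure())

# Outside the block the same failure keeps its default CRITICAL severity.
outside = publish(make_schema_failure())
```

Scoping the filter to a `with` block keeps the downgrade local to the nemesis that starts the secondary stress command, so the main workload's failures would still be reported as CRITICAL.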

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 2 years with no activity. Remove stale label or comment or this will be closed in 2 days.