Open dimakr opened 7 months ago
@dimakr can you look into the db nodes, collect the db errors, and list the error lines here?
@soyacz Please find below the errors from the db-nodes' system.log in sct-results:
❯ find ~/sct-results/latest/longevity-1gb-1h-nemesis-dmitriy-db-cluster-687b0d24/ -type f -exec grep -Eir "error|fail" {} \; -exec echo "===========" \;
ts=2024-04-12T09:56:19.994Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 09:56:49,681 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 09:56:50,147 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 09:56:50,147 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 09:56:50,243 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.2 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {}
INFO 2024-04-12 09:56:50,250 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 09:57:02,348 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
WARN 2024-04-12 10:06:06,238 [shard 0:stre] gossip - failure_detector_loop: Got error in the loop, live_nodes={172.17.0.4, 172.17.0.3}: seastar::sleep_aborted (Sleep is aborted)
INFO 2024-04-12 10:06:06,238 [shard 0:stre] gossip - failure_detector_loop: Finished main loop
WARN 2024-04-12 10:06:06,238 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
WARN 2024-04-12 10:06:06,239 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
WARN 2024-04-12 10:06:06,463 [shard 0:goss] gossip - === Gossip round FAIL: seastar::gate_closed_exception (gate closed)
INFO 2024-04-12 10:06:08,986 [shard 0:main] init - Shutting down direct_failure_detector
INFO 2024-04-12 10:06:08,986 [shard 0:main] init - Shutting down direct_failure_detector was successful
ts=2024-04-12T10:06:51.146Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 10:07:08,449 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 10:07:08,774 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 10:07:08,774 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 10:07:08,831 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.2 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {}
INFO 2024-04-12 10:07:08,834 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 10:07:20,920 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
===========
ts=2024-04-12T09:56:27.752Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 09:59:14,473 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 09:59:14,823 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 09:59:14,823 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 09:59:14,930 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.4 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, 
ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}
INFO 2024-04-12 09:59:14,932 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 09:59:28,464 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
WARN 2024-04-12 10:06:27,872 [shard 0:stre] repair - repair[0af3616d-a2fd-4d74-b388-1c34d3f531d4]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_members, role_attributes}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown
WARN 2024-04-12 10:06:27,872 [shard 0:stre] repair - repair[0af3616d-a2fd-4d74-b388-1c34d3f531d4]: user-requested repair failed: std::runtime_error ({shard 0: std::runtime_error (repair[0af3616d-a2fd-4d74-b388-1c34d3f531d4]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_members, role_attributes}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown)})
WARN 2024-04-12 10:46:39,416 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
WARN 2024-04-12 10:46:39,416 [shard 0:stre] gossip - failure_detector_loop: Got error in the loop, live_nodes={172.17.0.3}: seastar::sleep_aborted (Sleep is aborted)
INFO 2024-04-12 10:46:39,416 [shard 0:stre] gossip - failure_detector_loop: Finished main loop
WARN 2024-04-12 10:46:39,426 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
WARN 2024-04-12 10:46:40,327 [shard 0:goss] gossip - === Gossip round FAIL: seastar::gate_closed_exception (gate closed)
WARN 2024-04-12 10:46:41,321 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
INFO 2024-04-12 10:46:41,866 [shard 0:main] init - Shutting down direct_failure_detector
INFO 2024-04-12 10:46:41,866 [shard 0:main] init - Shutting down direct_failure_detector was successful
===========
ts=2024-04-12T09:56:23.634Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 09:58:02,187 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 09:58:02,680 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 09:58:02,680 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 09:58:02,749 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.3 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, 
ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}
INFO 2024-04-12 09:58:02,751 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 09:58:16,434 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
WARN 2024-04-12 10:06:16,457 [shard 0:stre] repair - repair[480ff0fc-f8a4-4229-ac6e-dacbbf2f743b]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_attributes, role_members}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown
WARN 2024-04-12 10:06:16,458 [shard 0:stre] repair - repair[480ff0fc-f8a4-4229-ac6e-dacbbf2f743b]: user-requested repair failed: std::runtime_error ({shard 0: std::runtime_error (repair[480ff0fc-f8a4-4229-ac6e-dacbbf2f743b]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_attributes, role_members}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown)})
===========
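Note that the `error|fail` pattern above also matches INFO lines, such as the failure detector start-up messages and the feature list containing TYPED_ERRORS_IN_READ_RPC. A minimal sketch of a narrower filter follows. It assumes Scylla lines start with their severity and the exporter lines carry a `level=...` field, as in the output above; `filter_severe` is a hypothetical helper name, not part of SCT:

```shell
# Keep only lines that are actually logged at warning/error severity:
#  - Scylla lines that begin with "WARN " or "ERROR "
#  - logfmt-style lines (e.g. from the metrics exporter) with level=warn or level=error
filter_severe() {
    grep -E '^(WARN|ERROR) |level=(error|warn)'
}
```

It can be fed from either source, e.g. `docker logs <container-id> 2>&1 | filter_severe`.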
I also grepped through the logs of the db-node docker containers themselves:
❯ docker ps --format 'table {{.ID}}\t{{.Image}}\t{{.CreatedAt}}\t{{.Names}}' | grep scylla-db
31d07e17b832 scylla-sct:scylla-db-687b0d24 2024-04-12 12:06:49 +0200 CEST longevity-1gb-1h-nemesis-dmitriy-db-node-687b0d24-0
2ce4ac7b01d7 scylla-sct:scylla-db-687b0d24 2024-04-12 11:56:26 +0200 CEST longevity-1gb-1h-nemesis-dmitriy-db-node-687b0d24-2
490b929ca57f scylla-sct:scylla-db-687b0d24 2024-04-12 11:56:21 +0200 CEST longevity-1gb-1h-nemesis-dmitriy-db-node-687b0d24-1
❯ docker logs 31d07e17b832 2>&1 | grep -iE 'error|fail|timed|except'
ts=2024-04-12T10:06:51.146Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 10:07:08,449 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 10:07:08,774 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 10:07:08,774 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 10:07:08,831 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.2 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {}
INFO 2024-04-12 10:07:08,834 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 10:07:20,920 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
❯ docker logs 2ce4ac7b01d7 2>&1 | grep -iE 'error|fail|timed|except'
ts=2024-04-12T09:56:27.752Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 09:59:14,473 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 09:59:14,823 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 09:59:14,823 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 09:59:14,930 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.4 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, 
ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}
INFO 2024-04-12 09:59:14,932 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 09:59:28,464 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
WARN 2024-04-12 10:06:27,872 [shard 0:stre] repair - repair[0af3616d-a2fd-4d74-b388-1c34d3f531d4]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_members, role_attributes}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown
WARN 2024-04-12 10:06:27,872 [shard 0:stre] repair - repair[0af3616d-a2fd-4d74-b388-1c34d3f531d4]: user-requested repair failed: std::runtime_error ({shard 0: std::runtime_error (repair[0af3616d-a2fd-4d74-b388-1c34d3f531d4]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_members, role_attributes}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown)})
WARN 2024-04-12 10:46:39,416 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
WARN 2024-04-12 10:46:39,416 [shard 0:stre] gossip - failure_detector_loop: Got error in the loop, live_nodes={172.17.0.3}: seastar::sleep_aborted (Sleep is aborted)
INFO 2024-04-12 10:46:39,416 [shard 0:stre] gossip - failure_detector_loop: Finished main loop
WARN 2024-04-12 10:46:39,426 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
INFO 2024-04-12 10:46:39,429 [shard 0:comp] compaction - [Compact keyspace1.standard1 ef184f10-f8b9-11ee-a7de-7ac0e72dd805] Compacting of 2 sstables interrupted due to: sstables::compaction_stopped_exception (Compaction for keyspace1/standard1 was stopped due to: shutdown)
INFO 2024-04-12 10:46:39,535 [shard 0:goss] rpc - client 172.17.0.3:58811 msg_id 2: exception "gate closed" in no_wait handler ignored
WARN 2024-04-12 10:46:40,327 [shard 0:goss] gossip - === Gossip round FAIL: seastar::gate_closed_exception (gate closed)
INFO 2024-04-12 10:46:40,535 [shard 0:goss] rpc - client 172.17.0.3:58811 msg_id 4: exception "gate closed" in no_wait handler ignored
WARN 2024-04-12 10:46:41,321 [shard 0:main] gossip - Fail to apply application_state: seastar::abort_requested_exception (abort requested)
INFO 2024-04-12 10:46:41,866 [shard 0:main] init - Shutting down direct_failure_detector
INFO 2024-04-12 10:46:41,866 [shard 0:main] init - Shutting down direct_failure_detector was successful
❯ docker logs 490b929ca57f 2>&1 | grep -iE 'error|fail|timed|except'
ts=2024-04-12T09:56:23.634Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
WARN 2024-04-12 09:58:02,187 [shard 0:n/a ] seastar - Creation of perf_event based stall detector failed: falling back to posix timer: std::system_error (error system:1, perf_event_open() failed: Operation not permitted)
INFO 2024-04-12 09:58:02,680 [shard 0:main] init - starting direct failure detector pinger service
INFO 2024-04-12 09:58:02,680 [shard 0:main] init - starting direct failure detector service
INFO 2024-04-12 09:58:02,749 [shard 0:stre] gossip - Feature check passed. Local node 172.17.0.3 features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {AGGREGATE_STORAGE_OPTIONS, ALTERNATOR_TTL, CDC, CDC_GENERATIONS_V2, COLLECTION_INDEXING, COMPUTED_COLUMNS, CORRECT_COUNTER_ORDER, CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, CORRECT_STATIC_COMPACT_IN_MC, COUNTERS, DIGEST_FOR_NULL_VALUES, DIGEST_INSENSITIVE_TO_EXPIRY, DIGEST_MULTIPARTITION_READ, EMPTY_REPLICA_MUTATION_PAGES, EMPTY_REPLICA_PAGES, HINTED_HANDOFF_SEPARATE_CONNECTION, INDEXES, LARGE_COLLECTION_DETECTION, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, LWT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, ME_SSTABLE_FORMAT, NONFROZEN_UDTS, PARALLELIZED_AGGREGATION, PER_TABLE_CACHING, PER_TABLE_PARTITIONERS, RANGE_SCAN_DATA_VARIANT, RANGE_TOMBSTONES, ROLES, 
ROW_LEVEL_REPAIR, SCHEMA_COMMITLOG, SCHEMA_TABLES_V3, SECONDARY_INDEXES_ON_STATIC_COLUMNS, SEPARATE_PAGE_SIZE_AND_SAFETY_LIMIT, STREAM_WITH_RPC_STREAM, SUPPORTS_RAFT_CLUSTER_MANAGEMENT, TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, TOMBSTONE_GC_OPTIONS, TRUNCATION_TABLE, TYPED_ERRORS_IN_READ_RPC, UDA, UDA_NATIVE_PARALLELIZED_AGGREGATION, UNBOUNDED_RANGE_TOMBSTONES, UUID_SSTABLE_IDENTIFIERS, VIEW_VIRTUAL_COLUMNS, WRITE_FAILURE_REPLY, XXHASH}
INFO 2024-04-12 09:58:02,751 [shard 0:stre] gossip - failure_detector_loop: Started main loop
INFO 2024-04-12 09:58:16,434 [shard 0:stre] features - Feature TYPED_ERRORS_IN_READ_RPC is enabled
WARN 2024-04-12 10:06:16,457 [shard 0:stre] repair - repair[480ff0fc-f8a4-4229-ac6e-dacbbf2f743b]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_attributes, role_members}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown
WARN 2024-04-12 10:06:16,458 [shard 0:stre] repair - repair[480ff0fc-f8a4-4229-ac6e-dacbbf2f743b]: user-requested repair failed: std::runtime_error ({shard 0: std::runtime_error (repair[480ff0fc-f8a4-4229-ac6e-dacbbf2f743b]: 2307 out of 2307 ranges failed, keyspace=system_auth, tables={roles, role_attributes, role_members}, repair_reason=repair, nodes_down_during_repair={172.17.0.2}, aborted_by_user=false, failed_because=unknown)})
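The repair warnings above can be condensed to one line per repair id. This is only a sketch, assuming the message layout shown in these logs (`repair[<id>]: N out of M ranges failed, keyspace=..., nodes_down_during_repair={...}`); `summarize_repair_failures` is a hypothetical helper name:

```shell
# Print "<repair id> <keyspace> <failed>/<total> down=<nodes>" for each failed repair.
# The "user-requested repair failed" line embeds the same message, so sort -u
# collapses the resulting duplicates.
summarize_repair_failures() {
    grep -E 'repair\[[0-9a-f-]+\]: [0-9]+ out of [0-9]+ ranges failed' |
    sed -E 's/.*repair\[([0-9a-f-]+)\]: ([0-9]+) out of ([0-9]+) ranges failed, keyspace=([^,]+),.*nodes_down_during_repair=\{([^}]*)\}.*/\1 \4 \2\/\3 down=\5/' |
    sort -u
}
```

For example: `docker logs <container-id> 2>&1 | summarize_repair_failures`.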
The TerminateAndRemoveNodeMonkey nemesis case fails on attempts to repair a node after the disruption, with the error:
Installation details
SCT Version: master
Scylla version: 2024.1.2-0.20240228.2c85a811d0be
Test: longevity-5gb-1h-nemesis
Test config: configurations/nemesis/additional_configs/docker_backend_local.yaml
Logs
TerminateAndRemoveNodeMonkey Jenkins job url