scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

"Parent connection is aborting on shard" should not be raised as error event #8693

Open soyacz opened 6 days ago

soyacz commented 6 days ago

As per discussion in https://github.com/scylladb/scylla-enterprise/issues/4691#issuecomment-2348867638, errors like:

Sep 05 19:19:50.107534 longevity-mv-si-4d-2024-2-db-node-608c20c6-5 scylla[7237]:  [shard 24:strm] rpc - client 10.12.11.11:50184: server stream connection dropped: Unknown parent connection 39d40018 on shard 24
Sep 05 19:19:50.115196 longevity-mv-si-4d-2024-2-db-node-608c20c6-5 scylla[7237]:  [shard 23:strm] rpc - client 10.12.11.11:49613: server stream connection dropped: Parent connection 38350017 is aborting on shard 23
Sep 05 19:19:50.115232 longevity-mv-si-4d-2024-2-db-node-608c20c6-5 scylla[7237]:  [shard 23:strm] rpc - client 10.12.11.11:52973: server stream connection dropped: Parent connection 38350017 is aborting on shard 23

should not be raised as ErrorEvent.

kbr-scylla commented 1 day ago

@gleb-cloudius please give your opinion -- doesn't this signify an internal problem with the Seastar RPC implementation? Especially the "Unknown parent connection" thing looks like it should never happen.

(If yes, we could still treat it as error event, just a different one from the original "aborting on shard")

gleb-cloudius commented 1 day ago

Looks like both of those errors are results of the same race. "Parent connection is aborting" happens when the connection we are opening a stream on is closing, and "Unknown parent connection" happens when the parent connection was already closed. Both can happen and both are benign. Stream opening should fail.

kbr-scylla commented 1 day ago

Ok, so this is no reason for error event.
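
For illustration, a minimal sketch of how the two benign messages quoted above could be recognized in a log line. This is not the scylla-cluster-tests implementation: the pattern and the helper name `is_benign_stream_drop` are hypothetical, and the regex is derived only from the three log lines in this issue.

```python
import re

# Hypothetical pattern, derived only from the log lines quoted in this issue.
# It covers both outcomes of the race described above:
#   - "Parent connection <id> is aborting on shard N" (parent is closing)
#   - "Unknown parent connection <id> on shard N"     (parent already closed)
BENIGN_STREAM_DROP = re.compile(
    r"rpc - client \S+: server stream connection dropped: "
    r"(?:Unknown parent connection [0-9a-f]+"
    r"|Parent connection [0-9a-f]+ is aborting) "
    r"on shard \d+"
)


def is_benign_stream_drop(line: str) -> bool:
    """Return True if the log line is one of the two benign stream-drop messages."""
    return BENIGN_STREAM_DROP.search(line) is not None


if __name__ == "__main__":
    sample = (
        "Sep 05 19:19:50.115196 longevity-mv-si-4d-2024-2-db-node-608c20c6-5 "
        "scylla[7237]:  [shard 23:strm] rpc - client 10.12.11.11:49613: "
        "server stream connection dropped: "
        "Parent connection 38350017 is aborting on shard 23"
    )
    print(is_benign_stream_drop(sample))
```

A check like this would only decide whether a line belongs to the benign group; how such lines are then downgraded or suppressed depends on the event-filtering mechanism the test framework already has.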