Open cc32d9 opened 1 year ago
The writer in question: https://github.com/EOSChronicleProject/chronos/blob/main/writer/exp_chronos_plugin.cpp
cass_cluster_set_local_port_range(cluster, 49152, 65535);
cass_cluster_set_core_connections_per_host(cluster, scylla_conn_per_host);
cass_cluster_set_request_timeout(cluster, 100000);
cass_cluster_set_num_threads_io(cluster, scylla_io_threads);
cass_cluster_set_queue_size_io(cluster, 1048576);
scylla_conn_per_host is set to 1 and scylla_io_threads to 4.
[48556796.397656] chronos-writer[1981192]: segfault at 8 ip 00007f60800c5af8 sp 00007f5c777fa9d0 error 4 in libscylla-cpp-driver.so.2.16.2-b[7f607ffd2000+293000]
[48556796.397677] Code: 01 00 00 4c 8d 2c d0 0f 85 4e 02 00 00 49 8b 6d 08 4d 8b 75 00 49 39 ee 0f 84 ac 00 00 00 4d 89 f4 49 83 c6 08 4c 39 f5 74 48 <49> 8b 3e e8 00 54 08 00 84 c0 75 eb 49 8b 3c 24 e8 f3 53 08 00 84
@jul-stas ^^
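For anyone trying to map the kernel log line above to a location in the driver, here is a minimal sketch of the arithmetic. It takes the `ip`, mapping start, and mapping size from the log and computes the offset of the faulting instruction inside that mapping. It also decodes `error 4` using the x86 page-fault error-code bits. The helper names (`offset_in_mapping`, `decode_error_code`, `PageFault`) are made up for illustration and are not part of the driver.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the kernel log line in this issue.
constexpr std::uint64_t kFaultIp     = 0x00007f60800c5af8ULL; // "ip"
constexpr std::uint64_t kMappingBase = 0x00007f607ffd2000ULL; // start of the mapping
constexpr std::uint64_t kMappingSize = 0x293000ULL;           // "+293000"
constexpr unsigned      kErrorCode   = 4;                     // "error 4"

// Offset of the faulting instruction from the start of the mapping
// reported by the kernel (hypothetical helper).
std::uint64_t offset_in_mapping(std::uint64_t ip, std::uint64_t base,
                                std::uint64_t size) {
    assert(ip >= base && ip - base < size); // ip must lie inside the mapping
    return ip - base;
}

// x86 page-fault error-code bits: bit 0 = protection violation (clear means
// the page was not present), bit 1 = write access, bit 2 = user mode.
struct PageFault {
    bool protection_violation;
    bool write;
    bool user_mode;
};

PageFault decode_error_code(unsigned code) {
    return PageFault{(code & 1u) != 0, (code & 2u) != 0, (code & 4u) != 0};
}
```

With the values above, `offset_in_mapping(kFaultIp, kMappingBase, kMappingSize)` gives `0xf3af8`, and `decode_error_code(kErrorCode)` describes a user-mode read of a page that is not present. Together with `segfault at 8`, that is consistent with a read through a null pointer at offset 8. The offset is relative to the mapping the kernel reported, which may not be the library's load base, so it may need adjusting before being passed to a symbolizer such as `addr2line`.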
I will collect more data when my current test finishes. But this is what I observed with 2.16.2b:
The setup is a cluster of 4 machines running the latest 5.2 release candidate. The keyspace has replication factor 3. The writer pushes about 30k inserts per second with consistency level QUORUM.
While the client was stopped, I stopped one of the servers. Then I started the client (all 4 servers are configured as cluster contact points). The client complained a bit about failed connections but kept chugging along, since we still had enough replicas for the quorum.
Then I started the server that had been stopped, and the client segfaulted as soon as that server started accepting connections.
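For reference, the quorum arithmetic behind "enough replicas" in this scenario can be sketched as follows. QUORUM needs floor(RF / 2) + 1 replicas, so with replication factor 3 it needs 2, and a single stopped server leaves at least 2 of the 3 replicas of any partition alive. The function names are hypothetical, and the check assumes the worst case where every stopped node is a replica of the same partition.

```cpp
#include <cassert>

// Number of replicas that must respond for QUORUM: floor(rf / 2) + 1.
int quorum(int rf) {
    assert(rf > 0);
    return rf / 2 + 1;
}

// Worst case: every stopped node holds a replica of the same partition.
// Returns true if the remaining live replicas still form a quorum.
bool quorum_reachable(int rf, int nodes_down) {
    int down_replicas = nodes_down < rf ? nodes_down : rf;
    int live_replicas = rf - down_replicas;
    return live_replicas >= quorum(rf);
}
```

Here `quorum(3)` is 2, `quorum_reachable(3, 1)` is true (the situation in this report), and `quorum_reachable(3, 2)` is false.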