Closed pkgonan closed 3 years ago
@avelanarius @haaawk Hi. Is there an expected problem with the above issue?
@pkgonan It might be related to the cross-DC latency. Could you please increase scylla.confidence.window.size to 30s and see if you still observe the problem?
@haaawk Thank you very much. After changing the configuration to scylla.query.time.window.size : 30 and scylla.confidence.window.size : 30, it works well.
@haaawk I think max.response.time = scylla.query.time.window.size + scylla.confidence.window.size.
If the maximum response time is 60 seconds (30s + 30s), that seems too long. I would like to receive CDC events in as near real time as possible. What is the best way to set scylla.query.time.window.size and scylla.confidence.window.size in a multi-datacenter environment?
Try setting them both to 10s. scylla.query.time.window.size should not have a significant effect on the lag; the lag should always be roughly equal to scylla.confidence.window.size. So your lag should be ~10s if you set scylla.confidence.window.size to 10s. You could then experiment with lowering it further, but you should check what your inter-DC latencies are, see how long it takes for a row to show up in another DC, and adjust for that. Be wary that latencies may spike due to network interruptions: the lower you set scylla.confidence.window.size, the higher the chance that a row reaches a remote DC late enough that the connector's query window misses it. It's a tradeoff between consistency and availability. If you're OK with some rows being missed by the connector during network-problematic periods, you may lower the window size.
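The lag figures discussed above can be sketched with a quick back-of-envelope calculation. This is a minimal sketch based on this thread's discussion, not on documented connector internals, and the function name is hypothetical:

```python
# Rough lag model for the Scylla CDC source connector, assuming (per the
# discussion above) that steady-state lag is about scylla.confidence.window.size
# and that the worst-case delay for a change is roughly the sum of both windows.

def estimated_lag_ms(query_window_ms: int, confidence_window_ms: int) -> dict:
    """Return rough lag estimates (in ms) for the given window settings."""
    return {
        "typical_lag_ms": confidence_window_ms,
        "worst_case_delay_ms": query_window_ms + confidence_window_ms,
    }

# Both windows at 10s, as suggested above: typical ~10s, worst case ~20s.
print(estimated_lag_ms(10_000, 10_000))
# The original 30s/30s settings: typical ~30s, worst case ~60s.
print(estimated_lag_ms(30_000, 30_000))
```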
@kbr- Thank you. I have additional questions. I ran some tests based on the environment settings below.
When scylla.query.time.window.size is set to 5000, ScyllaDB receives 500~600 read requests per second. The load from reading the CDC log table is much higher than expected. If a separate Scylla source connector is configured for each table, it seems that far too many requests to read the CDC log will be generated.
We plan to add one Scylla CDC Connector per keyspace to separate internal keyspace access rights. If two tables in one keyspace need CDC log collection, they are registered in the same connector.
One problem arises in this setup. With a single Scylla CDC Connector and scylla.query.time.window.size set to 5000, 500~600 read requests are generated, causing Scylla's load to increase rapidly.
If a CDC Connector is registered for each of the 3 keyspaces, 1500~1800 read requests are generated to read the CDC log.
Is there a way to effectively reduce the read load? Giving up near-real-time collection of CDC logs by increasing scylla.query.time.window.size does not seem like an acceptable way to reduce it.
[Multi DC Environment]
CREATE KEYSPACE IF NOT EXISTS test_service WITH REPLICATION = {
'class' : 'NetworkTopologyStrategy',
'us-west-1' : 3,
'eu-central-1' : 3
};
[Connector Settings]
"tasks.max": "3",
"scylla.query.time.window.size": "5000",
"scylla.confidence.window.size":"2000"
[Traffic in us-west-1]
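The read rates reported above can be approximated with a simple model. This is a hedged sketch: it assumes each CDC stream is polled about once per query window, which is not confirmed by the connector's documentation, and the stream count of 2750 is a hypothetical value chosen only to match the observed ~550 reads/sec:

```python
# Hypothetical read-load model for the CDC connector:
# reads/sec ~= connectors * streams / window_seconds.

def reads_per_second(num_streams: int, query_window_ms: int,
                     num_connectors: int = 1) -> float:
    """Estimate CDC log read requests per second under the polling assumption."""
    return num_connectors * num_streams / (query_window_ms / 1000.0)

print(reads_per_second(2750, 5000))      # one connector: 550.0 reads/sec
print(reads_per_second(2750, 5000, 3))   # three per-keyspace connectors: 1650.0
```

Under this model, tripling the query window cuts the read rate to a third, which is exactly the latency-vs-load tradeoff discussed in this thread.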
I'm not sure what the exact problem is, @pkgonan. If you need the changes to be handled shortly after they appear, the connector needs to frequently check (read) the CDC log.
What's your write throughput? If you're writing a lot, then frequent connector reads are exactly what's needed. If you're not writing much, then the connector may be reading too frequently. We're working on an optimisation for this low-write scenario but it's not there yet. At the moment all you can do is increase scylla.query.time.window.size, which will cause the connector to read less frequently but in bigger chunks.
Hi. CDC events are not sent to Kafka. When we tested in the dev environment (Single DC) it worked well, but in the production environment (Multi DC) it did not.
When create, update, and delete commands are executed on my_table (CDC enabled), CDC log entries are generated in my_table_scylla_cdc_log successfully, but the CDC events are not sent to the Kafka topic. However, heartbeat events are produced to Kafka successfully. (Kafka Topic : __debezium-heartbeat.cdc-data.test)
If an error log were produced, we could tell what the problem is, but it is difficult to diagnose because no error log occurs.
[Versions]
[Configs - Same in all environments.]
[Dev Environment Config - Single DC]
[Production Environment Config - Multi DC]
[Confluent Kafka Connect Log]
[Below is a log that does not occur often and only occurs once.]