Hi,
we are working on connecting SAP to Kafka via JCo, using kafka-connect-sap on a Kubernetes pod.
Looking at the example, why do you use the Debezium Kafka Connect Docker image rather than a plain Kafka Connect image? This is confusing, since the connector does not provide CDC features.
Can we configure the number of rows fetched by the SELECT in "table mode", and the delay between each poll, to increase the throughput?
Many of the standard SAP tables are not partitioned (e.g. MARC, EKKO). It seems the connector cannot dispatch messages across multiple partitions. Is that right?
One reason the Debezium base image is used in some of the examples is that the HANA connector is frequently used as a sink connector alongside another Debezium CDC source connector. In that case, the HANA sink connector can use the Debezium SMT to convert CDC records into plain row records. The repository contains examples with both Strimzi and Debezium, but you can use any base image suitable for your environment.
The maximum number of records per poll can be configured with batch.max.rows. In general, polling as many rows as possible per batch should perform better than using a smaller batch size. However, there are currently a few known issues that limit throughput and still need to be addressed.
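As a rough illustration, a source connector config could set batch.max.rows like this. Note this is a hedged sketch: apart from batch.max.rows, the property names (connector class, connection settings, poll interval) are assumptions based on typical Kafka Connect source configs, so please check them against the kafka-connect-sap README for your version.

```json
{
  "name": "hana-source-example",
  "config": {
    "connector.class": "com.sap.kafka.connect.source.hana.HANASourceConnector",
    "connection.url": "jdbc:sap://<host>:<port>/",
    "topics": "marc_topic",
    "batch.max.rows": "500",
    "poll.interval.ms": "10000"
  }
}
```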
Correct. Currently, there is a one-to-one relationship between the source table's partitions and the Kafka topic partitions. This restriction guarantees the ordering of the rows within each partition. Someone has asked to loosen this restriction, and we considered adding an option to define a partition function to customise the partitioning, but we haven't looked into it further because, if the source table has a single partition and someone wants to use multiple Kafka partitions, they can write their own custom Kafka SMT to set the partition of those records. Would that also be an option for your use case?
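To make the custom-SMT idea concrete, here is a minimal sketch of the core partition-assignment logic such an SMT could apply. The class and method names are hypothetical; a real SMT would implement the org.apache.kafka.connect.transforms.Transformation interface and set the computed partition on each record, which is omitted here to keep the sketch self-contained.

```java
// Hypothetical helper illustrating hash-based partition assignment,
// the kind of logic a custom Kafka Connect SMT could use to spread
// records from a single-partition source table over several Kafka
// topic partitions.
public class KeyHashPartitioner {

    // Map a record key to a target partition: same key -> same
    // partition, so per-key ordering is preserved.
    public static int computePartition(String key, int numPartitions) {
        if (key == null) {
            return 0; // keyless records fall back to partition 0 here
        }
        // mask the sign bit so the result is always non-negative
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // e.g. a MARC row keyed by material/plant, spread over 6 partitions
        System.out.println("partition=" + computePartition("MARC-0001", 6));
    }
}
```

One trade-off to keep in mind: once records with the same key land in different partitions (e.g. by round-robin instead of hashing), the per-partition ordering guarantee mentioned above no longer holds for that key.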
Thanks