Closed: gmungi closed this issue 8 hours ago
hi - i'm wondering if this issue was intended for the Snowflake Kafka connector library instead?
@sfc-gh-dszmolka I am not able to get the Kafka consumer link. Can you please help with this?
Hello @gmungi ,
Could you please create the issue in https://github.com/snowflakedb/snowflake-kafka-connector
Regards, Sujan
Hello @gmungi ,
This issue falls outside the scope of Snowflake Kafka Connector support, as it pertains to tuning the Kafka Connect framework versus other Kafka APIs. For these types of questions, Confluent support is a good resource.
Regards, Sujan
Closing this issue as it should not be here.
Hi All,
We have been observing that the KafkaConsumer API is significantly slower compared to the previous low-level Kafka API we were using (e.g., FetchRequest, FetchResponse, ByteBufferMessageSet). Below is a detailed overview of the issue and the current implementation, along with an explanation of the bottlenecks and potential optimization suggestions.
Performance Issue

Use Case:
The application needs to fetch 1,000 messages starting from a specific user-provided offset and return the next offset (1001) in the response; that offset is then used as input for the subsequent request. Despite setting MAX_POLL_RECORDS_CONFIG=1000, each poll fetches only ~300 records in ~2 seconds, so fetching 1,000 records typically takes ~4 polls and ~8–10 seconds in total.
I have tried different consumer settings such as MAX_PARTITION_FETCH_BYTES_CONFIG, FETCH_MIN_BYTES_CONFIG, and MAX_POLL_RECORDS_CONFIG, including increasing max poll records, but within a 2-second poll the consumer still cannot fetch 1,000 records and sometimes returns 0 records.
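One thing worth noting: MAX_POLL_RECORDS_CONFIG only caps how many records a single poll() returns from the consumer's already-fetched buffer; how much data each network fetch brings back is governed by the fetch.* settings. A minimal sketch of the fetch-related properties that usually matter here, with purely illustrative values (assumptions to tune, not tested recommendations):

```java
// Illustrative values only; tune for your brokers and message sizes.
props.setProperty(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, "1048576");           // wait for ~1 MB per fetch
props.setProperty(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "500");             // max broker wait before replying
props.setProperty(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, "5242880"); // up to 5 MB per partition
props.setProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1000");             // cap on records returned per poll()
```

Raising fetch.min.bytes together with fetch.max.wait.ms lets the broker batch more data per fetch response instead of replying as soon as any data is available.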
Observed Delays:
- Consumer Assignment and Seeking: the time taken by consumer.assign() and consumer.seek() adds to the overall latency.
- Polling: consumer.poll() often returns fewer records than expected, requiring multiple iterations to reach the required batch size.
Comparison with Low-Level API: The low-level Kafka API (e.g., FetchRequest and FetchResponse) performs better, with reduced latency for fetching records. It appears to bypass some of the high-level abstractions (e.g., consumer group coordination and offset management) that introduce overhead.
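If group coordination is the suspected overhead, note that a KafkaConsumer only joins a consumer group when subscribe() is used; manual assignment with assign() plus seek() skips the group protocol and offset management entirely, which is the closest high-level analogue of the old FetchRequest flow. A minimal fragment (the topic name, partition, and startOffset are placeholders):

```java
// Manual assignment: no group join, no rebalance; with auto-commit disabled
// and no explicit commits, the group coordinator is never involved.
TopicPartition tp = new TopicPartition("my-topic", 0);  // placeholder topic/partition
consumer.assign(Collections.singletonList(tp));
consumer.seek(tp, startOffset);                         // position at the caller-supplied offset
```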
Consumer Creation Method:
```java
public static KafkaConsumer<String, String> createConsumer(String clientName, int fetchSize) {
    Properties props = new Properties();
    String kafkaBrokerStr = Config.getConsumerPropValue("kafkabrokerslist");
    String groupId = Config.getConsumerPropValue("group.id");
    props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaBrokerStr);
    props.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    props.setProperty(ConsumerConfig.CLIENT_ID_CONFIG, clientName);
    props.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    props.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.setProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1000");
    // Security and additional properties...
    return new KafkaConsumer<>(props);
}
```
```java
public List<String> consume(long offset, String topicName, int partition, CEConsumeRequest inputReq) throws CustomException {
    List<String> msglist = new ArrayList<>();
```
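For reference, here is a sketch of how that poll loop might accumulate the requested batch under a deadline, reusing createConsumer from above; the method name consumeBatch, the client name, and the 10-second budget are assumptions for illustration, not the original implementation:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public static List<String> consumeBatch(long offset, String topicName, int partition) {
    final int batchSize = 1000;                    // from the use case above
    List<String> msglist = new ArrayList<>();
    TopicPartition tp = new TopicPartition(topicName, partition);
    try (KafkaConsumer<String, String> consumer = createConsumer("batch-reader", batchSize)) {
        consumer.assign(Collections.singletonList(tp));
        consumer.seek(tp, offset);                 // start at the caller-supplied offset
        long nextOffset = offset;
        long deadline = System.currentTimeMillis() + 10_000;  // assumed 10 s overall budget
        while (msglist.size() < batchSize && System.currentTimeMillis() < deadline) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> rec : records) {
                msglist.add(rec.value());
                nextOffset = rec.offset() + 1;     // hand this back as the next request's offset
                if (msglist.size() == batchSize) {
                    break;
                }
            }
        }
        // nextOffset would be returned alongside msglist in the real response object
    }
    return msglist;
}
```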
Please suggest how to improve this.