"I am trying to understand the effect of schema registry on our pipeline's performance. In order to do sowe created a very simple pipeline that reads from kafka, runs a simple transformation of adding new field and writes of kafka. the messages are in avro format
I ran this pipeline with 3 different options on the same configuration (1 Kafka partition, 1 task manager, 1 slot, parallelism 1):
When I used Apicurio as the schema registry, I was able to process only 2000 messages per second.
When I used Confluent Schema Registry, I was able to process 7000 messages per second.
When I did not use any schema registry and used the plain Avro deserializer/serializer, I was able to process 30K messages per second."
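To make the gap easier to reason about, the reported throughputs can be converted into average time per message. The sketch below only does that arithmetic on the three numbers above: roughly 500 µs per message with Apicurio, 143 µs with Confluent Schema Registry, and 33 µs with plain Avro. The class and method names are hypothetical and are not part of Beam or of either registry client. The calculation assumes the single-slot pipeline handles one message at a time, so it says nothing about where the extra time is actually spent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: turns the throughputs reported in this issue into per-message cost.
// Names are hypothetical; nothing here calls Beam, Kafka, or a schema registry.
class ThroughputOverhead {

    // Average time per message in microseconds at the given throughput.
    static double perMessageMicros(double messagesPerSecond) {
        return 1_000_000.0 / messagesPerSecond;
    }

    // Extra time per message, in microseconds, compared with a faster baseline run.
    static double overheadMicros(double messagesPerSecond, double baselinePerSecond) {
        return perMessageMicros(messagesPerSecond) - perMessageMicros(baselinePerSecond);
    }

    public static void main(String[] args) {
        // Baseline: plain Avro deserializer/serializer, no schema registry.
        double baseline = 30_000.0;

        // Throughputs as reported in the issue, in messages per second.
        Map<String, Double> runs = new LinkedHashMap<>();
        runs.put("Apicurio", 2_000.0);
        runs.put("Confluent Schema Registry", 7_000.0);
        runs.put("Plain Avro (no registry)", baseline);

        for (Map.Entry<String, Double> run : runs.entrySet()) {
            double rate = run.getValue();
            System.out.printf(
                "%-28s %8.1f us/msg   overhead %6.1f us   slowdown %.1fx%n",
                run.getKey(),
                perMessageMicros(rate),
                overheadMicros(rate, baseline),
                baseline / rate);
        }
    }
}
```

Read this way, each registry-backed run adds on the order of 100 to 470 µs per message over the plain Avro run, which is a useful target when profiling the deserializer and serializer paths.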
What needs to happen?
From email thread:
"I have made the suggested change and used ConfluentSchemaRegistryDeserializerProvider. The results are slightly better: an average of 8000 msg/sec."
We need to investigate and find the cause of this performance issue.
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components