ably / kafka-connect-ably

Kafka Connector for publishing data from Kafka to Ably
https://ably.com/solutions/extend-kafka-to-the-edge
Apache License 2.0

Write integration tests for data conversions for schema #77

Open ikbalkaya opened 2 years ago

ikbalkaya commented 2 years ago

Currently we use EmbeddedConnectCluster to run our integration tests. This lets us run tests without having to provision an external cluster.
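For context, a minimal sketch of the EmbeddedConnectCluster setup (class names come from Kafka's connect-runtime test jar; the connector class, topic and config values here are illustrative, not our actual test code):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster;

public class ConnectorIT {
    public static void main(String[] args) {
        // Start an in-process Connect cluster backed by an embedded Kafka broker
        EmbeddedConnectCluster connect = new EmbeddedConnectCluster.Builder()
                .name("ably-connect-cluster")
                .numWorkers(1)
                .build();
        connect.start();

        // Create the topic the sink connector will consume from
        connect.kafka().createTopic("test-topic", 1);

        // Configure the sink connector (class name and settings are illustrative)
        Map<String, String> props = new HashMap<>();
        props.put("connector.class", "com.ably.kafka.connect.ChannelSinkConnector");
        props.put("topics", "test-topic");
        props.put("tasks.max", "1");
        connect.configureConnector("ably-sink", props);

        // Produce a plain string record -- note there is no schema registry
        // involved here, which is exactly the limitation discussed below
        connect.kafka().produce("test-topic", "key", "value");

        connect.stop();
    }
}
```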

However, it currently does not appear to be possible to exchange data with a schema registry, or to produce data with a schema (an Avro schema, in the case I tried).

While trying to find a way to write unit tests, I found myself using classes from https://github.com/confluentinc/schema-registry. It is easy to serialize, deserialize and convert messages with Avro using this repo, as it contains the AvroConverter itself. It also has a MockSchemaRegistryClient that can be used to exchange schemas between producers and consumers in unit tests.
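A sketch of the round trip this enables, assuming the confluent schema-registry artifacts are on the test classpath (the `mock://test` registry scope and the topic name are illustrative):

```java
import java.util.Map;
import io.confluent.connect.avro.AvroConverter;
import io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaAndValue;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;

public class AvroRoundTrip {
    public static void main(String[] args) {
        // Back the converter with an in-memory registry instead of an HTTP service
        AvroConverter converter = new AvroConverter(new MockSchemaRegistryClient());
        converter.configure(Map.of("schema.registry.url", "mock://test"), false);

        Schema schema = SchemaBuilder.struct()
                .field("name", Schema.STRING_SCHEMA)
                .build();
        Struct value = new Struct(schema).put("name", "ably");

        // Serialize to the registry wire format (magic byte + schema id + Avro payload)
        byte[] bytes = converter.fromConnectData("test-topic", schema, value);

        // Deserialize back, resolving the schema through the mock registry
        SchemaAndValue roundTrip = converter.toConnectData("test-topic", bytes);
        System.out.println(roundTrip.value());
    }
}
```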

I think this particular repo is worth checking further for embedded classes / utilities that wire up schemas with producers and connectors, and for utilities that send data with a schema to Kafka.

It is also worth checking whether we can use MockSchemaRegistryClient in our current test setup.


lmars commented 2 years ago

@ikbalkaya I provided a few pointers for how we might do this in our internal Slack channel. Can you explain here what you tried and what you couldn't quite get to work, so that anyone picking this up in future can build on your learnings?

ikbalkaya commented 2 years ago

@lmars I have summarized my findings above, along with some potential options we could try. Hopefully it is clear.

lmars commented 2 years ago

@ikbalkaya here are the pointers I provided in the Slack channel:

With regards to starting a schema registry:

Here's how the schema-registry tests start one for an integration test: https://github.com/confluentinc/schema-registry/blob/master/core/src/test/java/io/confluent/kafka/schemaregistry/RestApp.java#L71-L87

and with regards to producing Avro-encoded data:

if we look at the underlying implementation of produce, I think we can just replicate that in our test code? https://github.com/apache/kafka/blob/3.1.0/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedKafkaCluster.java#L407-L414

it's just converting the strings to bytes anyway

we could init our own producer like this: https://github.com/apache/kafka/blob/3.1.0/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedKafkaCluster.java#L156-L163
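The two linked snippets could plausibly be combined like this: a `byte[]`/`byte[]` producer pointed at the embedded broker, fed by a KafkaAvroSerializer that registers schemas via the `mock://` registry scope. This is a sketch under those assumptions; the bootstrap address, topic and record shape are illustrative:

```java
import java.util.Map;
import java.util.Properties;
import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class AvroProducerSketch {
    public static void main(String[] args) {
        // Mirror EmbeddedKafkaCluster's producer setup, but under our control
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // embedded broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);

        // Serialize the value ourselves; mock:// avoids needing a real registry
        KafkaAvroSerializer serializer = new KafkaAvroSerializer();
        serializer.configure(Map.of("schema.registry.url", "mock://test"), false);

        org.apache.avro.Schema schema = SchemaBuilder.record("Message").fields()
                .requiredString("name")
                .endRecord();
        GenericRecord record = new GenericData.Record(schema);
        record.put("name", "ably");
        byte[] value = serializer.serialize("test-topic", record);

        // EmbeddedKafkaCluster's produce() ultimately does the same send with string bytes
        producer.send(new ProducerRecord<>("test-topic", null, value));
        producer.flush();
        producer.close();
    }
}
```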

Did you try out any of these approaches?

ikbalkaya commented 2 years ago

@lmars I tried both, but wasn't able to combine all the elements. As far as I remember, with RestApp it wasn't possible to connect the embedded Connect and Kafka clusters to it. Producing Avro-compatible data has a similar issue: we can produce Avro-compatible data, but it wasn't possible to add an intermediate schema registry to exchange schemas between the producer and the sink connector. I was looking for an embedded schema registry server to use, but couldn't find one, or the one I found wasn't compatible with the embedded Kafka cluster. I was hoping to look into this later, as it was taking up a lot of my time. But if nothing is available, I might be able to create a compatible embedded schema registry that plays well with our current test setup.