Azure / azure-schema-registry-for-kafka

Kafka support for Azure Schema Registry.
https://aka.ms/schemaregistry
MIT License
13 stars 20 forks

Registration of schema on the Azure Schema Registry in Kafka Connect scenario #50

Open AndrijaRaguz opened 1 year ago

AndrijaRaguz commented 1 year ago

Could you tell me whether it is possible to set up a source connector in a Kafka Connect scenario so that it does not automatically register schemas on the Azure Schema Registry? I ask because I'm currently using the com.microsoft.azure.schemaregistry.kafka.avro.AvroConverter class as the value converter in my source connector. I'm using Kafka Connect to ingest data from a PostgreSQL database into a Kafka topic, and I'm storing schemas in Azure Schema Registry. Automatic schema registration is not considered best practice, so I would like to avoid it. As far as I can see, the auto.register.schemas option was removed from the connector configuration, so I'm wondering: is there an alternative way to prevent the connector from automatically registering schemas?

This is the configuration of my source connector:

```json
{
  "name": "jdbc-postgresql-avro-connector-azure",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "value.converter": "com.microsoft.azure.schemaregistry.kafka.avro.AvroConverter",
    "value.converter.schema.registry.url": {schemaRegistryUrl},
    "value.converter.schema.group": "postgreskafkaschemagroup",
    "value.converter.tenant.id": {tenantId},
    "value.converter.client.id": {clientId},
    "value.converter.client.secret": {clientSecret},
    "connection.url": {postgresConnectionUrl},
    "connection.user": {user},
    "connection.password": {password},
    "connection.attempts": 3,
    "mode": "incrementing",
    "query": "SELECT * FROM users",
    "table.types": "TABLE",
    "topic.prefix": "users",
    "incrementing.column.name": "id"
  }
}
```

I would appreciate any help.

OneCricketeer commented 1 year ago

You'd need to edit this line to pull the boolean from AvroConverterConfig rather than have it be hard-coded to true

https://github.com/Azure/azure-schema-registry-for-kafka/blob/master/java/avro-converter/src/main/java/com/microsoft/azure/schemaregistry/kafka/connect/avro/AvroConverter.java#L75
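A sketch of what that change could look like. Note this is hypothetical: neither the `auto.register.schemas` property name nor the getter exists in AvroConverterConfig today, and the surrounding code is abbreviated:

```java
// Hypothetical sketch only. Today the converter effectively passes `true`
// for auto-registration at the linked line; the fix would thread a
// config flag through instead.

// 1. In AvroConverterConfig, define and parse a new boolean property
//    (name is made up here):
//      public static final String AUTO_REGISTER_SCHEMAS_CONFIG = "auto.register.schemas";
//      public boolean getAutoRegisterSchemas() { /* read parsed value, default true */ }

// 2. In AvroConverter.configure(...), capture the flag:
private boolean autoRegisterSchemas;

@Override
public void configure(Map<String, ?> configs, boolean isKey) {
    AvroConverterConfig config = new AvroConverterConfig(configs);
    // hypothetical getter; defaults to true to preserve current behavior
    this.autoRegisterSchemas = config.getAutoRegisterSchemas();
    // ... rest of existing configure logic unchanged ...
}

// 3. At the linked line, replace the hard-coded `true` with
//    `this.autoRegisterSchemas` when constructing the serializer.
```

With that in place, setting `"value.converter.auto.register.schemas": "false"` in the connector config would disable registration.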

zhaozy93 commented 11 months ago

@OneCricketeer Have you successfully loaded com.microsoft.azure.schemaregistry.kafka.connect.avro.AvroConverter? I'm confused about how to load this plugin into the cluster.

OneCricketeer commented 11 months ago

It's no different than other plugins mentioned in the Kafka documentation.

Build the jar with mvn package, then copy it to your Connect workers under a directory listed in their plugin.path config.
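Roughly like this (paths are illustrative; adjust to your worker layout):

```shell
# Build the converter jar (run from the java/avro-converter module)
mvn clean package

# Copy the resulting jar onto every Connect worker, into a directory
# that sits under the worker's plugin.path
cp target/*.jar /usr/local/share/kafka/plugins/azure-avro-converter/

# In the worker config (e.g. connect-distributed.properties):
#   plugin.path=/usr/local/share/kafka/plugins

# Restart the workers so the plugin directory is rescanned
```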

zhaozy93 commented 11 months ago

Thanks @OneCricketeer, that fixed it. Do you know why the schema parameter I get in the fromConnectData method is always a string? Do you know of any example input sources? Thanks in advance.

OneCricketeer commented 11 months ago

Unfortunately, I can't help without seeing your code.

If you use StringSerializer in a producer, you'll get a String-type Connect Schema. Try using a structured event instead
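To illustrate the difference, a sketch using the Kafka Connect data API (this is what the converter sees, not converter-specific code):

```java
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;

// A plain string value reaches fromConnectData with a STRING schema:
Schema stringSchema = Schema.STRING_SCHEMA;   // type STRING, no fields

// A structured event instead carries a STRUCT schema with named fields,
// which is what an Avro converter needs to build a record schema:
Schema userSchema = SchemaBuilder.struct().name("User")
        .field("id", Schema.INT64_SCHEMA)
        .field("name", Schema.STRING_SCHEMA)
        .build();

Struct user = new Struct(userSchema)
        .put("id", 42L)
        .put("name", "alice");
```

If your source only ever emits strings, the converter has nothing but a string schema to work with.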

zhaozy93 commented 11 months ago

Thanks. I am trying to test this since our schemas are stored in Azure, and I'm trying to read Avro data from one Azure Event Hub and distribute it to other Event Hubs. It seems we need to define the schema in the poll method on the source connector side, when inserting each message into the SourceRecord array.
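Right, that's where the schema gets attached in a custom source task. A minimal sketch (class, field, and topic names are made up for illustration):

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class ExampleSourceTask extends SourceTask {
    // Declare the value schema once; the configured converter
    // (e.g. the Azure AvroConverter) serializes records against it.
    private static final Schema VALUE_SCHEMA = SchemaBuilder.struct().name("Event")
            .field("id", Schema.INT64_SCHEMA)
            .field("payload", Schema.STRING_SCHEMA)
            .build();

    @Override
    public List<SourceRecord> poll() {
        Struct value = new Struct(VALUE_SCHEMA)
                .put("id", 1L)
                .put("payload", "hello");
        Map<String, ?> partition = Collections.singletonMap("source", "example");
        Map<String, ?> offset = Collections.singletonMap("position", 1L);
        // Attach VALUE_SCHEMA here so fromConnectData receives a Struct
        // schema rather than a plain string
        return Collections.singletonList(
                new SourceRecord(partition, offset, "example-topic", VALUE_SCHEMA, value));
    }

    @Override public void start(Map<String, String> props) { }
    @Override public void stop() { }
    @Override public String version() { return "0.0.1"; }
}
```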

OneCricketeer commented 11 months ago

The schema or its ID should be encoded within the event itself. The converter class is not intended to be used standalone, only within a connector, such as the JDBC Source from the original question. Are you trying to write your own? Perhaps you can start a new issue thread? @zhaozy93