Closed ghaithSN closed 6 years ago
Hi there. Regarding Schema Registry: the current API only lets you retrieve your schema when reading your data, as shown here. Schema registration, however, is done "offline", as explained here.
The next version, which will be available by the end of the month, will allow you to perform schema management (registering and updating your schemas) from the API at write time.
As for toAvro(rows: Dataset[Row], schemas: SchemasProcessor): this is a private method, not part of the API. Its job is to carry the creation of the corresponding Spark and Avro schemas to the partitions through the .mapPartitions method.
Hi there, ABRiS now has an API that allows you to append the id of your schema to the Avro payload, so that it can be consumed by Confluent tools. Also, the schema is automatically registered if not yet in Schema Registry or updated otherwise.
You will find an example here.
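For readers unfamiliar with what "attaching the schema id" means on the wire, here is a minimal sketch in plain Scala. It assumes the Confluent wire format, in which a zero magic byte and a 4-byte big-endian schema id are placed in front of the Avro binary payload. The object and method names (ConfluentFraming, attachSchemaId, splitSchemaId) are hypothetical and chosen for illustration; this is not the ABRiS implementation and does not talk to Schema Registry.

```scala
import java.nio.ByteBuffer

// Hypothetical helper, for illustration only (not part of ABRiS).
object ConfluentFraming {
  // Assumed layout: [magic byte 0][4-byte big-endian schema id][Avro binary payload]
  val MagicByte: Byte = 0x0

  // Puts the magic byte and the schema id in front of an already-encoded Avro payload.
  def attachSchemaId(schemaId: Int, avroPayload: Array[Byte]): Array[Byte] = {
    val buffer = ByteBuffer.allocate(1 + 4 + avroPayload.length) // ByteBuffer is big-endian by default
    buffer.put(MagicByte)
    buffer.putInt(schemaId)
    buffer.put(avroPayload)
    buffer.array()
  }

  // Reverses attachSchemaId: returns the schema id and the raw Avro payload.
  def splitSchemaId(message: Array[Byte]): (Int, Array[Byte]) = {
    require(message.length >= 5, "message is too short to contain a header")
    val buffer = ByteBuffer.wrap(message)
    require(buffer.get() == MagicByte, "unexpected magic byte")
    val schemaId = buffer.getInt()
    val payload = new Array[Byte](buffer.remaining())
    buffer.get(payload)
    (schemaId, payload)
  }
}
```

A consumer would call splitSchemaId on each record value, look up the writer schema for the returned id in Schema Registry, and then decode the remaining bytes with that schema.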
thank you @felipemmelo
Seems to be solved, thus closing.
I'm not sure I understood correctly. For this signature
toAvro(schemaName: String, schemaNamespace: String): Dataset[Array[Byte]]
the namespace should already exist in Schema Registry, but the schemaName is a name we give to our new schema (in the Spark application). Unfortunately, when I did that, I found that the schema was not added. Also, for the signature 'toAvro(rows: Dataset[Row], schemas: SchemasProcessor)', could you give me some insight into how to prepare the 'schemas' parameter? SchemasProcessor has only two getters.