Open jrkinley opened 1 year ago
The consumer section only has points 1-5. Do you mean point 8 in the producer section (i.e. use of the `schemaid` variable)?
I understand wanting to validate messages on the way in, but it's not obvious to me why someone would need to register a new schema at the same time as producing a message. Why wouldn't they use the schema registry API to register the schema?
@jcsp while I agree with you, I think the Pandaproxy should support both options in order to be compatible with Confluent's schema registry here (if that is the goal).
I don't see the `value_schema` version (creating a schema inline with a produce) in the linked confluent docs page? I believe you, but it would be good to have a link to the docs that describe it.
CC @mattschumpert for awareness on the question of whether our API should aim to be wire-compatible with confluent's -- this particular request (creating schemas via pandaproxy) is an example of something that seems quirky and we probably wouldn't do otherwise.
@jcsp see the examples in their proxy quick start guide: https://docs.confluent.io/platform/current/kafka-rest/quickstart.html#produce-and-consume-avro-messages
There may be a misunderstanding, we'd want to send with `value_schema_id`, not the actual schema. The website didn't quite point to the right place on that.
Upvoting the issue.
Also need such a feature.
Produce and Consume Avro Messages
Produce a message using Avro embedded data, including the schema which will be registered with schema registry and used to validate and serialize before storing the data in Kafka
curl -X POST -H "Content-Type: application/vnd.kafka.avro.v2+json" \
     -H "Accept: application/vnd.kafka.v2+json" \
     --data '{"value_schema": "{\"type\": \"record\", \"name\": \"User\", \"fields\": [{\"name\": \"name\", \"type\": \"string\"}]}", "records": [{"value": {"name": "testUser"}}]}' \
     "http://localhost:8082/topics/avrotest"
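For clients that build this request in code rather than with curl, it may help to note that `value_schema` is a JSON string embedded inside the JSON body, so the schema ends up encoded twice. Below is a minimal Python sketch using only the standard library. The helper name `build_avro_produce_request` is hypothetical; the URL, topic, and headers are copied from the curl command above. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request

AVRO_CONTENT_TYPE = "application/vnd.kafka.avro.v2+json"
ACCEPT = "application/vnd.kafka.v2+json"


def build_avro_produce_request(base_url, topic, schema, records):
    """Build (but do not send) a produce request with an inline schema.

    Hypothetical helper for illustration only.
    """
    body = {
        # The schema is serialised to a string first, then the whole
        # body is serialised again below: hence the escaped quotes
        # seen in the curl example.
        "value_schema": json.dumps(schema),
        "records": [{"value": record} for record in records],
    }
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": AVRO_CONTENT_TYPE, "Accept": ACCEPT},
        method="POST",
    )


user_schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
}

request = build_avro_produce_request(
    "http://localhost:8082", "avrotest", user_schema, [{"name": "testUser"}]
)

# To actually send it against a running proxy:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Building the body with `json.dumps` twice avoids hand-escaping the inner quotes, which is the easiest part of the curl command to get wrong.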
Pandaproxy does not provide the ability to validate incoming messages using Avro, Protobuf, or JSON schema stored in the schema registry.
This feature request is to add support for the `value_schema` and `value_schema_id` fields so that HTTP clients can post either the full schema or an existing schema ID alongside the records. Pandaproxy shall register the provided schema with the schema registry (in the case of `value_schema`) or retrieve the existing schema (in the case of `value_schema_id`) and use the schema to validate and serialise messages before storing them in Redpanda.

For example, this `curl` command should result in the `value_schema` schema being registered in the schema registry and used to validate and serialise the list of records before storing them in Redpanda. Pandaproxy should include the `value_schema_id` in the response:

In subsequent messages only `{"value_schema_id": 1, "records":[...]}` need be provided and Pandaproxy will fetch the corresponding schema from the schema registry if it isn't cached.

JIRA Link: CORE-1018
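The schema-resolution flow requested in this issue could be sketched roughly as follows. This is not Pandaproxy's implementation (the feature does not exist yet); `InMemoryRegistry` and `resolve_schema` are hypothetical names, the registry is a plain in-memory stand-in that matches schemas by exact text, and validation and serialisation of the records are left out entirely. It only shows how a request with `value_schema` yields an ID that later requests can pass as `value_schema_id`.

```python
import json


class InMemoryRegistry:
    """Toy stand-in for the schema registry (assumption, not a real API)."""

    def __init__(self):
        self._text_by_id = {}
        self._id_by_text = {}

    def register(self, schema_text):
        # Registering identical schema text again returns the same ID.
        if schema_text not in self._id_by_text:
            new_id = len(self._text_by_id) + 1
            self._id_by_text[schema_text] = new_id
            self._text_by_id[new_id] = schema_text
        return self._id_by_text[schema_text]

    def fetch(self, schema_id):
        return self._text_by_id[schema_id]


def resolve_schema(body, registry, cache):
    """Return (schema_id, schema_text) for a produce request body."""
    if "value_schema" in body:
        # Inline schema: register it and obtain its ID.
        schema_text = body["value_schema"]
        schema_id = registry.register(schema_text)
    elif "value_schema_id" in body:
        # Existing ID: use the cache, falling back to the registry.
        schema_id = body["value_schema_id"]
        schema_text = cache.get(schema_id) or registry.fetch(schema_id)
    else:
        raise ValueError("expected value_schema or value_schema_id")
    cache[schema_id] = schema_text
    return schema_id, schema_text


registry = InMemoryRegistry()
cache = {}

user_schema = json.dumps(
    {"type": "record", "name": "User",
     "fields": [{"name": "name", "type": "string"}]}
)

# First request carries the full schema.
first = {"value_schema": user_schema,
         "records": [{"value": {"name": "testUser"}}]}
first_id, _ = resolve_schema(first, registry, cache)

# Subsequent requests only carry the ID returned by the first one.
second = {"value_schema_id": first_id,
          "records": [{"value": {"name": "anotherUser"}}]}
second_id, schema_text = resolve_schema(second, registry, cache)
```

A real registry would also handle subjects, versions, and compatibility checks; those are outside what this issue describes and are not modelled here.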