Who is this for and what problem do they have today?
A shortcoming of the pandaproxy-rest API, shared with the Kafka REST API it mirrors, is the apparent inability to consume a stream of records when subscribed to a topic. Instead, the GET /consumers/{}/instances/{}/records API must be polled repeatedly to consume new records.
In addition, a separate request is required to establish the consumer instance, and each instance must subsequently be deleted once its client no longer needs it. Furthermore, if offsets are to be committed, they must be posted in yet another HTTP request.
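To make the burden concrete, the current flow looks roughly like the following sketch against the Kafka REST v2-style endpoints that pandaproxy implements (host, port, group, and consumer names are illustrative):

```shell
# 1. Create a consumer instance in group "my-group".
curl -s -X POST "http://localhost:8082/consumers/my-group" \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  -d '{"name": "my-consumer", "format": "json", "auto.offset.reset": "earliest"}'

# 2. Subscribe the instance to a topic in a second request.
curl -s -X POST "http://localhost:8082/consumers/my-group/instances/my-consumer/subscription" \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  -d '{"topics": ["end-device-events"]}'

# 3. Poll for records -- this request must be repeated to see new data.
curl -s "http://localhost:8082/consumers/my-group/instances/my-consumer/records" \
  -H "Accept: application/vnd.kafka.json.v2+json"

# 4. Commit offsets in yet another request.
curl -s -X POST "http://localhost:8082/consumers/my-group/instances/my-consumer/offsets" \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  -d '{"partitions": [{"topic": "end-device-events", "partition": 0, "offset": 42}]}'

# 5. Delete the instance once the client is done with it.
curl -s -X DELETE "http://localhost:8082/consumers/my-group/instances/my-consumer"
```

Five round trips, plus the ongoing repetition of step 3, where a streaming design would need one.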
What are the success criteria?
To be able to create an ephemeral consumer instance, commit offsets, and subscribe to any number of topics in a single request. The reply to this request is then streamed using chunked transfer encoding so that the consumer receives records as they become available.
Why is solving this problem impactful?
HTTP is arguably the most interoperable protocol we have and is ubiquitous across languages. Making HTTP a first-class means of accessing Redpanda could allow many more clients and devices to connect without the need for a native Kafka API client library.
Additional notes
I have prototyped an enhancement to the POST /consumers/{} API where optional offsets and subscriptions fields can be supplied to cause the record replies to be transferred using a chunked encoding. The API is backward-compatible with today's API.
A sample curl command can consume all events on an end-device-events topic, and such a request may return many events; a variant of the request commits a specific offset prior to subscribing.
Records are returned using the chunked transfer encoding so they may be consumed in a streaming fashion. The server does not close its half of the connection at any time.
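A single-request consume under this prototype might look like the following sketch. The offsets and subscriptions field names come from the prototype description above; the exact payload shape, content types, and host/port are assumptions, not the prototype's actual wire format:

```shell
# Hypothetical combined create + commit + subscribe request.
# -N disables curl's output buffering so each chunk prints as it arrives.
curl -s -N -X POST "http://localhost:8082/consumers/my-group" \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  -d '{
        "name": "my-consumer",
        "format": "json",
        "auto.offset.reset": "earliest",
        "offsets": [
          {"topic": "end-device-events", "partition": 0, "offset": 42}
        ],
        "subscriptions": {"topics": ["end-device-events"]}
      }'
```

Because the server never closes its half of the connection, this command keeps streaming records until the client disconnects.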
JIRA Link: CORE-903