Open minghuaw opened 1 year ago
A short guide is provided below for folks who are interested in working on this issue.
The Event Processor client library is a companion to the Azure Event Hubs client library, providing a stand-alone client for consuming events in a robust, durable, and scalable way that is suitable for the majority of production scenarios.
The processor client should
... manage the responsibilities associated with connecting to a given Event Hub and processing events from each of its partitions, in the context of a specific consumer group. The act of processing events read from the partition and handling any errors that occur is delegated by the event processor to code that you provide ...
... Checkpointing is the responsibility of the consumer and occurs on a per-partition, typically in the context of a specific consumer group. For the EventProcessorClient, this means that, for a consumer group and partition combination, the processor must keep track of its current position in the event stream ...
Let's break this apart.
Connecting to Azure Event Hubs to create a consumer, with either a connection string or another supported Azure Identity credential, is already taken care of by EventHubConnection (https://docs.rs/azeventhubs/latest/azeventhubs/struct.EventHubConnection.html or https://github.com/minghuaw/azeventhubs/blob/main/src/event_hubs_connection.rs), which simply wraps an AmqpClient (https://github.com/minghuaw/azeventhubs/blob/main/src/amqp/amqp_client.rs).
Because the .NET SDK uses Azure Storage Blobs as the durable data store for checkpointing, the azure_storage_blobs crate (https://docs.rs/azure_storage_blobs/latest/azure_storage_blobs/) could be used for the same purpose.
Setting aside load balancing across multiple processor clients running in separate processes or on separate machines, the processor client should allow the user to add an event processing function (probably an async function), something similar to the following:
pub fn process_event_handler<F, Fut>(f: F)
where
    F: Fn(ProcessEventArgs) -> Fut,
    Fut: std::future::Future<Output = Result<(), ProcessorError>>,
{
    todo!()
}
ProcessEventArgs is a new type that should allow the user to get a reference to the underlying message (https://docs.rs/azeventhubs/latest/azeventhubs/struct.ReceivedEventData.html or https://github.com/minghuaw/azeventhubs/blob/46a4c32b19445f247247f53e3443065871bc2c66/src/event_data.rs#L100).
The error type ProcessorError could simply be limited to azure_core::Error, or it could be made a generic type so that users can choose their own error type. Either way, this error type might need to be convertible into ProcessErrorEventArgs (see below).
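One way the processor could hold on to a user-supplied async handler is to box it behind a type-erased closure. The following is a minimal sketch under stated assumptions, not the crate's API: `ReceivedEventData` is a stand-in struct rather than the real azeventhubs type, the error type is a generic parameter `E`, and `EventHandler`, `into_handler`, `block_on`, and `demo` are hypothetical names. `block_on` exists only so the sketch runs without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for `azeventhubs::ReceivedEventData` (assumption: not the real type).
pub struct ReceivedEventData {
    pub body: Vec<u8>,
}

/// Hypothetical argument type that gives the handler a reference to the event.
#[derive(Clone)]
pub struct ProcessEventArgs {
    pub partition_id: String,
    event: Arc<ReceivedEventData>,
}

impl ProcessEventArgs {
    pub fn new(partition_id: &str, event: ReceivedEventData) -> Self {
        Self { partition_id: partition_id.to_string(), event: Arc::new(event) }
    }

    /// Borrow the underlying event.
    pub fn event(&self) -> &ReceivedEventData {
        &self.event
    }
}

type BoxFuture<T> = Pin<Box<dyn Future<Output = T> + Send>>;

/// Type-erased handler, generic over the user's error type `E`.
pub type EventHandler<E> =
    Box<dyn Fn(ProcessEventArgs) -> BoxFuture<Result<(), E>> + Send + Sync>;

/// Boxes any async closure so the processor can store it in a struct field.
pub fn into_handler<F, Fut, E>(f: F) -> EventHandler<E>
where
    F: Fn(ProcessEventArgs) -> Fut + Send + Sync + 'static,
    Fut: Future<Output = Result<(), E>> + Send + 'static,
    E: 'static,
{
    Box::new(move |args: ProcessEventArgs| -> BoxFuture<Result<(), E>> {
        Box::pin(f(args))
    })
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polling executor used only to run this sketch; a real processor
/// would run on a runtime such as tokio.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Registers a handler that rejects empty events, then invokes it once.
pub fn demo(body: &[u8]) -> Result<(), String> {
    let handler: EventHandler<String> = into_handler(|args: ProcessEventArgs| async move {
        if args.event().body.is_empty() {
            Err(format!("empty event on partition {}", args.partition_id))
        } else {
            Ok(())
        }
    });
    let args = ProcessEventArgs::new("0", ReceivedEventData { body: body.to_vec() });
    block_on(handler(args))
}
```

Making the handler generic over `E` keeps both options from the paragraph above open: `E` can be fixed to azure_core::Error later, or left for the user to choose.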
The processor client should then allow the user to add a handler for processing errors (whether this should be async is debatable):
pub fn process_error_handler<F, Fut>(f: F)
where
    F: Fn(ProcessErrorEventArgs) -> Fut,
    Fut: std::future::Future<Output = Result<(), ProcessorErrorEventError>>,
{
    todo!()
}
ProcessErrorEventArgs is a new type that contains the error returned from the event handler. Errors with the underlying connection that are not fixed by retrying, which surface as an azure_core::Error, should probably be passed to this handler as well. The returned result could be used to indicate whether the processor should stop when an error cannot be handled (debatable).
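To make the relationship between the two handlers concrete, here is a minimal synchronous sketch, which sidesteps the open question of whether the error handler should be async. Everything in it is an assumption: the `ProcessorError` enum with string payloads (standing in for the user's error type and azure_core::Error), the `From<ProcessorError>` conversion into `ProcessErrorEventArgs`, the `Flow` enum, and the convention that an `Err` from the error handler stops the processor.

```rust
/// Hypothetical error type; strings stand in for the user's error
/// and for `azure_core::Error`.
#[derive(Debug, Clone, PartialEq)]
pub enum ProcessorError {
    /// Returned by the user's event handler.
    Handler(String),
    /// Connection failure that retrying did not fix.
    Connection(String),
}

/// Hypothetical argument type passed to the error handler.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcessErrorEventArgs {
    pub partition_id: Option<String>,
    pub error: ProcessorError,
}

impl From<ProcessorError> for ProcessErrorEventArgs {
    fn from(error: ProcessorError) -> Self {
        Self { partition_id: None, error }
    }
}

/// Returned by the error handler when the error could not be handled.
#[derive(Debug, PartialEq)]
pub struct ProcessorErrorEventError(pub String);

/// What the processor does after one event has been dispatched.
#[derive(Debug, PartialEq)]
pub enum Flow {
    Continue,
    Stop,
}

/// Routes the outcome of one event through the error handler and
/// decides whether processing goes on (assumed convention: Err means stop).
pub fn dispatch<H>(outcome: Result<(), ProcessorError>, on_error: &H) -> Flow
where
    H: Fn(ProcessErrorEventArgs) -> Result<(), ProcessorErrorEventError>,
{
    match outcome {
        Ok(()) => Flow::Continue,
        Err(e) => match on_error(e.into()) {
            Ok(()) => Flow::Continue,
            Err(_) => Flow::Stop,
        },
    }
}

/// Example policy: tolerate handler errors, stop on connection errors.
pub fn example_policy(args: ProcessErrorEventArgs) -> Result<(), ProcessorErrorEventError> {
    match args.error {
        ProcessorError::Handler(_) => Ok(()),
        ProcessorError::Connection(msg) => Err(ProcessorErrorEventError(msg)),
    }
}
```

For example, `dispatch(Err(ProcessorError::Connection("link detached".into())), &example_policy)` yields `Flow::Stop`, while a `Handler` error yields `Flow::Continue`.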
Let's not consider load balancing across multiple clients yet. Starting the processor then probably amounts to obtaining the list of partitions and spawning one tokio task per partition. A tokio::task::JoinSet is probably useful here so that we don't need to keep track of each task handle manually.
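The start-up flow described above can be sketched without any dependencies by letting `std::thread` stand in for tokio tasks; this swap is made only to keep the example self-contained. With tokio, `JoinSet::spawn` and `JoinSet::join_next` would take the place of `thread::spawn` and `JoinHandle::join`. `get_partition_ids`, `process_partition`, and `start_processing` are hypothetical names, and the partition list is hard-coded instead of being queried from the Event Hub.

```rust
use std::thread;

/// Hypothetical stand-in for asking the Event Hub for its partition IDs.
fn get_partition_ids() -> Vec<String> {
    vec!["0".to_string(), "1".to_string(), "2".to_string()]
}

/// Hypothetical per-partition loop; a real one would receive events until
/// it is cancelled. Here it just reports which partition it was given.
fn process_partition(partition_id: String) -> Result<String, String> {
    Ok(partition_id)
}

/// Spawns one worker per partition and waits for all of them to finish.
pub fn start_processing() -> Vec<Result<String, String>> {
    // One worker per partition, mirroring "one tokio task per partition".
    let handles: Vec<_> = get_partition_ids()
        .into_iter()
        .map(|id| thread::spawn(move || process_partition(id)))
        .collect();

    // Join in spawn order; a panicked worker is reported as an error.
    handles
        .into_iter()
        .map(|handle| {
            handle
                .join()
                .unwrap_or_else(|_| Err("partition worker panicked".to_string()))
        })
        .collect()
}
```

One difference worth keeping in mind: this sketch collects results in spawn order, whereas `JoinSet::join_next` yields tasks as they complete, so a tokio version would need to carry the partition ID in each task's return value.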
Each event processing task should periodically create checkpoints in Azure Storage Blobs (https://docs.rs/azure_storage_blobs/latest/azure_storage_blobs/) or possibly in some other kind of persistent storage, with the storage backend abstracted behind a public trait. This may also have further implications for consumer auto recovery, which is something to consider later.
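The storage abstraction mentioned above might look like the following minimal sketch. `CheckpointStore`, `Checkpoint`, `InMemoryCheckpointStore`, and `process_with_checkpoints` are hypothetical names. The trait is synchronous for brevity (one backed by Azure Storage Blobs would likely be async), and "periodically" is interpreted here as every `interval` events rather than on a timer.

```rust
use std::collections::HashMap;
use std::convert::Infallible;

/// Position of a consumer group within one partition (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Checkpoint {
    pub consumer_group: String,
    pub partition_id: String,
    pub sequence_number: i64,
}

/// Hypothetical storage abstraction; a blob-backed store would implement
/// the same trait.
pub trait CheckpointStore {
    type Error;

    fn update_checkpoint(&mut self, checkpoint: Checkpoint) -> Result<(), Self::Error>;

    fn get_checkpoint(
        &self,
        consumer_group: &str,
        partition_id: &str,
    ) -> Result<Option<Checkpoint>, Self::Error>;
}

/// In-memory store keyed by (consumer group, partition ID); not durable.
#[derive(Default)]
pub struct InMemoryCheckpointStore {
    checkpoints: HashMap<(String, String), Checkpoint>,
}

impl CheckpointStore for InMemoryCheckpointStore {
    type Error = Infallible;

    fn update_checkpoint(&mut self, checkpoint: Checkpoint) -> Result<(), Self::Error> {
        let key = (checkpoint.consumer_group.clone(), checkpoint.partition_id.clone());
        self.checkpoints.insert(key, checkpoint);
        Ok(())
    }

    fn get_checkpoint(
        &self,
        consumer_group: &str,
        partition_id: &str,
    ) -> Result<Option<Checkpoint>, Self::Error> {
        let key = (consumer_group.to_string(), partition_id.to_string());
        Ok(self.checkpoints.get(&key).cloned())
    }
}

/// Walks over events (represented only by their sequence numbers) and
/// writes a checkpoint after every `interval` events.
pub fn process_with_checkpoints<S: CheckpointStore>(
    store: &mut S,
    consumer_group: &str,
    partition_id: &str,
    sequence_numbers: &[i64],
    interval: usize,
) -> Result<(), S::Error> {
    let interval = interval.max(1); // guard against a zero interval
    for (i, &sequence_number) in sequence_numbers.iter().enumerate() {
        // The user's event handler would run here.
        if (i + 1) % interval == 0 {
            store.update_checkpoint(Checkpoint {
                consumer_group: consumer_group.to_string(),
                partition_id: partition_id.to_string(),
                sequence_number,
            })?;
        }
    }
    Ok(())
}
```

Keying checkpoints by consumer group and partition matches the quoted requirement that the processor track its position per consumer group and partition combination. On restart, a partition task would call `get_checkpoint` to decide where to resume.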
This could be implemented as a separate crate. It might be worth moving everything related to AMQP in azeventhubs into a separate shared crate.
tokio::task::JoinSet might be useful.