Azure / azure-documentdb-changefeedprocessor-dotnet

This library provides a host for distributing change feed events in a partitioned collection across multiple observers. Instances of the host can scale up (by adding instances) or down (by removing them) dynamically, and the load is automatically distributed roughly evenly among the active instances.

Failure state tracking #157

Open steeling opened 3 years ago

steeling commented 3 years ago

Hi there!

I've been looking into Cosmos DB's change feed integration with Azure Functions, and have been putting some thought into failure scenarios that may arise. Dropping document changes on the floor is a concern of ours, and I have some thoughts on how to mitigate it.

Correct me if I'm wrong, but I believe that the change feed processor operates serially over each partition, meaning an error in ProcessChangesAsync holds up the processing of future items. (The integration with Azure Functions does a fire-and-forget and keeps trucking on, though.) Even if we were to not await ProcessChangesAsync, we would then be dropping errors on the floor.
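To make the trade-off concrete, here is a rough, language-neutral simulation in Python. It does not use the library's API; `process_serially`, `process_fire_and_forget`, and the `handler` callback are hypothetical names standing in for the two behaviours described above, and the serial model reflects my understanding rather than confirmed behaviour.

```python
from typing import Callable, List, Optional, Tuple

Batch = List[str]


def process_serially(
    batches: List[Batch],
    handler: Callable[[Batch], None],
    max_attempts: int = 3,
) -> Tuple[List[int], Optional[int]]:
    """Awaited model (assumed): a failing batch blocks every batch after it.

    Returns the indexes that were processed and the index that got stuck,
    or None if every batch succeeded.
    """
    processed: List[int] = []
    for index, batch in enumerate(batches):
        for _ in range(max_attempts):
            try:
                handler(batch)
            except Exception:
                continue  # retry the same batch; later batches keep waiting
            processed.append(index)
            break
        else:
            # The batch never succeeded, so progress stops at this index.
            return processed, index
    return processed, None


def process_fire_and_forget(
    batches: List[Batch],
    handler: Callable[[Batch], None],
) -> Tuple[List[int], List[int]]:
    """Fire-and-forget model (assumed): errors are swallowed and work continues.

    The 'dropped' list exists only so the simulation can show what was lost;
    in the scenario described above nothing would record these failures.
    """
    processed: List[int] = []
    dropped: List[int] = []
    for index, batch in enumerate(batches):
        try:
            handler(batch)
        except Exception:
            dropped.append(index)
        else:
            processed.append(index)
    return processed, dropped


def fail_on_bad(batch: Batch) -> None:
    """Hypothetical handler that rejects any batch containing 'bad'."""
    if "bad" in batch:
        raise ValueError("cannot process batch")
```

With three batches where the second always fails, the serial model stops at the second batch and never reaches the third, while the fire-and-forget model processes the first and third and silently loses the second.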

My proposal is to keep the lease management as is, but also write a series of separate docs called "failed messages", each with a retry count, and add a flag called max_retries.

The change feed processor would then be set up to also read the change feed of its own lease container (or these failures could be written to a separate container, so that it doesn't also receive reads of the other lease info).

This new change feed would be used to process failed messages, retrying them up to a retry count, then moving on to the next. The key part is that it would remove the item from this container on a success.

The onus of retrying items past the retry count would be on the user: they could reset retry_count back to zero, which puts the item back into the change feed and allows it to be processed as normal.
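As a starting point for discussion, the flow above could be sketched roughly as follows. This is an in-memory Python model, not code against the library or Cosmos DB: `FailedMessage`, `FailedMessageStore`, `record_failure`, `retry_pass`, and `reset` are all hypothetical names, and a plain dictionary stands in for the "failed messages" container and its change feed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FailedMessage:
    """One 'failed message' doc: the original change plus a retry count."""
    doc_id: str
    payload: dict
    retry_count: int = 0


class FailedMessageStore:
    """In-memory stand-in for the proposed failed-messages container."""

    def __init__(self, max_retries: int) -> None:
        self.max_retries = max_retries
        self.items: Dict[str, FailedMessage] = {}

    def record_failure(self, doc_id: str, payload: dict) -> None:
        """Called when the main handler fails; keeps an existing retry count."""
        self.items.setdefault(doc_id, FailedMessage(doc_id, payload))

    def retry_pass(self, handler: Callable[[dict], None]) -> None:
        """One pass over the failed messages, as the second change feed would do."""
        for doc_id in list(self.items):
            message = self.items[doc_id]
            if message.retry_count >= self.max_retries:
                continue  # parked until the user resets it
            try:
                handler(message.payload)
            except Exception:
                message.retry_count += 1
            else:
                del self.items[doc_id]  # remove the item on success

    def reset(self, doc_id: str) -> None:
        """User action: put a parked item back into rotation."""
        self.items[doc_id].retry_count = 0
```

In a real implementation, incrementing or resetting retry_count would be a write to the container, and that write is what would surface the item in the change feed again; the explicit `retry_pass` call here only approximates that.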

Happy to draw up diagrams, write up a more formal doc, or contribute code, but wanted to surface this request first to garner interest.