Azure / azure-functions-java-library

Contains annotations for writing Azure Functions in Java
MIT License

Add Retry annotation #132

Closed pragnagopa closed 3 years ago

pragnagopa commented 3 years ago

cc @jeffhollan @TsuyoshiUshio @amamounelsayed

PR: https://github.com/Azure/azure-webjobs-sdk/pull/2463 added support for function execution retries

Tracking item to add annotations

Note: this would require updating mvn plugin to consume the new library

jeffhollan commented 3 years ago

@apawast this may be a good item to drop on our GitHub project to close out this feature

amamounelsayed commented 3 years ago

Thank you so much @pragnagopa. We will add this work item to work on it next.

pragnagopa commented 3 years ago

Tagging @casper-79 to track progress

casper-79 commented 3 years ago

Hi @amamounelsayed

Do you have an estimate of when the updated mvn plugin will be available? We are working on an application that really needs the retry annotation.

casper-79 commented 3 years ago

Hi @jeffhollan @TsuyoshiUshio @amamounelsayed

Can any of you provide guidance on when the retry annotation will be ready to use?

jeffhollan commented 3 years ago

The base implementation should be available in the next day or so.

amamounelsayed commented 3 years ago

Thank you @casper-79 and @jeffhollan. With the base implementation, you can manually add the retry tag to host.json or function.json, which will unblock you. Meanwhile, we will add the annotation (ETA mid-November) to support generating the function.json tag. We will post an update in case of any delays.

casper-79 commented 3 years ago

Thanks @amamounelsayed

Is there any documentation you can point me to? I have tried to figure out how to do it based on this commit, but I am not exactly sure I did it right. Using a host.json file as seen below, I would expect failing messages to be retried in 5s, 10s, 20s, 40s, 80s ...?

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[2.*, 3.0.0)"
  },
  "retry": {
    "strategy": "exponentialBackoff",
    "maxRetryCount": 6,
    "delayInterval": "00:00:05"
  }
}
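For reference, the schedule casper-79 expects (5s, 10s, 20s, 40s, 80s, ...) corresponds to a delay that simply doubles on every attempt. Below is a minimal Java sketch of that expectation. The class `DoublingSchedule` and its method are hypothetical helpers written for illustration, not part of the Functions runtime or the Java library, and the thread does not confirm that the runtime uses pure doubling (it may cap the delay or add jitter).

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an Azure Functions API.
final class DoublingSchedule {

    // Returns the delay before each retry, ASSUMING the delay doubles every time.
    static List<Duration> delays(Duration initial, int maxRetryCount) {
        List<Duration> result = new ArrayList<>();
        Duration next = initial;
        for (int i = 0; i < maxRetryCount; i++) {
            result.add(next);
            next = next.multipliedBy(2);
        }
        return result;
    }

    public static void main(String[] args) {
        // Values taken from the host.json above: 5 second initial delay, 6 retries.
        for (Duration d : delays(Duration.ofSeconds(5), 6)) {
            System.out.println(d.getSeconds() + "s");
        }
    }
}
```

Under that assumption, six retries would wait 5s, 10s, 20s, 40s, 80s and 160s respectively.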

pragnagopa commented 3 years ago

We will be posting documentation for using retry soon. Please hold off using the feature until then. Docs will be updated as soon as the functions runtime version that supports this feature is rolled out to all regions in production.

pragnagopa commented 3 years ago

@casper-79 - Docs are now public https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-error-pages?tabs=java

casper-79 commented 3 years ago

Thanks @pragnagopa

I will try it out right away!

casper-79 commented 3 years ago

Hello again @pragnagopa

I have tested the retry functionality today and seen the exponential retry strategy in action. However, I am also seeing some very strange behaviour. As I understand the documentation, the retry strategy is implemented on the function instance itself, rather than by storing the delivery state on the queue. I am seeing what I believe are side effects of this approach. My experiments center around submitting poisonous messages that will always fail onto a queue consumed by a Java Azure Function. The function uses a retry strategy defined in host.json as seen below:

"retry":{
  "strategy":"exponentialBackoff",
  "maxRetryCount":6,
  "minimumInterval":"00:00:10",
  "maximumInterval":"00:05:00"
}

(1) Processing of poisonous messages does not always show up in Application Insights or the "Monitor" section of Azure Functions. When I use the Azure portal to peek at test messages, I can tell DeliveryCount has gone up by 1, but more often than not there is no trace of the failed execution that increased the counter.

(2) Azure Function instances are short-lived, which limits the useful range of the retry configuration parameters. Can you provide guidance on what will work in practice? I am guessing you will run into problems if you set maximumInterval to 24 hours and maxRetryCount to 30 in host.json?

(3) What is the recommended approach for dead lettering? The only solution I can think of is to set maxDeliveryCount=1 on the queue, but this will only work if all retry attempts of your strategy can be performed within the typical lifetime of an instance. Otherwise, I guess the message will be retried forever.

Regards,

Casper
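Point (2) above can be made concrete by estimating how long an instance would have to stay alive to finish every retry. The sketch below is a rough estimate only. It assumes the delay starts at minimumInterval, doubles on each attempt, and is capped at maximumInterval. The thread does not confirm that this is how the runtime computes delays, and `CappedBackoff` is a hypothetical helper, not an Azure Functions API.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not an Azure Functions API.
final class CappedBackoff {

    // Sums the waits before each retry, ASSUMING doubling capped at `max`
    // and that `min` is not larger than `max`.
    static Duration totalWait(Duration min, Duration max, int maxRetryCount) {
        Duration total = Duration.ZERO;
        Duration delay = min;
        for (int i = 0; i < maxRetryCount; i++) {
            total = total.plus(delay);
            Duration doubled = delay.multipliedBy(2);
            delay = doubled.compareTo(max) > 0 ? max : doubled;
        }
        return total;
    }

    public static void main(String[] args) {
        // Configuration from the host.json in this comment.
        Duration tested = totalWait(Duration.ofSeconds(10), Duration.ofMinutes(5), 6);
        // Hypothetical configuration from point (2); the 10 second minimum is an assumption.
        Duration extreme = totalWait(Duration.ofSeconds(10), Duration.ofHours(24), 30);
        System.out.println("tested config:  " + tested.getSeconds() + " s");
        System.out.println("extreme config: " + extreme.toDays() + " days (rounded down)");
    }
}
```

Under these assumptions the tested configuration needs roughly ten minutes of cumulative waiting, while the 24 hour / 30 retry configuration would need more than two weeks, which illustrates why the lifetime of an instance matters for an in-memory retry strategy.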

jeffhollan commented 3 years ago

Gonna reply to this in another thread as it is not related to this issue, though arguably some of these points may justify their own issue (specifically the non-durability of the current implementation). For now, let's move this conversation here: https://github.com/Azure/azure-webjobs-sdk/issues/2595

TsuyoshiUshio commented 3 years ago

Sorry for being late. I am also considering this for the Kafka extension: https://github.com/Azure/azure-functions-kafka-extension/issues/122

Hi @pragnagopa, to add a retry policy to an extension for C#, do we need to do anything on the extension side, or does just adding an attribute like [FixedDelayRetry(5, "00:00:10")] work?

m-moris commented 3 years ago

When do you plan to implement the retry annotation? Is there any progress?

Thanks.

amamounelsayed commented 3 years ago

It has been released: https://github.com/Azure/azure-functions-java-library/releases/tag/1.4.2 https://github.com/Azure/azure-functions-java-library/releases/tag/1.4.2-SNAPSHOT

Thank you