Azure / azure-functions-host

The host/runtime that powers Azure Functions
https://functions.azure.com
MIT License

Function failure behaviour #4

Closed alexanderweiss closed 8 years ago

alexanderweiss commented 8 years ago

When using a queue as input for a "normal" WebJobs SDK function, the queue message will not be dequeued if the function fails, but will be retried and eventually added to a poisoned queue (https://azure.microsoft.com/en-us/documentation/articles/websites-dotnet-webjobs-sdk-storage-queues-how-to/#poison).

WebJobs.Script currently requires failures to be handled manually: throwing an error in a Node.js function simply crashes the whole process, and the message is removed from the queue nonetheless.

It would be great if this worked similarly to the WebJobs SDK.

mathewc commented 8 years ago

Poison queue handling works the same way as in the SDK. To verify, I tried both sending malformed JSON and throwing an exception from the Node QueueTrigger sample, and confirmed that it creates the samples-workitems-poison queue and copies the message to it. The process continues to run.

In my case, I have the MaxDequeueCount set to the default of 5, so it'll try 5 times before moving to the poison queue.

If this isn't working for you, can you provide me exact repro steps please?

alexanderweiss commented 8 years ago

Thanks for taking a look. I tried what you said and that did work. So I've been trying out a few things and it seems that asynchronously thrown errors are the problem. Specifically, I was using the mandrill-api module and throwing an error when my callback was called with an error. This is what causes the issue: someFunction(function(err) { throw new Error('Some error') })

More generally, using setTimeout to fake asynchronous code (with only a queueTrigger input and no output), the following code fails nicely (doesn't dequeue the message and will add it to the poison queue after 5 tries):

'use strict';

module.exports = function (context) {
    context.log('Start')

    throw new Error('Timeout error')

}

The following does not fail nicely (dequeues the message and crashes the WebJob process):

'use strict';

module.exports = function (context) {
    context.log('Start')

    setTimeout(function() {
        throw new Error('Timeout error')
    }, 1000)

}

Am I making a mistake or wrong assumption here?

mathewc commented 8 years ago

Yes, global unhandled exceptions like that will bring the process down, but the Azure WebJobs infrastructure will start it back up. That's the same behavior you get for continuous WebJobs in general.

For Node.js, I think the recommended best practice for unhandled exceptions like this is to allow the node.exe process to come down and restart. Generally that's the only safe thing to do. Now you can wrap your own code in try/catch and handle errors you might expect in your application code to prevent a process recycle in cases where you know how to handle errors, but the host shouldn't do this globally on your behalf.

You should see the global exception details written to your webjob logs though - please verify that you do (in your SCM portal, under data/jobs/triggered/{jobName}).

alexanderweiss commented 8 years ago

Okay. I understand the issue.

How can I then get an asynchronous job to fail nicely? For a synchronous job that would be throwing an error, but there seems to be no way to do it from async code (I can catch any errors, but can't do anything with them). Something like calling context.done() with an error argument?

(Yes, I do get everything in the logs.)

mathewc commented 8 years ago

For error cases, you can call context.done(err) passing in an error object, string etc. That will be interpreted as an error condition, the function invocation will fail, and the logs will contain the error info. Let me know if that works for you.

alexanderweiss commented 8 years ago

Ah, so it was possible. Sorry... I was looking through the code but couldn't find it. I don't think it's documented, is it? It would be a good addition.

It works perfectly this way.

mathewc commented 8 years ago

Not documented yet, which is why you didn't know :) Still playing doc catch-up on stuff. I'll be sure to add this.



keremcankaya0 commented 7 years ago

I spent quite a lot of time figuring out this behaviour. The documentation still hasn't been updated.