Azure / azure-webjobs-sdk

Azure WebJobs SDK
MIT License

Continuous Web Job with a TimerTrigger crashes with code -532462766 #1085

Open relusion opened 7 years ago

relusion commented 7 years ago

Hi there, we have a web job running continuously with several callbacks. From time to time the job just crashes with the following traces:

[03/24/2017 03:50:08 > 823637: SYS INFO] WebJob is still running
[03/24/2017 04:35:26 > 0f5c55: SYS ERR ] Job failed due to exit code -532462766
[03/24/2017 04:35:26 > 0f5c55: SYS INFO] Process went down, waiting for 0 seconds
[03/24/2017 04:35:26 > 0f5c55: SYS INFO] Status changed to PendingRestart
[03/24/2017 04:35:27 > 0f5c55: SYS INFO] Run script '##EXECUTABLENAMEHERE##' with script host - 'WindowsScriptHost'
[03/24/2017 04:35:27 > 0f5c55: SYS INFO] Status changed to Running

Packages in use:


  <package id="Microsoft.Azure.WebJobs" version="2.0.0" targetFramework="net461" />
  <package id="Microsoft.Azure.WebJobs.Core" version="2.0.0" targetFramework="net461" />
  <package id="Microsoft.Azure.WebJobs.Extensions" version="2.0.0" targetFramework="net461" />
  <package id="Microsoft.Azure.WebJobs.ServiceBus" version="2.0.0" targetFramework="net461" />

webjob-publish-settings.json

{ "$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json", "webJobName": "##here is the name of the job##", "runMode": "Continuous" }

settings.job

{ "stopping_wait_time": 3600 }

The job itself is just a web job with several timers and one Service Bus trigger. I would appreciate it if you could give me some info about this error code.

PS: so far it usually happens within the same time frame, between 3 AM and 4 AM GMT. Not sure if it has anything to do with the error.

thanks, Vladimir.
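
For reference on the error code itself: -532462766 is the signed 32-bit form of 0xE0434352, the exception code the .NET CLR uses when a process terminates because of an unhandled managed exception. The snippet below is a minimal sketch of that conversion; the helper name `to_unsigned_hex` is made up for illustration and is not part of any SDK.

```python
# Hypothetical helper (not part of the WebJobs SDK): reinterpret a signed
# 32-bit process exit code as unsigned and format it as hexadecimal.
def to_unsigned_hex(exit_code: int) -> str:
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Exception code used by the CLR for unhandled managed exceptions.
CLR_UNHANDLED_EXCEPTION = 0xE0434352

code = -532462766  # exit code reported in the WebJob log above
print(to_unsigned_hex(code))
print((code & 0xFFFFFFFF) == CLR_UNHANDLED_EXCEPTION)
```

The code only says that some exception went unhandled; it does not identify which one, so the exception details still have to come from the job's error output.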

mathewc commented 7 years ago

Do your error logs show any details on what caused the crash - stack trace, etc.? We need that to determine where the problem lies.

relusion commented 7 years ago

This is the only unhandled exception I've got so far. It seems that the connection to the job's dashboard (AzureWebJobsDashboard) or to AzureWebJobsStorage does not have retry logic. Another thing: in my case the dashboard storage account is located in a different region, which may be why it times out.

1 events at level 'Error' or lower have occurred within time window 00:30:00.
03/24/2017 19:21:36 Error Executed '##FUNCTIONNAMEGOESHERE' (Failed, Id=2c3fe0b3-5f50-4362-a8e1-e1de4df3de98) WebJobs.Execution
Microsoft.WindowsAzure.Storage.StorageException: The client could not finish the operation within specified timeout. ---> System.TimeoutException: The client could not finish the operation within specified timeout.
   --- End of inner exception stack trace ---
   at Microsoft.WindowsAzure.Storage.Core.Util.StorageAsyncResult`1.End()
   at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.EndUploadFromStream(IAsyncResult asyncResult)
   at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.EndUploadFromByteArray(IAsyncResult asyncResult)
   at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.EndUploadText(IAsyncResult asyncResult)
   at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>cDisplayClass4.b3(IAsyncResult ar)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.WebJobs.Host.Loggers.UpdateOutputLogCommand.d18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.WebJobs.Host.Loggers.UpdateOutputLogCommand.d16.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.d13.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.d10.MoveNext()
Request Information
RequestID:
RequestDate:
StatusMessage:

christopheranderson commented 7 years ago

@brettsam - we have some changes coming to this code path, I believe. If we're not already, could we better handle exceptions coming out of the Storage SDK in our UpdateOutputLogCommand method?

dhameliyaharesh commented 7 years ago

I am facing the same issue.

[05/26/2017 22:30:14 > 8ea068: ERR ] Unhandled Exception: System.Threading.Tasks.TaskCanceledException: A task was canceled.
[05/26/2017 22:30:14 > 8ea068: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
[05/26/2017 22:30:14 > 8ea068: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[05/26/2017 22:30:14 > 8ea068: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()

  1. I am running a continuous web job with the Always On setting.
  2. The web job function calls a REST API endpoint, which may take around 30 minutes to complete.
  3. I've set stopping_wait_time=3600.

NicolajHedeager commented 7 years ago

Hi,

We are having the exact same issue. The WebJob crashes randomly 3-4 times a day with exit code -532462766. The only way for us to solve this is to disable dashboard logging, but that really limits our monitoring capabilities.

@christopheranderson Can you provide us with an updated milestone for a fix to this bug?

paulbatum commented 7 years ago

@NicolajHedeager Since you've already turned off your dashboard logging, have you tried using the new App Insights integration? This is going to become the default monitoring experience over the next few months. It's much more scalable.

paulbatum commented 7 years ago

Forgot to include the link for using App Insights in webjobs: https://github.com/Azure/azure-webjobs-sdk/wiki/Application-Insights-Integration

NicolajHedeager commented 7 years ago

Thanks for the link, @paulbatum. Looks promising. We already use App Insights in our web tier, so I will definitely give it a try.

AkshathaSP commented 5 years ago

Hi, we are still facing this issue with the same error code (-532462766). Is there a tracking bug for this issue, and has it been fixed?

brettsam commented 5 years ago

@AkshathaSP -- what versions of webjobs are you using?

AkshathaSP commented 5 years ago

Hi @brettsam, we are using version="1.1.2". However, we have a triggered-type WebJob failing with the same error.

ravick4u commented 5 years ago

I am seeing something like the log below. Judging by the first two lines, this happens roughly every 7 hours.

[07/21/2019 07:05:12 > efe1a8: SYS INFO] WebJob is still running
[07/21/2019 14:01:37 > efe1a8: SYS ERR ] Job failed due to exit code -532462766
[07/21/2019 14:01:37 > efe1a8: SYS INFO] Process went down, waiting for 0 seconds
[07/21/2019 14:01:37 > efe1a8: SYS INFO] Status changed to PendingRestart
[07/21/2019 14:01:37 > efe1a8: SYS INFO] Run script 'MyWebJob.exe' with script host - 'WindowsScriptHost'
[07/21/2019 14:01:37 > efe1a8: SYS INFO] Status changed to Running
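
As a rough check of the "every 7 hours" observation, the gap between the first two timestamps in the log above can be computed directly. This is only a sketch: the helper name `gap` is made up for illustration, the timestamp format is assumed to be month/day/year with a 24-hour clock as the log appears to show, and the result measures only the time between those two log lines, not necessarily how long the process had been up.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed from the WebJob log lines (MM/DD/YYYY HH:MM:SS).
LOG_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def gap(start: str, end: str) -> timedelta:
    """Return the elapsed time between two log timestamps."""
    return datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)


# "WebJob is still running" line vs. "Job failed due to exit code" line.
delta = gap("07/21/2019 07:05:12", "07/21/2019 14:01:37")
print(delta)  # 6:56:25
```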