Closed timcapps closed 5 years ago
Hi Timothy, and thanks for the detailed report,
You should definitely ALWAYS see the "Error getting commitments"
log events when an exception occur in you "get result" task.
Just to be sure it's not an async
/await
usage problem, can you confirm that you are properly await
ing the task that gives you the result (i.e. you have something like var result = await DoSomethingAsync()
in your try
block) ? This would explain the very short duration as the called task would not really be executing in that case...
Also, could you tell us which version of Web API you are using ? and which version of SerilogWeb.Classic
and SerilogWeb.Classic.WebApi
?
Is there an Exception
attached to the events logged by the ApplicationLifeCycle module ?
Yes, we are awaiting out data access call as such:
[HttpPost]
public async Task<HttpResponseMessage> GetCommitmentsAsync(CommitmentSearchCriteria criteria)
{
try
{
Log.Information("GetCommitmentsAsync was called with {@criteria}", criteria);
var result = await OrderProvider.GetCommitmentsAsync(criteria, CurrentUser.Username);
return CreateResponse(result);
}
catch (Exception ex)
{
Log.Error(ex, "Error getting commitments");
return CreateErrorResponse(ex);
}
}
OrderProvider.GetCommitmentsAsync contains the actual data access code.
This is a strange one :-/
Do you happen to have any specific IIS error handling configured in your web.config
file ? (like customErrors
, httpErrors
and things like that) ?
Is it an option for you to completely remove the try/catch and let the exception bubble up (and be caught by SerilogWeb.Classic.WebApi) ? Do you possibly have a global Web API ExceptionLogger
registered ? Do you see IIS errors in the Windows Event Viewer ?
Would it be possible for you to upload the codebase somewhere, or even better a minimal project where you can reproduce this issue ?
Also, there is no Exception attached to the events logged by the ApplicationLifeCycle module
this one is normal, because the exception is caught earlier and is no longer visible. The ApplicationLifeCycle
still logs it at level Error because of the 500 status code, but has no exception to attach to the log event.
Hi @tsimbalar , I checked our web.config file (and the transform files for production) and the customErrors section is commented out so I don't think that's interfering. We don't have an ExceptionLogger registered and we do not see any corresponding IIS error logs for the 500 level errors the ApplicationLifecycleModule is logging in this case.
I will try to put together a sample project that recreates the behavior and check back in with you then.
Thank you for your help!
Hi @timcapps , it's been a while, could you eventually solve your issue ?
@tsimbalar We identified one potential cause. We had a UI component which would listen to an event stream and call the GetCommitmentsAsync Web API method each time a specific event was received. Because the event stream lacked sufficient constraints we were making several (a dozen or so) calls to the Web API method in under a second. We addressed that issue and the events (from the GetCommitmentsAsync method) with missing exception data stopped coming in. However, we have other events which seem to also be missing exception data. We're not sure if it is the same root cause or not.
Could there me some kind of a timing issue?
Hi Timothy,
FYI, we have just published a new version of SerilogWeb.Classic.WebApi
, which fixes some weird and subtle modifications of the error handling process that was happening for Web API.
I would recommend that you upgrade to see if you see any improvements ;)
In our Web API controllers we have the following:
What we expect to see (and do see in most cases) is two logs
However we have a few methods in which it seams the logging statement in the catch block is never hitting. We see the logs from the applicationlifecycle module and that they failed very quickly (often 0 or 1ms as in the screenshot below) However, we do not see the corresponding log that should be in generated in the catch block.
Can you provide any input on what the root cause of this may be? Is there some request type that could result in a 500 level error that we should be filtering out?