Give an ID on build failures as well as http 400 + 500 codes

sillsdev / serval

A REST API for natural language processing services

MIT License

4 stars 0 forks source link

Give an ID on build failures as well as http 400 + 500 codes #91

Closed johnml1135 closed 1 year ago

johnml1135 commented 1 year ago

Generate an ID for http 400 and 500 codes - and log the ID, the response and the request endpoint. Then we can correlate SF issues with Serval issues.

johnml1135 commented 1 year ago

This is a partial implementation to #58. It enables something without having to instrument every microservice.

johnml1135 commented 1 year ago

Or a SMT/NMT build failure.

Enkidu93 commented 1 year ago

@johnml1135 Could you clarify what you mean by ID?

johnml1135 commented 1 year ago

A randomized, transaction ID, that would be able to associate the scripture forge errors with our error logs. That is the whole purpose, and it should not live beyond being entered into the log and inserted into the HTTPresponse.

johnml1135 commented 1 year ago

When later we implement a trace ID, the trace ID would replace this temporary ID.

ddaspit commented 1 year ago

We need a way to correlate issues in SF and Serval. Our long-term plan is to instrument all of the microservices in Serval to support distributed tracing using OpenTelemetry. This will allow us to associate requests across all of our services. All of the requests are tied together by a single trace ID. In the meantime, we just want to return a trace ID whenever we return an error status code in the Serval API. The trace ID just needs to be a generated GUID. You can use System.Diagnostics.ActivityTraceId.CreateRandom().ToString() to create the GUID. This will create the same kind of GUID that OpenTelemetry will create.

Enkidu93 commented 1 year ago

I see. OK, that's straightforward. I can go ahead and implement the short-term solution.

johnml1135 commented 1 year ago

The long term issue is #58.

Enkidu93 commented 1 year ago

Is there a reason we can't use the traceIds already provided? I just implemented a simple result filter that logs the status code and the traceId (Activity.Current.Id) which is returned in the http response. @ddaspit @johnml1135

johnml1135 commented 1 year ago

Here is my best guess - @ddaspit what do you think?

https://stackoverflow.com/questions/70828928/difference-between-httpcontext-traceidentifier-and-activity-current-id/77008433#77008433

// Use http first to trace to client call, if available
// If not, fall back on the thread ID
var traceId = httpContext?.TraceIdentifier ?? Activity.Current?.Id;

Enkidu93 commented 1 year ago

But if I use the TraceIdentifier, I will want to set the Activity.Current.Id to it anyway since that's what ultimately gets sent as part of the HTTP response (I've verified this: in this example { "type": "https://tools.ietf.org/html/rfc7231#section-6.5.4", "title": "Not Found", "status": 404, "traceId": "00-56e2b2aa24b8f0c040bc981e0319731a-3317ed93be04a435-00" } the traceId is set to whatever Activity.Current?.Id is regardless of httpContext?.TraceIdentifier).

johnml1135 commented 1 year ago

Ok - so let's see if I understand - aspdotnet automagically includes Activity.Current.Id in http responses. Since this is the case, if we log Activity.Current.Id and tell SF to associate it with their trace, we could get the desired end?

Enkidu93 commented 1 year ago

Yes, that is how it works. I have that working locally, but I wanted to make sure there wasn't a reason that we should generate a separate id and use that.

johnml1135 commented 1 year ago

I like the solution (Activity.Current.Id) - let's go with it.

ddaspit commented 1 year ago

The PR only seems to log the trace ID. Wasn't the point of this issue to return the trace ID in the response, so that SF has access to it?

johnml1135 commented 1 year ago

@Enkidu93 - isn't the trace ID already included in every http response - or am I misunderstanding your previous comments?

Enkidu93 commented 1 year ago

Yes, it is included already on 4-500s.

Enkidu93 commented 1 year ago

Is that sufficient, @ddaspit, to close this issue?

ddaspit commented 1 year ago

Yep, that is great.