org.springframework.boot' version '3.3.2'
spring-graphql:1.3.2
According to this doc: https://docs.spring.io/spring-graphql/reference/observability.html#observability.server.request, we should get graphql.outcome INTERNAL_ERROR if no valid GraphQL response could be produced. It's not really specified what exactly that means. When I first read the docu I understood that's when there is unchecked exception thrown and not being handled by exception handler.
from the code I see that's actually in the case when there is general throwable error set in observable or there are no data to produce.
Main "throwable error" from ExecutionRequestObservationContext never gets set in case there was an unchecked exception thrown from application code. In that scenario the error is handled via ExceptionResolversExceptionHandler and translated to 'regular error' from executionResult:
ExecutionResultImpl{errors=[INTERNAL_ERROR for 2ee7db16-1015-ae28-d84e-768be03a8e6b], data={calculateProductionCost=null}, dataPresent=true, extensions=null}
graphql.request.count metric which has two tags: graphql.outcome either SUCCESS or REQUEST_ERROR, which is consisting of all possible error responses, even for internal server errors. OUTCOME_INTERNAL_ERROR Is never really reported in our case.
_graphql.request.INTERNAL_SERVERERROR metric reported as count for all INTERNAL_SERVER_ERRORs.
I'm finding this really confusing. It would be much more useful to produce INTERNAL_ERROR observation outcome in case the actual server internal error (aka uncatched exception, not covered by user-defined exception handlers) happens.
It would be also more inline with HTTP world and status codes (I know gql always respond with http 200, it's just an inspiration for metric outcomes):
2xx codes -> outcome success
4xx codes -> outcome request error
5xx codes -> outcome internal/server error
If that's not the desired behaviour, I'd make it very explicit in documentation as current description is IMHO misleading.
EDIT://
for now, we found a workaround by doing this:
@Bean
public ExecutionRequestObservationConvention executionConvention() {
return new DefaultExecutionRequestObservationConvention() {
@Override
@NonNull
protected KeyValue outcome(@NonNull ExecutionRequestObservationContext context) {
final var executionResult = context.getExecutionResult();
final var errors = executionResult != null ? executionResult.getErrors() : null;
final boolean isInternalServerError =
errors != null
&& context.getExecutionResult().getErrors().stream()
.anyMatch(error -> error.getErrorType().equals(ErrorType.INTERNAL_ERROR));
if (isInternalServerError) {
return KeyValue.of(ExecutionRequestLowCardinalityKeyNames.OUTCOME, "INTERNAL_ERROR");
}
return super.outcome(context);
}
};
}
Thanks for this report @lukasGemela - this is now fixed in 1.2.x and 1.3.x SNAPSHOTs. I have also clarified the documentation and that change should be live soon here.
Project details: Spring Boot project, having:
According to this doc: https://docs.spring.io/spring-graphql/reference/observability.html#observability.server.request, we should get graphql.outcome INTERNAL_ERROR if no valid GraphQL response could be produced. It's not really specified what exactly that means. When I first read the docu I understood that's when there is unchecked exception thrown and not being handled by exception handler.
from the code I see that's actually in the case when there is general throwable error set in observable or there are no data to produce.
Main "throwable error" from ExecutionRequestObservationContext never gets set in case there was an unchecked exception thrown from application code. In that scenario the error is handled via ExceptionResolversExceptionHandler and translated to 'regular error' from executionResult:
ExecutionResultImpl{errors=[INTERNAL_ERROR for 2ee7db16-1015-ae28-d84e-768be03a8e6b], data={calculateProductionCost=null}, dataPresent=true, extensions=null}
Though it is reported as part of GraphQlObservationInstrumentation as "graphql.request.INTERNAL_SERVER_ERROR" counter metric.
As a result we can see this behaviour in datadog:
graphql.request.count metric which has two tags: graphql.outcome either SUCCESS or REQUEST_ERROR, which is consisting of all possible error responses, even for internal server errors. OUTCOME_INTERNAL_ERROR Is never really reported in our case.
_graphql.request.INTERNAL_SERVERERROR metric reported as count for all INTERNAL_SERVER_ERRORs.
I'm finding this really confusing. It would be much more useful to produce INTERNAL_ERROR observation outcome in case the actual server internal error (aka uncatched exception, not covered by user-defined exception handlers) happens.
It would be also more inline with HTTP world and status codes (I know gql always respond with http 200, it's just an inspiration for metric outcomes): 2xx codes -> outcome success 4xx codes -> outcome request error 5xx codes -> outcome internal/server error
If that's not the desired behaviour, I'd make it very explicit in documentation as current description is IMHO misleading.
EDIT://
for now, we found a workaround by doing this: