open-telemetry / opentelemetry-demo

This repository contains the OpenTelemetry Astronomy Shop, a microservice-based distributed system intended to illustrate the implementation of OpenTelemetry in a near real-world environment.
https://opentelemetry.io/docs/demo/
Apache License 2.0

Feature Flag `productCatalogFailure` raising errors on Recommendation Service #1710

Open julianocosta89 opened 2 weeks ago

julianocosta89 commented 2 weeks ago

Bug Report

Which version of the demo you are using? 52d315aeba7f6da80268eaf84613f6bf94e1f341

Symptom

What is the expected behavior? When enabling the feature flag productCatalogFailure it is expected to have an error generated for GetProduct requests with product ID: OLJCESPC7Z.


What is the actual behavior? Currently the error is also triggered in requests to the recommendation service whenever the product OLJCESPC7Z is returned in the recommendation list.


Reproduce

  1. Run the demo
  2. Enable the feature flag productCatalogFailure
  3. Navigate to http://localhost:8080/loadgen/ and check the Failures tab.

julianocosta89 commented 2 weeks ago

The more I think about this the more I believe this is working as expected. The recommendation and the checkout service get the product as well, so when they call GetProduct for product OLJCESPC7Z, the request fails.

This is a great example of how one error can impact multiple services and multiple endpoints.
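To make that concrete, here is a minimal sketch of the behavior described above. It is not the demo's actual code: `get_product`, `failing_callers`, the caller names, and the `PLACEHOLDER01` ID are hypothetical stand-ins, and the flag is modeled as a plain dictionary.

```python
FAILING_PRODUCT_ID = "OLJCESPC7Z"  # product ID named in this issue


class ProductCatalogError(Exception):
    """Stand-in for the error returned by GetProduct."""


def get_product(product_id: str, flags: dict[str, bool]) -> dict[str, str]:
    # Hypothetical model of the product catalog: fail only for the one
    # product ID, and only while the feature flag is enabled.
    if flags.get("productCatalogFailure", False) and product_id == FAILING_PRODUCT_ID:
        raise ProductCatalogError(f"GetProduct failed for {product_id}")
    return {"id": product_id}


def failing_callers(product_ids: list[str], flags: dict[str, bool]) -> list[str]:
    # Every (hypothetical) caller resolves products through get_product,
    # so a single failing lookup surfaces as an error in each of them.
    failed = []
    for caller in ("recommendations", "checkout"):
        try:
            for product_id in product_ids:
                get_product(product_id, flags)
        except ProductCatalogError:
            failed.append(caller)
    return failed
```

With the flag on and OLJCESPC7Z among the requested IDs, both callers report a failure; with the flag off, neither does.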

Not sure if we should work around this for demo purposes.

@reese-lee I'd love to hear your opinion on this.

julianocosta89 commented 1 week ago

@reese-lee we discussed this in the SIG meeting and it seems that this is expected. The frontend does call the productCatalog once it gets the list of products from the recommendation service.

https://github.com/open-telemetry/opentelemetry-demo/blob/main/src/frontend/pages/api/recommendations.ts#L21
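The flow can be sketched roughly as follows. This is an illustrative Python model, not a translation of the linked TypeScript file: `list_recommendations`, `get_product`, `recommendations_handler`, and the `PLACEHOLDER01` ID are assumptions, and the flag is assumed to be enabled.

```python
import asyncio

FAILING_PRODUCT_ID = "OLJCESPC7Z"  # product ID named in this issue


async def list_recommendations() -> list[str]:
    # Stand-in for the recommendation service: it returns product IDs only
    # and succeeds even when the failing ID is in the list.
    return ["PLACEHOLDER01", FAILING_PRODUCT_ID]


async def get_product(product_id: str) -> dict[str, str]:
    # Stand-in for the product catalog with productCatalogFailure enabled.
    if product_id == FAILING_PRODUCT_ID:
        raise RuntimeError(f"GetProduct failed for {product_id}")
    return {"id": product_id}


async def recommendations_handler() -> list[dict[str, str]]:
    # Step 1: fetch the recommended IDs.
    product_ids = await list_recommendations()
    # Step 2: resolve each ID through the product catalog. By default,
    # asyncio.gather propagates the first exception to the caller, so one
    # failing lookup fails the whole request.
    return await asyncio.gather(*(get_product(pid) for pid in product_ids))
```

Calling `asyncio.run(recommendations_handler())` raises `RuntimeError` even though `list_recommendations()` itself completes without error.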

reese-lee commented 5 days ago

@julianocosta89 I see. I'm trying to understand why recommendationservice isn't impacted by the issue. So recommendationservice doesn't generate an error because it only generates the list of products (including the problem product ID), which is then supplied to productcatalogservice when requested by frontend. Since the error doesn't occur until productcatalogservice attempts to return that product to frontend, there's no issue for recommendationservice?

Edited to clarify: the span generated by recommendationservice when it executes get_product_list does not result in an error when returning a product list that contains product ID OLJCESPC7Z; the spans generated by frontend, checkoutservice, and productcatalogservice that contain that product ID do.
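A small sketch of that span picture, using a hand-rolled recorder instead of the OpenTelemetry SDK so it stays self-contained. The `span` helper, the `frontend` operation name `recommendations`, and the `PLACEHOLDER01` ID are hypothetical; spans that finish without an exception are recorded as `UNSET`, mirroring OpenTelemetry's default span status.

```python
from contextlib import contextmanager

FAILING_PRODUCT_ID = "OLJCESPC7Z"  # product ID named in this issue
spans: list[tuple[str, str, str]] = []  # (service, operation, status)


@contextmanager
def span(service: str, operation: str):
    # Record ERROR if the wrapped work raises, UNSET otherwise.
    try:
        yield
    except Exception:
        spans.append((service, operation, "ERROR"))
        raise
    else:
        spans.append((service, operation, "UNSET"))


def get_product_list() -> list[str]:
    # Returning the problem ID is not an error for recommendationservice.
    with span("recommendationservice", "get_product_list"):
        return ["PLACEHOLDER01", FAILING_PRODUCT_ID]


def get_product(product_id: str) -> dict[str, str]:
    # Modeled with productCatalogFailure enabled.
    with span("productcatalogservice", "GetProduct"):
        if product_id == FAILING_PRODUCT_ID:
            raise RuntimeError(f"GetProduct failed for {product_id}")
        return {"id": product_id}


def frontend_recommendations() -> list[dict[str, str]]:
    # The error raised by GetProduct propagates into the frontend span.
    with span("frontend", "recommendations"):
        return [get_product(pid) for pid in get_product_list()]
```

Running `frontend_recommendations()` and inspecting `spans` afterwards shows the `get_product_list` entry without an error, while the `GetProduct` call for OLJCESPC7Z and the enclosing `frontend` entry are both marked `ERROR`.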