Document multi-zone / multi-region setup

jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform

https://www.jaegertracing.io/

Apache License 2.0


Document multi-zone / multi-region setup #1833

Open naseemkullah opened 4 years ago

naseemkullah commented 4 years ago

Requirement - what kind of business use case are you trying to solve?

Application is spread out across multiple clusters and traffic is routed to a cluster based on client location. How to have one Jaeger UI to see traces in all the clusters?

Is it possible to have the collector, db, query, and ui all in one cluster, with the agents running in multiple clusters sending their traces to it? And if so, is there a way to tag/label the span with the cluster/region name?

Problem - what in Jaeger blocks you from solving the requirement?

There is no documentation on what the possible/best patterns are.

yurishkuro commented 4 years ago

I am tagging this as a documentation request.

We don't have good support for multi-region setup yet, unless you're willing to ship all data to a single region.

For a multi-zone setup within a single region, it is possible to run agents in all zones and instruct them to forward traces to collectors running elsewhere (we use this configuration at Uber). You can even run collectors in multiple zones, as long as your storage is regional (and ideally replicated across zones so that it tolerates a single-zone failure).
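A minimal sketch of the multi-zone layout described above: the Python helper below assembles a command line for an agent in one zone that forwards to a collector running elsewhere, and attaches region/zone tags so spans can be told apart later. The collector address, the region and zone names, and the helper name `agent_args` are hypothetical. `--reporter.grpc.host-port` and `--agent.tags` are assumed to be the relevant jaeger-agent flags; verify them against the CLI reference for your Jaeger version before relying on this.

```python
def agent_args(collector_host_port, region, zone):
    """Build jaeger-agent arguments for an agent that forwards to a remote collector.

    Assumption: --reporter.grpc.host-port and --agent.tags are the flag names
    supported by the jaeger-agent version in use.
    """
    # Tags attached by the agent, so each span's process records where it came from.
    tags = f"region={region},zone={zone}"
    return [
        "jaeger-agent",
        f"--reporter.grpc.host-port={collector_host_port}",
        f"--agent.tags={tags}",
    ]


if __name__ == "__main__":
    # Hypothetical collector address and zone name, for illustration only.
    print(" ".join(agent_args("jaeger-collector.example.internal:14250",
                              "us-east", "us-east-1a")))
```

Generating the arguments per zone keeps the only zone-specific values (collector address and tags) in one place.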

naseemkullah commented 4 years ago

Thanks @yurishkuro, could the documentation also mention any caveats of a multi-region setup where all data is shipped to a single region?

yurishkuro commented 4 years ago

The main caveats are:

- bandwidth between regions is usually more expensive than between zones
- if all trace data is shipped to a single region and that region has an outage, then you lose visibility into the other regions

If your architecture generally does not have cross-region calls, then it's better to run separate Jaeger installations per region. That's what we do at Uber. In our case we do have some percentage of requests going cross-region, and we don't yet have a good story for those. In my book I discussed several ways of dealing with that.

naseemkullah commented 4 years ago

Thanks for explaining that, @yurishkuro. Finally, as for accessing the various UIs, what do you think is better, and what is used at Uber?

1. jaeger-us-east.uber.com jaeger-us-west.uber.com

2. jaeger.uber.com/us-east jaeger.uber.com/us-west

yurishkuro commented 4 years ago

I don't think there's much difference, it's more about how your infra is configured.
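For illustration, the two addressing schemes from the question can be sketched as a small Python helper. The domain `example.com`, the scheme labels, and the function name `ui_url` are hypothetical. One practical difference worth checking: with the path-based scheme, each regional query service has to serve its UI under a prefix, which is assumed here to be what the jaeger-query `--query.base-path` option is for.

```python
def ui_url(region, scheme, domain="example.com"):
    """Return the Jaeger UI URL for a region under one of two addressing schemes.

    "subdomain" -> https://jaeger-<region>.<domain>/
    "path"      -> https://jaeger.<domain>/<region>/
    """
    if scheme == "subdomain":
        return f"https://jaeger-{region}.{domain}/"
    if scheme == "path":
        # Assumption: the regional query service is configured to serve
        # under the /<region> prefix.
        return f"https://jaeger.{domain}/{region}/"
    raise ValueError(f"unknown scheme: {scheme!r}")


if __name__ == "__main__":
    for region in ("us-east", "us-west"):
        print(ui_url(region, "subdomain"), ui_url(region, "path"))
```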

pavolloffay commented 4 years ago

Should we move this to documentation repository?

yurishkuro commented 4 years ago

I'd keep it here for better discoverability. But no strong opinion.

winshenting commented 3 years ago


I have a similar question. We have one Elasticsearch database collecting qa and cm data. In qa the index-prefix is qa, so the qa Jaeger portal only shows qa data. In cm the index-prefix is cm, so the cm Jaeger portal only shows cm data. We want to consolidate the portals so that a single one shows both qa and cm data. How do we configure that?

jpkrohling commented 3 years ago

The Jaeger Query ("portal", in your message) is not able to use multiple storages at the same time.

teoyaomiqui commented 3 years ago

@jpkrohling I have the same setup that @winshenting has pointed out. What would be the best way to address this? Should we keep separate index-prefixes with multiple Jaeger UIs running? Or is it possible to write everything into a single index and somehow aggregate traces by cluster (service names are the same in different clusters)? Also, would this address cross-cluster traces?

aimuz commented 1 year ago

Is there such a service that enables aggregation of multiple jaeger query data?

I plan to deploy one jaeger per machine in multiple clusters, but with this, I won't be able to see the links of multiple clusters using the jaeger ui.

There will be cross-cluster calls to the service.
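No such aggregation service is identified in this thread. As a rough sketch of what client-side aggregation could look like, the Python function below merges spans fetched separately from several clusters into per-trace lists. The input shape (a dict of cluster name to a list of span dicts with `traceID`, `spanID`, and `startTime`) is a simplified assumption and not the actual Jaeger query API schema. Fetching the spans from each installation is left out. Cross-cluster calls only end up in the same merged trace if trace context is propagated between clusters, so that both sides share one trace ID.

```python
from collections import defaultdict


def merge_traces(spans_by_cluster):
    """Group spans from several clusters by trace ID.

    spans_by_cluster: {cluster_name: [span_dict, ...]}, where each span_dict
    has "traceID", "spanID" and "startTime" keys (hypothetical, simplified shape).
    Returns {traceID: [span, ...]} with spans sorted by start time and each
    span annotated with the cluster it was read from.
    """
    merged = defaultdict(dict)  # traceID -> {spanID: span}
    for cluster, spans in spans_by_cluster.items():
        for span in spans:
            tagged = dict(span, cluster=cluster)
            # Keep the first copy of a span if two clusters report the same ID.
            merged[span["traceID"]].setdefault(span["spanID"], tagged)
    return {
        trace_id: sorted(by_id.values(), key=lambda s: s["startTime"])
        for trace_id, by_id in merged.items()
    }


if __name__ == "__main__":
    example = {
        "us-east": [{"traceID": "t1", "spanID": "a", "startTime": 1}],
        "us-west": [{"traceID": "t1", "spanID": "b", "startTime": 2}],
    }
    print(merge_traces(example))
```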