jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform
https://www.jaegertracing.io/
Apache License 2.0
20.34k stars 2.43k forks source link

Provide multiple datacenters to Cassandra Schema generation scripts #2618

Open Kerptastic opened 3 years ago

Kerptastic commented 3 years ago

Requirement - what kind of business use case are you trying to solve?

We have a multi-datacenter configuration. Currently, when using the jaeger-cassandra-schema container, the keyspace being generated can only be configured using 1) a single datacenter; 2) a hard-coded replication factor of 2.

Example of current: {'class': 'NetworkTopologyStrategy', 'dc1': '2' }

This results in the following error:

 panic: first replica is not the primary replica for the token: expected 10.120.42.3 got 10.121.4.7

Our Cassandra DBEs propose that the NetworkTopologyStrategy have a minimum replication factor of 3, and that each datacenter be configured during keyspace creation.

Example of requested: {'class': 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3' }

Problem - what in Jaeger blocks you from solving the requirement?

We currently use the Jaeger Operator and the CassandraSchema container to prep and manage our Jaeger deployment. Without this change, we need to fork the repo and build our own CassandraSchema container in order to successfully deploy to a multi-dc Cassandra. This would also require us to internally maintain a fork in order to recreate the containers, if need be.

Proposal - what do you suggest to solve the problem or improve the existing situation?

A suggestion would be to edit the plugin/storage/cassandra/schema/create|docker.sh to allow for comma-delimited DATACENTER value to be provided, and created the appropriate replication_factor value when multiple datacenters are indeed provided.

I have a PR for this ready, and tested against our Cassandra multi-dc configuration, and it works as expected.

Any open questions to address

jpkrohling commented 3 years ago

Sounds sane to me, but would be good to have an opinion on someone else using Cassandra as well. @joe-elliott, are you using it?

joe-elliott commented 3 years ago

We do use Cassandra, but we do not have a multi-dc setup. The script already supports custom replication factor:

https://github.com/jaegertracing/jaeger/blob/master/plugin/storage/cassandra/schema/create.sh#L14

It seems like we just need support for the comma delimited datacenters? I'd be glad to test the changes on our setup once the PR exists.

Kerptastic commented 3 years ago

@joe-elliott The replication factor is supported in the create.sh but doesn't get passed via the docker.sh, nor do I think it is supported as a parameter to the jaeger-operator or helm chart. I was hesitant to go down the rabbit hole on this one, so I updated the default for prod from 2 to 3. PR incoming.

LuChenjing commented 3 years ago

@Kerptastic Hi, may I ask whether this PR been merged? And are you able to provide a example for the multi-dc cassandra, as I didn't see the param of REPLICATION_FACTOR in jaeger_types.go

Kerptastic commented 3 years ago

@Kerptastic Hi, may I ask whether this PR been merged? And are you able to provide a example for the multi-dc cassandra, as I didn't see the param of REPLICATION_FACTOR in jaeger_types.go

No it has not, I totally spaced finishing this actually as we are exploring ES for the backend.

yurishkuro commented 8 months ago

The schema files already support a $replication parameter that can be customized as desired. However, the wrapping script only supports REPLICATION_FACTOR env var that is used to compose $replication: https://github.com/jaegertracing/jaeger/blob/e08f576fd64a992ef0396112bc8401472cc9dd92/plugin/storage/cassandra/schema/create.sh#L41-L54

Proposal: support top-level REPLICATION env var that will supersede the above logic