apache / cassandra-gocql-driver

GoCQL Driver for Apache Cassandra®
https://cassandra.apache.org/
Apache License 2.0

Azure Cosmos DB Cassandra API support #1205

Closed sylr closed 6 years ago

sylr commented 6 years ago

What version of Gocql are you using?

Master

What did you do?

Tried to use https://github.com/jaegertracing/jaeger (which uses gocql as its Cassandra driver) against Azure Cosmos DB with the Cassandra API.

What did you expect to see?

I expected Jaeger to connect to Cosmos DB successfully.

What did you see instead?

gocql: unable to dial control conn 104.45.X.X: Cql request had unsupported headers Compression

sylr commented 6 years ago

If anyone is interested in giving it a look I can provide a Cosmos DB instance.

alourie commented 6 years ago

@sylr just out of curiosity, did you try connecting to Cosmos DB with a pure gocql test program?

sylr commented 6 years ago

Nope, where can I find that ?

beltran commented 6 years ago

@sylr I'd imagine that error would happen if gocql has ClusterConfig.Compressor set but compression is not supported on the server side. I don't know if Jaeger allows you to tweak this. If it doesn't, you can probably open an issue there. This doesn't look at all like a bug in gocql.

Zariel commented 6 years ago

We should probably do an OPTIONS call if the config has compression enabled, and only enable compression if the backend supports it.
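
To make the OPTIONS idea concrete, here is a rough, stdlib-only sketch of the negotiation at the wire level. It is not gocql's implementation: the names `optionsFrame`, `parseSupported` and `supportsCompression` are hypothetical, and the frame layout is the 9-byte header and `[string multimap]` body described in the CQL native protocol v3/v4 specification. A client would send the OPTIONS frame, read the SUPPORTED reply, and only configure a compressor if the algorithm is listed under `COMPRESSION`.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Opcodes taken from the CQL native protocol specification (v3/v4).
const (
	opOptions   = 0x05
	opSupported = 0x06
)

var errTruncated = errors.New("truncated SUPPORTED body")

// optionsFrame builds a body-less OPTIONS request: version, flags,
// 2-byte stream id, opcode, 4-byte body length (zero here).
func optionsFrame(version byte, stream int16) []byte {
	f := make([]byte, 9)
	f[0] = version & 0x7f // high bit clear marks a request
	f[1] = 0              // no flags, so the compression flag is not set
	binary.BigEndian.PutUint16(f[2:4], uint16(stream))
	f[4] = opOptions
	return f
}

// reader walks a frame body and remembers the first decoding error.
type reader struct {
	buf []byte
	err error
}

func (r *reader) short() int {
	if r.err != nil {
		return 0
	}
	if len(r.buf) < 2 {
		r.err = errTruncated
		return 0
	}
	v := binary.BigEndian.Uint16(r.buf)
	r.buf = r.buf[2:]
	return int(v)
}

func (r *reader) str() string {
	n := r.short()
	if r.err != nil {
		return ""
	}
	if len(r.buf) < n {
		r.err = errTruncated
		return ""
	}
	s := string(r.buf[:n])
	r.buf = r.buf[n:]
	return s
}

// parseSupported decodes the [string multimap] body of a SUPPORTED frame.
func parseSupported(body []byte) (map[string][]string, error) {
	r := &reader{buf: body}
	out := make(map[string][]string)
	for i, n := 0, r.short(); i < n; i++ {
		key := r.str()
		var vals []string
		for j, cnt := 0, r.short(); j < cnt; j++ {
			vals = append(vals, r.str())
		}
		if r.err != nil {
			return nil, r.err
		}
		out[key] = vals
	}
	return out, r.err
}

// supportsCompression reports whether algo is listed under COMPRESSION.
func supportsCompression(supported map[string][]string, algo string) bool {
	for _, v := range supported["COMPRESSION"] {
		if v == algo {
			return true
		}
	}
	return false
}

// Small encoders, used here only to build a sample body.
func appendShort(b []byte, n int) []byte { return append(b, byte(n>>8), byte(n)) }

func appendString(b []byte, s string) []byte { return append(appendShort(b, len(s)), s...) }

func main() {
	// Sample SUPPORTED body from a server that advertises no compression.
	body := appendShort(nil, 1)
	body = appendString(body, "COMPRESSION")
	body = appendShort(body, 0)

	supported, err := parseSupported(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("OPTIONS frame: % x\n", optionsFrame(0x04, 0))
	fmt.Println("enable snappy:", supportsCompression(supported, "snappy"))
}
```

With a check like this, a configured compressor would simply be dropped for backends that do not advertise it, instead of failing the control connection.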

sylr commented 6 years ago

@beltran It does indeed

https://github.com/jaegertracing/jaeger/blob/7919cd98d30b858808941d72db31015a0317bd2d/pkg/cassandra/config/config.go#L127

I'll try removing it to see if things improve.

sylr commented 6 years ago

It works "better" if I remove the compressor.

{"level":"info","ts":1539000161.6181421,"caller":"healthcheck/handler.go:99","msg":"Health Check server started","http-port":14269,"status":"unavailable"}
2018/10/08 14:02:48 error: failed to connect to 127.0.0.1:9999 due to error: Keyspace jaeger_v1_test doesn't exist
{"level":"fatal","ts":1539000173.0581703,"caller":"collector/main.go:100","msg":"Failed to init storage factory","error":"no connections were made when creating the session","stacktrace":"main.main.func1\n\t/home/sylvain/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:100\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/sylvain/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:698\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/sylvain/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:783\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/sylvain/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:736\nmain.main\n\t/home/sylvain/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:189\nruntime.main\n\t/usr/lib/go-1.10/src/runtime/proc.go:198"}

beltran commented 6 years ago

@sylr you must be setting cluster.Keyspace = "jaeger_v1_test" but the keyspace doesn't exist.

jhonyam commented 5 years ago

@Zariel were you able to get jaeger connected to Cosmos db? I am trying to, but having no luck

I am able to connect to Cosmos DB fine with the Quick start steps on the Azure portal:

export SSL_VERSION=SSLv23
export SSL_VALIDATE=false

cqlsh.py yammer-tracie.cassandra.cosmosdb.azure.com 10350 -u yammer-tracie -p password_from_azure --ssl

I tried building jaeger-query with the latest version of gocql and also with the version that has this fix (44e29ed5b8a4b4fff39d7ebaa19976c6f852075b), and had no luck either way.

Command: ./jaeger-query --query.static-files=jaeger-ui-build/build/ --cassandra.servers=yammer-tracie.cassandra.cosmosdb.azure.com --cassandra.port=10350 --cassandra.username=yammer-tracie --cassandra.password=password_from_azure --cassandra.tls.verify-host=false --cassandra.tls=true

Error:

{"level":"info","ts":1541018500.2958531,"caller":"healthcheck/handler.go:99","msg":"Health Check server started","http-port":16687,"status":"unavailable"}
{"level":"fatal","ts":1541018505.7411027,"caller":"query/main.go:105","msg":"Failed to init storage factory","error":"gocql: unable to create session: unable to fetch peer host info: EOF","stacktrace":"main.main.func1\n\t/Users/jhon/go/src/github.com/jaegertracing/jaeger/cmd/query/main.go:105\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/Users/jhon/go/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:698\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/Users/jhon/go/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:783\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/Users/jhon/go/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:736\nmain.main\n\t/Users/jhon/go/src/github.com/jaegertracing/jaeger/cmd/query/main.go:172\nruntime.main\n\t/usr/local/Cellar/go/1.11.1/libexec/src/runtime/proc.go:201"}

jhonyam commented 5 years ago

@sylr is who I should have mentioned

jhonyam commented 5 years ago

We removed this line https://github.com/jaegertracing/jaeger/blob/7919cd98d30b858808941d72db31015a0317bd2d/pkg/cassandra/config/config.go#L127 and built jaeger and also added --cassandra.connections-per-host=1 so that it wouldn't look for peer host info. This allowed us to connect!

lukasmrtvy commented 5 years ago

@jhonyam What exactly did You do?

I am running this image: (removed compressor in patch file) https://github.com/lukasmrtvy/docker-jaeger-cosmosdb/blob/master/Dockerfile

docker run -d \
--name jaeger-collector \
-p 9411:9411 \
-e SPAN_STORAGE_TYPE=cassandra \
-e CASSANDRA_SERVERS=myaccount.cassandra.cosmosdb.azure.com \
-e CASSANDRA_PORT=10350 \
-e CASSANDRA_PASSWORD=mypassword \
-e CASSANDRA_USERNAME=myusername  \
-e CASSANDRA_TLS=true \
-e CASSANDRA_KEYSPACE=jaeger_v1_dc1 \
-e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
-e CASSANDRA_CONNECTIONS_PER_HOST=1 \
-e CASSANDRA_TLS_VERIFY_HOST=false \
jaeger

but still no luck :/

{"level":"fatal","ts":1546610176.0378273,"caller":"collector/main.go:103","msg":"Failed to init storage factory","error":"gocql: unable to create session: control: unable to connect to initial hosts: dial tcp X.X.X.X:10350: i/o timeout","stacktrace":"main.main.func1\n\t/go/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:103\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/go/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/go/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/go/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/go/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:203\nruntime.main\n\t/usr/lib/go-1.10/src/runtime/proc.go:198"}
jmhon08 commented 5 years ago

Mine is like this:

exec ./jaeger-query --query.static-files=/go/src/jaeger-ui-build/build/ \
--cassandra.servers=$CASSANDRA_SERVERS \
--cassandra.port=10350 \
--cassandra.username=yammer-tracie \
--cassandra.password=$CASSANDRA_JAEGER_PASSWORD \
--cassandra.keyspace=$CASSANDRA_KEYSPACE \
--query.port=10579 \
--cassandra.tls=True \
--cassandra.tls.verify-host=False

So maybe try cassandra.tls.verify-host=False ?
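
Since the error quoted earlier reads `dial tcp X.X.X.X:10350: i/o timeout`, the failure looks like it happens while opening the TCP connection, before TLS host verification would come into play. A small stdlib-only probe can help tell the two stages apart. This is only a sketch: `probeConfig` and `probe` are hypothetical helper names, port 10350 is the one used throughout this thread, and `InsecureSkipVerify` is assumed to be the closest stdlib equivalent of `--cassandra.tls.verify-host=false` (it skips the whole certificate check, not just the host name).

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"os"
	"time"
)

// probeConfig builds a TLS client config for the given host.
// verifyHost=false disables certificate verification entirely.
func probeConfig(serverName string, verifyHost bool) *tls.Config {
	return &tls.Config{
		ServerName:         serverName,
		InsecureSkipVerify: !verifyHost,
	}
}

// probe dials addr over TCP and then performs a TLS handshake.
// It returns the stage that failed ("tcp" or "tls"), or "ok".
func probe(addr string, cfg *tls.Config, timeout time.Duration) (string, error) {
	raw, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return "tcp", err
	}
	defer raw.Close()

	// Bound the handshake with the same timeout as the dial.
	if err := raw.SetDeadline(time.Now().Add(timeout)); err != nil {
		return "tcp", err
	}
	if err := tls.Client(raw, cfg).Handshake(); err != nil {
		return "tls", err
	}
	return "ok", nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: probe <account>.cassandra.cosmosdb.azure.com")
		return
	}
	host := os.Args[1]
	addr := net.JoinHostPort(host, "10350")
	stage, err := probe(addr, probeConfig(host, false), 10*time.Second)
	fmt.Printf("stage=%s err=%v\n", stage, err)
}
```

If the probe already fails at the `tcp` stage from inside the container, the cause is more likely networking (firewall rules, DNS, outbound port 10350) than anything in the Jaeger or gocql configuration.
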

lukasmrtvy commented 5 years ago

@jmhon08 are You able to use Azure Cosmos DB with query and collector, or what? Look https://github.com/jaegertracing/jaeger/issues/1105#issuecomment-452207550

jmhon08 commented 5 years ago

I am able to connect with query. I added a service_name to the table in Cosmos DB and I see it show up in the UI, but I am running into an issue with collector when it tries to insert a row into the traces table.

Query (simplified):

INSERT INTO traces(trace_id, span_id, span_hash, parent_id, operation_name, flags, start_time, duration, tags, logs, refs, process) VALUES(5f02ea5b9da79dc1 8015088342021266700 419548962199218664 0 get 0 1548878049447389 2308 );

Error:

SyntaxException: line 1:95 no viable alternative at input 'duration (..., span_id, span_hash, parent_id, operation_name, flags, start_time, duration...)

This is how I had to create the table, which is slightly modified from what you get when you run MODE=test sh ./plugin/storage/cassandra/schema/create.sh

CREATE TABLE IF NOT EXISTS jaeger_v1_test.traces ("trace_id" blob,"span_id" bigint,"span_hash" bigint,"parent_id" bigint,"operation_name" text,"flags" int,"start_time" bigint,"duration" bigint,"tags" list<frozen<keyvalue>>,"logs" list<frozen<log>>,"refs" list<frozen<span_ref>>,"process" frozen<process>,PRIMARY KEY (trace_id, span_id, span_hash)) WITH compaction = {'compaction_window_size': '1','compaction_window_unit': 'HOURS','class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

I don't see why duration is causing trouble here. Yesterday, I asked the folks at Ask Azure Cosmos DB askcosmosdb@microsoft.com if they know what's wrong. Still waiting for a reply.
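
One observation: column 95 of that statement is exactly where the unquoted `duration` starts, and the table could only be created with the column names quoted. It may therefore be worth testing the INSERT with quoted column names as well. Whether Cosmos DB accepts that is not confirmed anywhere in this thread, so treat the following as an experiment rather than a fix. `quoteIdent` and `insertStmt` are hypothetical helpers, not part of Jaeger or gocql.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps a CQL identifier in double quotes and doubles any
// embedded double quote. Quoted identifiers are case-sensitive in CQL.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// insertStmt builds an INSERT with quoted column names and one bind
// marker per column.
func insertStmt(table string, cols []string) string {
	quoted := make([]string, len(cols))
	marks := make([]string, len(cols))
	for i, c := range cols {
		quoted[i] = quoteIdent(c)
		marks[i] = "?"
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		table, strings.Join(quoted, ", "), strings.Join(marks, ", "))
}

func main() {
	cols := []string{
		"trace_id", "span_id", "span_hash", "parent_id", "operation_name",
		"flags", "start_time", "duration", "tags", "logs", "refs", "process",
	}
	fmt.Println(insertStmt("traces", cols))
}
```

The printed statement can be pasted into cqlsh with literal values in place of the `?` markers to see whether the parser still stops at `duration`.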

jmhon08 commented 5 years ago

Here's the full list of my queries after I edited them to work with the Cassandra API. They all run fine as queries, but either I have done something wrong, since Jaeger can't write to the traces table, or the Cosmos folks have a bug.

drop keyspace jaeger_v1_test;

CREATE KEYSPACE IF NOT EXISTS jaeger_v1_test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

CREATE TYPE IF NOT EXISTS jaeger_v1_test.keyvalue ("key" text,"value_type" text,"value_string" text,"value_bool" boolean,"value_long" bigint,"value_double" double,"value_binary" blob);

CREATE TYPE IF NOT EXISTS jaeger_v1_test.log ("ts" bigint,"fields" list<frozen<keyvalue>>);

CREATE TYPE IF NOT EXISTS jaeger_v1_test.span_ref ("ref_type" text,"trace_id" blob,"span_id" bigint);

CREATE TYPE IF NOT EXISTS jaeger_v1_test.process ("service_name" text,"tags" list<frozen<keyvalue>>);

CREATE TABLE IF NOT EXISTS jaeger_v1_test.traces ("trace_id" blob,"span_id" bigint,"span_hash" bigint,"parent_id" bigint,"operation_name" text,"flags" int,"start_time" bigint,"duration" bigint,"tags" list<frozen<keyvalue>>,"logs" list<frozen<log>>,"refs" list<frozen<span_ref>>,"process" frozen<process>,PRIMARY KEY (trace_id, span_id, span_hash)) WITH compaction = {'compaction_window_size': '1','compaction_window_unit': 'HOURS','class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TABLE IF NOT EXISTS jaeger_v1_test.service_names ("service_name" text,PRIMARY KEY ("service_name")) WITH compaction = {'min_threshold': '4','max_threshold': '32','class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TABLE IF NOT EXISTS jaeger_v1_test.operation_names ("service_name" text,"operation_name" text,PRIMARY KEY (("service_name"), "operation_name")) WITH compaction = {'min_threshold': '4','max_threshold': '32','class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TABLE IF NOT EXISTS jaeger_v1_test.service_operation_index ("service_name" text,"operation_name" text,"start_time" bigint,"trace_id" blob,PRIMARY KEY (("service_name", "operation_name"), "start_time")) WITH CLUSTERING ORDER BY (start_time DESC) AND compaction = {'compaction_window_size': '1','compaction_window_unit': 'HOURS','class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TABLE IF NOT EXISTS jaeger_v1_test.service_name_index ("service_name" text,"bucket" int,"start_time" bigint,"trace_id" blob,PRIMARY KEY (("service_name", "bucket"), "start_time")) WITH CLUSTERING ORDER BY (start_time DESC) AND compaction = {'compaction_window_size': '1','compaction_window_unit': 'HOURS','class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TABLE IF NOT EXISTS jaeger_v1_test.duration_index ("service_name" text,"operation_name" text,"bucket" timestamp, "duration" bigint,"start_time" bigint,"trace_id" blob, PRIMARY KEY ((service_name, operation_name, bucket), "duration", start_time, trace_id)) WITH CLUSTERING ORDER BY ("duration" DESC, start_time DESC, trace_id DESC) AND compaction = {'compaction_window_size': '1', 'compaction_window_unit': 'HOURS', 'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TABLE IF NOT EXISTS jaeger_v1_test.tag_index (service_name text,tag_key text,tag_value text,start_time bigint,trace_id blob,span_id bigint,PRIMARY KEY ((service_name, tag_key, tag_value), start_time, trace_id, span_id)) WITH CLUSTERING ORDER BY (start_time DESC, trace_id DESC, span_id DESC) AND compaction = {'compaction_window_size': '1','compaction_window_unit': 'HOURS','class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'} AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 172800 AND speculative_retry = 'NONE' AND gc_grace_seconds = 0;

CREATE TYPE IF NOT EXISTS jaeger_v1_test.dependency ("parent" text,"child" text,"call_count" bigint);

CREATE TABLE IF NOT EXISTS jaeger_v1_test.dependencies (ts timestamp,ts_index timestamp,dependencies list<frozen<dependency>>,PRIMARY KEY (ts)) WITH compaction = {'min_threshold': '4','max_threshold': '32','class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} AND default_time_to_live = 0;

CREATE CUSTOM INDEX ON jaeger_v1_test.dependencies (ts_index) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'SPARSE'};

The last query won't run because the Cosmos DB folks said that creating an index is not available yet.

Zariel commented 5 years ago

This is not related to gocql, so I am locking this. gocql will work with Cosmos DB as long as it implements the Cassandra binary protocol.