open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

New component: Cassandra Exporter #17910

Closed emreyalvac closed 1 year ago

emreyalvac commented 1 year ago

The purpose and use-cases of the new component

The purpose of this exporter is to export traces and logs to a Cassandra database.

I have already started developing this component here: https://github.com/emreyalvac/opentelemetry-collector-contrib/tree/cassandra-exporter-implementation.

https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18515

Example configuration for the component

exporters:
  cassandra:
    dsn: 127.0.0.1
    keyspace: "otel"
    trace_table: "otel_spans"
    logs_table: "otel_logs"

service:
  pipelines:
    traces:
      receivers: [ otlp ]
      exporters: [ cassandra ]
    logs:
      receivers: [ otlp ]
      exporters: [ cassandra ]

Telemetry data types supported

traces, logs

Is this a vendor-specific component?

Sponsor (optional)

No response

Additional context

No response

atoulme commented 1 year ago

Can you explain the use case a bit more? Is there a standard data format in which the traces will be stored?

emreyalvac commented 1 year ago

Hi @atoulme,

My thought is that write speed is very important for OpenTelemetry.

Cassandra is defined as an open-source NoSQL data storage system that leverages a distributed architecture to enable high availability, scalability, and reliability, managed by the Apache non-profit organization.

Cassandra is very fast for write operations and well suited to analytics data. It also supports storing time series data, which is why you can calculate throughput, response time, Apdex, etc. from it.

Cassandra’s three data modeling ‘dogmas’:

Disk space is cheap.
Writes are cheap.
Network communication is expensive.

Example span data in the Cassandra database:

[
  {
    "traceid": "104077629213055e8523102a57c659cd",
    "duration": 75957000,
    "events": null,
    "links": null,
    "parentspanid": "",
    "resourceattributes": {
      "service.name": "unknown_service:dotnet"
    },
    "servicename": "unknown_service:dotnet",
    "spanattributes": {
      "http.flavor": "1.1",
      "http.host": "localhost:5000",
      "http.method": "GET",
      "http.scheme": "http",
      "http.status_code": "200",
      "http.target": "/swagger/v1/swagger.json",
      "http.url": "http://localhost:5000/swagger/v1/swagger.json",
      "http.user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
    },
    "spanid": "c123d6dae1744ce3",
    "spankind": "SPAN_KIND_SERVER",
    "spanname": "/swagger/v1/swagger.json",
    "statuscode": "STATUS_CODE_UNSET",
    "statusmessage": "",
    "timestamp": "2023-01-22",
    "tracestate": ""
  }
]
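
For anyone reading the row back out, here is a minimal Go sketch of how a row shaped like the sample above could be modeled and decoded from JSON. The `SpanRow` type and `decodeSpanRows` helper are hypothetical names used for illustration; they are not taken from the exporter branch. Only a subset of the columns is mapped, and the JSON keys are copied from the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpanRow is a hypothetical Go view of one row of the span table.
// The JSON keys mirror the lowercase column names in the sample row.
type SpanRow struct {
	TraceID            string            `json:"traceid"`
	SpanID             string            `json:"spanid"`
	ParentSpanID       string            `json:"parentspanid"`
	SpanName           string            `json:"spanname"`
	SpanKind           string            `json:"spankind"`
	ServiceName        string            `json:"servicename"`
	Duration           int64             `json:"duration"`
	ResourceAttributes map[string]string `json:"resourceattributes"`
	SpanAttributes     map[string]string `json:"spanattributes"`
	StatusCode         string            `json:"statuscode"`
	Timestamp          string            `json:"timestamp"`
}

// decodeSpanRows parses a JSON array of rows; unmapped keys are ignored.
func decodeSpanRows(data []byte) ([]SpanRow, error) {
	var rows []SpanRow
	if err := json.Unmarshal(data, &rows); err != nil {
		return nil, fmt.Errorf("decode span rows: %w", err)
	}
	return rows, nil
}

func main() {
	sample := []byte(`[{"traceid":"104077629213055e8523102a57c659cd",
		"spanid":"c123d6dae1744ce3","duration":75957000,
		"servicename":"unknown_service:dotnet",
		"spanattributes":{"http.method":"GET"}}]`)
	rows, err := decodeSpanRows(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(rows[0].SpanID, rows[0].Duration, rows[0].SpanAttributes["http.method"])
}
```

`Duration` is typed as `int64` in this sketch so that it makes no assumption about the unit or range of the stored value.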
atoulme commented 1 year ago

So is it stored as a CQL table? What is the schema used?

atoulme commented 1 year ago

I found these in your implementation:


const (
    // language=SQL
    createDatabaseSQL = `CREATE KEYSPACE %s with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };`
    // language=SQL
    createEventTypeSql = `CREATE TYPE IF NOT EXISTS %s.Events (Timestamp Date, Name text, Attributes map<text, text>);`
    // language=SQL
    createLinksTypeSql = `CREATE TYPE IF NOT EXISTS %s.Links (TraceId text, SpanId text, TraceState text, Attributes map<text, text>);`
    // language=SQL
    createSpanTableSQL = `CREATE TABLE IF NOT EXISTS %s.%s (TimeStamp DATE,TraceId text, SpanId text, ParentSpanId text, TraceState text, SpanName text, SpanKind text, ServiceName text, ResourceAttributes map<text, text>, SpanAttributes map<text, text>, Duration int,StatusCode text,StatusMessage text, Events frozen<Events>, Links frozen<Links>, PRIMARY KEY (TraceId));`
)

That is intriguing. I'd like to see if you have considered looking into how to work on this with a cluster (I see replication factor set to 1) and particularly if you have a partition key strategy for this.

emreyalvac commented 1 year ago

Hi @atoulme,

Thanks for your time and review. I appreciate it.

Yes, the data is stored in Cassandra tables. I improved the config structure so that replication and compression can be set dynamically. I also changed the PRIMARY KEY to SpanId (with a single-column PRIMARY KEY, that column is also the partition key). Maybe we can create a composite partition key from ServiceName and SpanId.
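
To make the key options concrete, here is a small Go sketch that renders a CQL `PRIMARY KEY` clause from a list of partition columns and optional clustering columns. The helper name `primaryKeyClause` is hypothetical and not part of the exporter branch. It shows the composite partition key mentioned above, plus an alternative that partitions by TraceId and clusters by SpanId, which is only one possible design.

```go
package main

import (
	"fmt"
	"strings"
)

// primaryKeyClause renders a CQL PRIMARY KEY clause.
// The partition columns go inside the inner parentheses; any
// clustering columns follow them.
func primaryKeyClause(partition, clustering []string) string {
	key := "(" + strings.Join(partition, ", ") + ")"
	if len(clustering) > 0 {
		key += ", " + strings.Join(clustering, ", ")
	}
	return "PRIMARY KEY (" + key + ")"
}

func main() {
	// Composite partition key from ServiceName and SpanId.
	fmt.Println(primaryKeyClause([]string{"ServiceName", "SpanId"}, nil))
	// Partition by trace, with one row per span inside the partition.
	fmt.Println(primaryKeyClause([]string{"TraceId"}, []string{"SpanId"}))
}
```

Because a Cassandra INSERT overwrites any existing row with the same primary key, the key has to be unique per span. A key of TraceId alone would keep only one span per trace.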

Compression Types

https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html

Replication:

CREATE KEYSPACE otel WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

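In the example above, we create a keyspace called otel using SimpleStrategy with a replication factor of 3, so data inserted into this keyspace is replicated to three nodes. Note that SimpleStrategy places replicas on consecutive nodes of the ring without considering racks or datacenters; rack-aware placement requires NetworkTopologyStrategy.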
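
As a sketch of how the replication settings could be turned into that statement, the Go snippet below builds a `CREATE KEYSPACE` string from a small struct. The `Replication` type and the `createKeyspaceCQL` helper are assumptions made for illustration. Since the keyspace name has to be formatted into the statement, the sketch restricts it to a conservative identifier pattern, which is also an assumption rather than Cassandra's exact rule.

```go
package main

import (
	"fmt"
	"regexp"
)

// Replication holds the keyspace replication options (hypothetical type).
type Replication struct {
	Class             string
	ReplicationFactor int
}

// identifier is a deliberately conservative pattern for unquoted names.
var identifier = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_]*$`)

// createKeyspaceCQL renders a CREATE KEYSPACE statement for the given
// keyspace name and replication options.
func createKeyspaceCQL(keyspace string, r Replication) (string, error) {
	if !identifier.MatchString(keyspace) {
		return "", fmt.Errorf("invalid keyspace name %q", keyspace)
	}
	if !identifier.MatchString(r.Class) {
		return "", fmt.Errorf("invalid replication class %q", r.Class)
	}
	if r.ReplicationFactor < 1 {
		return "", fmt.Errorf("replication factor must be >= 1, got %d", r.ReplicationFactor)
	}
	return fmt.Sprintf(
		"CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': '%s', 'replication_factor': %d};",
		keyspace, r.Class, r.ReplicationFactor), nil
}

func main() {
	stmt, err := createKeyspaceCQL("otel", Replication{Class: "SimpleStrategy", ReplicationFactor: 3})
	if err != nil {
		panic(err)
	}
	fmt.Println(stmt)
}
```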

When I run the Cassandra exporter with the following config, the resulting schema looks like this:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  cassandra:
    dsn: 127.0.0.1
    keyspace: "otel"
    trace_table: "otel_spans"
    replication:
      class: "SimpleStrategy"
      replication_factor: 1
    compression:
      algorithm: "ZstdCompressor"

service:
  pipelines:
    traces:
      receivers: [ otlp ]
      exporters: [ cassandra ]

Schema:

otel: schema durable_writes: true replication: {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'}
    + object-types
        events: object-type
            + object-attributes
                timestamp: date
                name: text
                attributes: map<text, text>
        links: object-type
            + object-attributes
                traceid: text
                spanid: text
                tracestate: text
                attributes: map<text, text>
    + tables
        otel_spans: table compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.ZstdCompressor'}
            + columns
                traceid: text
                duration: int
                events: frozen<events>
                links: frozen<links>
                parentspanid: text
                resourceattributes: map<text, text>
                servicename: text
                spanattributes: map<text, text>
                spanid: text
                spankind: text
                spanname: text
                statuscode: text
                statusmessage: text
                timestamp: date
                tracestate: text
            + keys
                primary key: (spanid)

Default config:

{
    DSN:        "127.0.0.1",
    Keyspace:   "otel",
    TraceTable: "otel_spans",
    Replication: Replication{
        Class:             "SimpleStrategy",
        ReplicationFactor: 1,
    },
    Compression: Compression{
        Algorithm: "LZ4Compressor",
    },
}
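
For context, here is a sketch of the struct definitions such a literal implies, with a defaults function and a basic validation step. The field names and default values are taken from the literal above and the `mapstructure` keys from the YAML config. The type layout, `defaultConfig`, and the `Validate` rules are assumptions for illustration and may differ from the actual branch.

```go
package main

import (
	"errors"
	"fmt"
)

// Replication and Compression mirror the nested config sections.
type Replication struct {
	Class             string `mapstructure:"class"`
	ReplicationFactor int    `mapstructure:"replication_factor"`
}

type Compression struct {
	Algorithm string `mapstructure:"algorithm"`
}

// Config is a hypothetical layout of the exporter settings.
type Config struct {
	DSN         string      `mapstructure:"dsn"`
	Keyspace    string      `mapstructure:"keyspace"`
	TraceTable  string      `mapstructure:"trace_table"`
	Replication Replication `mapstructure:"replication"`
	Compression Compression `mapstructure:"compression"`
}

// defaultConfig returns the default values shown in the literal above.
func defaultConfig() Config {
	return Config{
		DSN:         "127.0.0.1",
		Keyspace:    "otel",
		TraceTable:  "otel_spans",
		Replication: Replication{Class: "SimpleStrategy", ReplicationFactor: 1},
		Compression: Compression{Algorithm: "LZ4Compressor"},
	}
}

// Validate performs a few basic sanity checks (assumed rules).
func (c Config) Validate() error {
	if c.DSN == "" {
		return errors.New("dsn must not be empty")
	}
	if c.Keyspace == "" || c.TraceTable == "" {
		return errors.New("keyspace and trace_table must not be empty")
	}
	if c.Replication.ReplicationFactor < 1 {
		return fmt.Errorf("replication_factor must be >= 1, got %d", c.Replication.ReplicationFactor)
	}
	return nil
}

func main() {
	cfg := defaultConfig()
	fmt.Println(cfg.Validate() == nil, cfg.Compression.Algorithm)
}
```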
atoulme commented 1 year ago

That’s great! Please look for a sponsor to land this. I cannot sponsor fwiw. Come to a SIG meeting if possible to present your work.

mx-psi commented 1 year ago

@atoulme now that you can, would you be interested in sponsoring this component?

emreyalvac commented 1 year ago

The exporter now also supports logs.

atoulme commented 1 year ago

I will sponsor.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

emreyalvac commented 1 year ago

Done. https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18515