open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[exporter/file] Add possibility to write telemetry in Parquet or Delta format #33807

Open marcinsiennicki95 opened 3 days ago

marcinsiennicki95 commented 3 days ago

Component(s)

exporter/file

Is your feature request related to a problem? Please describe.

Parquet Format: Parquet is a columnar storage file format optimized for big data processing frameworks. It provides efficient data compression and encoding schemes, enhancing performance and reducing storage costs. Telemetry data written in Parquet format is stored in columns, making it faster to read and query specific fields.
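To illustrate the columnar point above, here is a minimal sketch in plain Python (not Parquet itself, and not collector code). The record fields and the `to_columns` helper are hypothetical and chosen only for illustration; they are not the OTLP schema. It shows that once records are pivoted into columns, a query on one field touches only that column.

```python
from collections import defaultdict

# Hypothetical log records; field names are illustrative, not the OTLP schema.
rows = [
    {"timestamp": 1, "severity": "INFO", "body": "started"},
    {"timestamp": 2, "severity": "ERROR", "body": "disk full"},
    {"timestamp": 3, "severity": "INFO", "body": "stopped"},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists.

    Assumes every record has the same keys; otherwise the columns
    would end up with different lengths.
    """
    columns = defaultdict(list)
    for record in records:
        for name, value in record.items():
            columns[name].append(value)
    return dict(columns)


columns = to_columns(rows)

# Row layout: each whole record is visited just to read one field.
errors_row = sum(1 for record in rows if record["severity"] == "ERROR")

# Column layout: only the "severity" column is read.
errors_col = columns["severity"].count("ERROR")
```

A real columnar format such as Parquet additionally compresses and encodes each column on disk, which this in-memory sketch does not attempt to model.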

Delta Format: Delta Lake is an open-source storage layer that brings ACID transactions to big data workloads. The Delta format brings the reliability and performance of data warehouses to data lake storage. Writing telemetry data in Delta format allows for scalable and reliable data processing, supporting complex data pipelines and real-time analytics.

Describe the solution you'd like

The file exporter should be able to write telemetry in Parquet or Delta format.

Describe alternatives you've considered

No response

Additional context

No response

github-actions[bot] commented 3 days ago

Pinging code owners:

marcinsiennicki95 commented 3 days ago

@jmacd Is this possible with the current state of otel-arrow? I found the following in its documentation:

https://github.com/open-telemetry/otel-arrow

  1. Output OpenTelemetry data to the Parquet file format, part of the Apache Arrow ecosystem