paradedb / paradedb

Postgres for Search and Analytics
https://paradedb.com
GNU Affero General Public License v3.0

Add support for ingesting OTEL data in ParadeDB #1632

Open philippemnoel opened 2 months ago

philippemnoel commented 2 months ago

What feature are you requesting?

As users start ingesting logs, they will need an OpenTelemetry sink for ParadeDB. Most tools such as OTEL, Vector, and FluentBit don't offer Postgres as a sink, since Postgres was previously not used for this type of workload. The community might help us here, so I am opening this as a tracking issue. This work will need to be done outside of the ParadeDB repository.

Why are you requesting this feature?

Add support for ingesting OpenTelemetry data in ParadeDB.

What is your proposed implementation for this feature?

Add an integration in @open-telemetry, Vector, FluentBit, etc.

Full Name:

Philippe Noël

Affiliation:

ParadeDB

destrex271 commented 2 months ago

I'd like to pick this up

philippemnoel commented 2 months ago

> I'd like to pick this up

This is a big ticket. Perhaps we can start with a specific integration? A ParadeDB integration for the OpenTelemetry Collector would be a great place to start.

destrex271 commented 2 months ago

Sounds good! We can start out with otel, I'll raise a separate ticket referencing this one.

> This work will need to be done outside of the ParadeDB repository.

Can we create a separate repo within the paradedb org for this?

philippemnoel commented 2 months ago

> Sounds good! We can start out with otel, I'll raise a separate ticket referencing this one.
>
> > This work will need to be done outside of the ParadeDB repository.
>
> Can we create a separate repo within the paradedb org for this?

We can, but I think this work would need to be done within the opentelemetry repository, no?

EDIT: Never mind, doesn't seem so. Okay.

philippemnoel commented 2 months ago

Here you go: https://github.com/paradedb/paradedb-otel

philippemnoel commented 2 months ago

Actually, there seems to be a project for a Postgres receiver already: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/postgresqlreceiver/README.md

This is what we should use

destrex271 commented 2 months ago

> Actually, there seems to be a project for a Postgres receiver already: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/postgresqlreceiver/README.md
>
> This is what we should use

Cool, I'll get started with using it and try to set it up for paradedb. (Maybe we won't need a new repo for this? Let's see!)

philippemnoel commented 2 months ago

> > Actually, there seems to be a project for a Postgres receiver already: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/postgresqlreceiver/README.md
> >
> > This is what we should use
>
> Cool, I'll get started with using it and try to set it up for paradedb. (Maybe we won't need a new repo for this? Let's see!)

Yeah, we may not! If we don't, perhaps documentation alone is enough here :) Or maybe we need to add something to our Dockerfile?

Let us know what you find in this issue thread and thank you 🙏 for your help here!

philippemnoel commented 2 months ago

Here is a tracking issue I found for a Postgres sink in Vector: https://github.com/vectordotdev/vector/issues/15765 so that also seems promising

philippemnoel commented 2 months ago

FluentBit reference: https://docs.fluentbit.io/manual/pipeline/outputs/postgresql

destrex271 commented 2 months ago

I think the OpenTelemetry receiver is for monitoring Postgres instances. On the other hand, the Fluent Bit implementation is for using Postgres as a sink.

Correct me if I am wrong but we are aiming to use ParadeDB as a sink to ingest metrics, right?

philippemnoel commented 2 months ago

> I think the OpenTelemetry receiver is for monitoring Postgres instances. On the other hand, the Fluent Bit implementation is for using Postgres as a sink.
>
> Correct me if I am wrong but we are aiming to use ParadeDB as a sink to ingest metrics, right?

Yes, correct. You're completely right, I misunderstood the Postgres link I shared. The goal is for ParadeDB to be able to ingest data from OpenTelemetry. Perhaps we do need a new repo and an implementation from scratch.

We only need a Postgres sink. Any regular Postgres support should be fine for ParadeDB to be supported as well.
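To make "a Postgres sink" a bit more concrete, here is a rough sketch of the core step such a sink might perform: flattening an OTLP/JSON logs payload into rows for a parameterized `INSERT`. This is only an illustration, not a proposed implementation. The `otel_logs` table, its columns, and the function names are hypothetical, and a real exporter for the collector would be written against the collector's own APIs rather than in Python.

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical target table and columns, assumed for illustration only.
INSERT_SQL = (
    "INSERT INTO otel_logs (ts, severity, body, service_name, attributes) "
    "VALUES (%s, %s, %s, %s, %s)"
)


def attrs_to_dict(attributes):
    """Flatten an OTLP key/value attribute list into a plain dict."""
    out = {}
    for kv in attributes or []:
        value = kv.get("value", {})
        # An OTLP AnyValue carries one typed field (e.g. stringValue);
        # take whatever it holds and keep nested values as-is.
        out[kv["key"]] = next(iter(value.values()), None)
    return out


def otlp_logs_to_rows(payload):
    """Turn an OTLP/JSON logs payload into tuples matching INSERT_SQL."""
    rows = []
    for resource_logs in payload.get("resourceLogs", []):
        resource = resource_logs.get("resource", {})
        service = attrs_to_dict(resource.get("attributes")).get("service.name")
        for scope_logs in resource_logs.get("scopeLogs", []):
            for record in scope_logs.get("logRecords", []):
                # timeUnixNano is encoded as a string in OTLP/JSON.
                nanos = int(record.get("timeUnixNano", 0))
                ts = datetime.fromtimestamp(
                    nanos // 10**9, tz=timezone.utc
                ) + timedelta(microseconds=(nanos % 10**9) // 1000)
                rows.append((
                    ts,
                    record.get("severityText"),
                    record.get("body", {}).get("stringValue"),
                    service,
                    json.dumps(attrs_to_dict(record.get("attributes"))),
                ))
    return rows
```

With any regular Postgres driver, the resulting rows could then be written in one batch, for example with `cursor.executemany(INSERT_SQL, rows)` in psycopg. Nothing here is ParadeDB-specific, which is the point: plain Postgres support should be enough.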

destrex271 commented 2 months ago

Cool! Let's divide these tasks -

destrex271 commented 2 months ago

Hi!

I was going through OTEL exporters in the opentelemetry-collector-contrib, and I think it will be a good idea if we raise a PR for a ParadeDB/Postgres exporter here: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter

If this seems like a good approach, I'll start by raising an issue and a corresponding PR there, and then we can update #1695 accordingly.

philippemnoel commented 2 months ago

> Hi!
>
> I was going through OTEL exporters in the opentelemetry-collector-contrib, and I think it will be a good idea if we raise a PR for a ParadeDB/Postgres exporter here: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter
>
> If this seems like a good approach, I'll start by raising an issue and a corresponding PR there, and then we can update #1695 accordingly.

That sounds WONDERFUL. Thank you for doing this!