PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com

Create an `events` table to store unparsed events from Kafka #23450

Status: Closed (tiina303 closed 2 months ago)

tiina303 commented 3 months ago

Please avoid using `debug` as either a prefix or a suffix! 🏄

This table must store raw events from Kafka, without parsing the payload. Just keep it as it is.

To monitor Kafka consumption properly, the following additional metadata should be stored as well:

Lastly, the table needs a TTL of 14 days and a non-deduplicating engine, and it must not skip any malformed events.
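A minimal sketch of what these requirements could look like as ClickHouse DDL, generated from Python so the pieces are easy to tweak. Everything named here is an assumption and not taken from this issue: the broker address, topic, consumer group, the `kafka_` and `_mv` naming, the `RawBLOB` format (picked so the payload is not parsed), and the metadata columns, which are just some of the Kafka engine's virtual columns chosen for illustration.

```python
# Sketch only. Broker, topic, group, and column names are hypothetical;
# the issue itself specifies only: raw payload, 14-day TTL,
# no deduplication engine, and no skipping of malformed events.


def raw_events_ddl(table: str = "events", ttl_days: int = 14) -> list[str]:
    """Build ClickHouse DDL strings for storing unparsed Kafka messages."""
    # Kafka source table: one String column, whole message kept as-is.
    kafka_source = (
        f"CREATE TABLE kafka_{table} (payload String)\n"
        "ENGINE = Kafka\n"
        "SETTINGS kafka_broker_list = 'kafka:9092',\n"      # hypothetical broker
        "         kafka_topic_list = 'events_topic',\n"     # hypothetical topic
        f"         kafka_group_name = 'clickhouse_{table}',\n"
        "         kafka_format = 'RawBLOB',\n"              # assumed: no parsing
        "         kafka_skip_broken_messages = 0"           # never skip messages
    )
    # Storage table: plain MergeTree (no deduplication) with a TTL.
    storage = (
        f"CREATE TABLE {table} (\n"
        "    payload String,\n"
        "    kafka_topic LowCardinality(String),\n"
        "    kafka_partition UInt64,\n"
        "    kafka_offset UInt64,\n"
        "    inserted_at DateTime DEFAULT now()\n"
        ")\n"
        "ENGINE = MergeTree\n"
        "ORDER BY (kafka_topic, kafka_partition, kafka_offset)\n"
        f"TTL inserted_at + INTERVAL {ttl_days} DAY"
    )
    # Materialized view: copies each message plus Kafka virtual columns.
    view = (
        f"CREATE MATERIALIZED VIEW {table}_mv TO {table} AS\n"
        "SELECT payload,\n"
        "       _topic AS kafka_topic,\n"
        "       _partition AS kafka_partition,\n"
        "       _offset AS kafka_offset\n"
        f"FROM kafka_{table}"
    )
    return [kafka_source, storage, view]


if __name__ == "__main__":
    print(";\n\n".join(raw_events_ddl()) + ";")
```

The TTL is keyed on an `inserted_at` column with a default of `now()` rather than on the Kafka message timestamp, since the `_timestamp` virtual column is nullable; that is a design choice of this sketch, not something the issue prescribes.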

Optionally, we can enable the stream error mode for Kafka tables (`kafka_handle_error_mode = 'stream'`) and store the `_error` and `_raw_message` virtual columns as well.
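A rough sketch of the optional error-stream setup, again as Python that emits ClickHouse DDL. With `kafka_handle_error_mode = 'stream'`, the Kafka engine exposes the parse error and the original message through the `_error` and `_raw_message` virtual columns. The table names below and the choice to route errors into a separate table are assumptions for illustration; the columns could equally be added to the main table.

```python
# Sketch only. Table names are hypothetical; the setting and the
# _error / _raw_message virtual columns come from the ClickHouse Kafka engine.


def kafka_error_stream_ddl(
    kafka_table: str = "kafka_events",
    errors_table: str = "events_errors",
) -> list[str]:
    """Build the setting and DDL strings for capturing Kafka parse errors."""
    # Setting to add to the Kafka table's SETTINGS clause.
    setting = "kafka_handle_error_mode = 'stream'"
    # Storage for failed messages, kept next to the error text.
    errors_storage = (
        f"CREATE TABLE {errors_table} (\n"
        "    error String,\n"
        "    raw_message String,\n"
        "    inserted_at DateTime DEFAULT now()\n"
        ")\n"
        "ENGINE = MergeTree\n"
        "ORDER BY inserted_at"
    )
    # Only rows with a non-empty _error are messages that failed to parse.
    errors_view = (
        f"CREATE MATERIALIZED VIEW {errors_table}_mv TO {errors_table} AS\n"
        "SELECT _error AS error, _raw_message AS raw_message\n"
        f"FROM {kafka_table}\n"
        "WHERE length(_error) > 0"
    )
    return [setting, errors_storage, errors_view]


if __name__ == "__main__":
    for statement in kafka_error_stream_ddl():
        print(statement, end="\n\n")
```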

fuziontech commented 2 months ago

We need to research how to efficiently keep an S3 table in ClickHouse.

fuziontech commented 2 months ago

This has been resolved: the events are now being written to S3, and we have a table in Athena to query them.