PostHog / plugin-repository

Plugins for PostHog
MIT License

Plugin request: Postgres #11

Closed timgl closed 3 years ago

timgl commented 3 years ago

I want to dump all my people and events into a Postgres database so I can use Metabase to run queries.

mariusandra commented 3 years ago

Was this plugin requested by someone? What's the urgency? Will a posthog -> posthog export be good enough, like we have now with the replicator plugin?

The trouble is, if we expose pg for a plugin to use, we might be opening ourselves up for security issues. The connections will be made from our VPC, so they could theoretically find their way towards sensitive data.

Are there any ways we can get around this without seriously complicating the network setup? Is this something we should be worried about?

CC @macobo @fuziontech

timgl commented 3 years ago

@mariusandra Yeah, it was requested by someone using cloud who wants to do their own analysis in Metabase. Haven't heard from them in a while, so it's probably not top urgency (especially as our focus is shifting to self-hosted).

> Will a posthog -> posthog export be good enough, like we have now with the replicator plugin?

I think something that works out of the box would be nicer but this could work for now if anyone asks.

fuziontech commented 3 years ago

I would say the best way for them to load Postgres would be from an S3 dump. Having that gap would be more secure, and the setup would be less likely to get knocked over by volume.

As for Metabase - the best suggestion is for them to spin up a small Redshift cluster and wrap the data in S3 using that. Loading data into PG is just a bad pattern IMO.

yakkomajuri commented 3 years ago

OK, so I've now run into this barrier. I’m building a Redshift plugin and wanted to use pg to access it.

There's probably some way to get it done via HTTP for Redshift, but I'd assume not for an arbitrary Postgres instance. So it'd be great to be able to use the package.

Of course, as a general rule an S3 dump might be best (we could then also use COPY instead of INSERT), but I'd love to find ways to make plugins easy to use (i.e. you don't need another service just to export your data).
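
To make the INSERT route concrete, here is a minimal sketch of how a plugin might turn a batch of events into one parameterized multi-row INSERT. The `ExportedEvent` shape, the `buildBatchInsert` helper, the column names, and the `posthog_event` table are all hypothetical and are not taken from any existing plugin.

```typescript
// Hypothetical event shape, for illustration only; real plugin events carry more fields.
interface ExportedEvent {
    uuid: string
    event: string
    distinctId: string
    timestamp: string // ISO 8601 string
    properties: Record<string, unknown>
}

interface ParameterizedQuery {
    text: string
    values: unknown[]
}

// Build a single multi-row INSERT with numbered placeholders ($1, $2, ...),
// so a whole batch costs one round trip instead of one per event.
function buildBatchInsert(table: string, events: ExportedEvent[]): ParameterizedQuery {
    if (events.length === 0) {
        throw new Error('buildBatchInsert needs at least one event')
    }
    // The table name cannot be a bound parameter, so only allow plain identifiers.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
        throw new Error(`unsafe table name: ${table}`)
    }

    // Assumed column layout of the destination table.
    const columns = ['uuid', 'event', 'distinct_id', 'timestamp', 'properties']
    const values: unknown[] = []

    const rows = events.map((e) => {
        const row = [e.uuid, e.event, e.distinctId, e.timestamp, JSON.stringify(e.properties)]
        // Placeholders continue numbering from the values collected so far.
        const placeholders = row.map((_, i) => `$${values.length + i + 1}`)
        values.push(...row)
        return `(${placeholders.join(', ')})`
    })

    return {
        text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${rows.join(', ')}`,
        values,
    }
}
```

With the pg package, the result could be passed along as `client.query(text, values)`. For large volumes, the S3 dump plus COPY route mentioned above would likely still be the better fit.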

macobo commented 3 years ago

> The trouble is, if we expose pg for a plugin to use, we might be opening ourselves up for security issues. The connections will be made from our VPC, so they could theoretically find their way towards sensitive data.

This already applies to fetch - e.g. ClickHouse exposes an HTTP API which could be used for evil things. Exposing pg does not change that equation.
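
Since the concern covers fetch and pg alike, one possible guard (an assumption for illustration, not something this thread settled on) would be to refuse connections to internal address ranges before any outbound call is made. The sketch below checks IPv4 literals only; `isPrivateIPv4` is a hypothetical helper, and a real guard would also need to resolve hostnames and check the resolved addresses.

```typescript
// Returns true when `host` is an IPv4 literal in a private (RFC 1918),
// loopback, or link-local range. Hostnames and IPv6 addresses return false,
// so they would need separate handling (e.g. resolving DNS first).
function isPrivateIPv4(host: string): boolean {
    const parts = host.split('.')
    if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) {
        return false
    }
    const octets = parts.map(Number)
    if (octets.some((n) => n > 255)) {
        return false
    }
    const [a, b] = octets
    return (
        a === 10 || // 10.0.0.0/8
        a === 127 || // 127.0.0.0/8 loopback
        (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
        (a === 192 && b === 168) || // 192.168.0.0/16
        (a === 169 && b === 254) // 169.254.0.0/16 link-local
    )
}
```

The same check could sit in front of both a fetch wrapper and a pg connection, which fits the point that exposing pg does not change the equation.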

yakkomajuri commented 3 years ago

Done: https://github.com/PostHog/postgres-plugin

yakkomajuri commented 3 years ago

Ah, well, people aren't dumped yet. How to export people is something we've been discussing.