PostHog / posthog-go

Official PostHog Go library
MIT License

CloudQuery Source Plugin? #17

Open yevgenypats opened 1 year ago

yevgenypats commented 1 year ago

Hi Team, hopefully this is the right place to ask; if not, I'd appreciate it if you could direct me.

I'm the founder of cloudquery.io, a high-performance open source ELT framework.

Our users are interested in a PostHog plugin, but since we cannot maintain all the plugins ourselves, I was curious whether this would be an interesting collaboration: we would help implement an initial source plugin, and you would help maintain it.

This would give your users the ability to easily sync PostHog data to any of their data lakes, data warehouses, or databases using any of the growing list of CQ destination plugins.

Best, Yevgeny

lharries commented 1 year ago

Hi Yevgeny, thanks for the message.

Given the large number of events our users have, our architecture works better if we have a PostHog app that streams data to your endpoint rather than your plugins pulling data from the API (we have rate limits on the API for this reason).

Do you have a webhook endpoint that clients could send events to with a secret key? If so, it should be straightforward to fork this app and connect our systems: https://github.com/PostHog/posthog-patterns-app/blob/main/index.ts

For reference, we'll continue to build out more of our own exports too, so this wouldn't replace our existing apps like the Snowflake exporter but would instead give customers another option.

yevgenypats commented 1 year ago

Got it. That makes total sense, and I see now that CloudQuery is less of a fit as a destination plugin on its own; rather, you can just send the data to something like Kafka, and CloudQuery will take it from there to other destinations.

I took a deeper look at the PostHog plugin server, as I was curious how it works, and saw quite a lot of similarities with what we build (on the plugin server side, since CloudQuery is just an ELT framework and not an analytics framework). It's probably not something you are planning, but I'm throwing the idea out there in case it ever becomes interesting: you could actually replace the plugin server with CloudQuery. It would save you quite a lot of work :) and should be even more performant (in theory, as it's Go), with the same horizontal scaling capabilities and lots of destination plugins out of the box.

If this is ever interesting to explore, I'd love to help with that (e.g., help with the migration or prioritize sources or destinations that we don't support yet).

lharries commented 1 year ago

Will check out the repo, thanks for the suggestion :)

yevgenypats commented 1 year ago

Nice! If it's interesting, let me know and maybe I can come up with a small PoC.