PostHog / meta

This is a place to discuss non-product issues in public.
MIT License
18 stars 4 forks source link

HogQL Insights #130

Open thmsobrmlr opened 1 year ago

thmsobrmlr commented 1 year ago

HogQL Insights

Current state of the HogQL conversion for insights and moving from filters-based insights to query-based insights.

What are we doing and why?

We are rewriting all our insights in HogQL, instead of raw ClickHouse SQL, which allows us to implement performance improvements and feature toggles (e.g. PoE modes) on this intermediate layer. This also allows us to expose the query to the end user, so they debug issues themselves or adapt queries to less frequent use cases.

In addition to the changes on the SQL layer we also change the way we store the insight configuration. Currently we have a mixin-based filters format (flat key-value structure) that became hard to maintain and doesn't allow reusage of sub-parts. The new query format (nested json) should allow copy-pasting parts and nesting "sources" in other queries to allow re-using the results throughout PostHog.

High-level plan of remaining steps

  1. Remove all filters from the frontend and use the frontend-side filterToQueryNode function to convert all api responses to the new query format (when fetching) and the queryNodeToFilter function to convert them back to filters for saving/duplicating/etc.
  2. Toggle the feature flag query-based-insights-saving, that then sends insights back with query, instead of filters for saving/duplicating/etc.
  3. Optional: Use the backend side filter_to_query function in the insights serializer to return a query from the backend (just this endpoint, as this is only for testing the filter_to_query function works as expected).
  4. Optional: Any changes we want to make to the query schema e.g. camel casing key that are still snake cased.
  5. If everything works, migrate the insights backend side (iterate over them and use the backend side filter_to_query function to replace filters with query).
  6. Migrate other entities that have insights e.g. activity log or notebooks.
  7. Convert experiments to HogQL.
  8. Convert cohorts to HogQL.
  9. Cleanup.

Remove frontend side dependency on filters

We can get rid of filters frontend side first by using the backend side filter_to_query function to return only queries from insights (and any other places that might return filters) and adapting the frontend so that it only handles queries. For saving insights we can use the frontend side queryNodeToFilters function to send finally send filters to the backend.

After migrating to backend side filters

For insights

For the activity log

For notebooks

Experiments backend

Experiments use the PA code backend side to generate trends/funnel results. We should swap out the legacy implementation for the HogQL one there as well.

Finalize query schema

At some point we want to run a migration to replace filters with queries. After that migration it will be harder to make changes to the query schema, meaning we should clean up the schema as good as we can now.

Unfortunately this got complicated by the fact that notebooks already save insights as queries and not filters. Thus they need additional handling in https://github.com/PostHog/posthog/blob/master/frontend/src/scenes/notebooks/Notebook/migrations/migrate.ts and we need to come up with a way to clean up tech debt there. The queries are stored both in the notebook nodes and in the activity log from which the user can go back in time.

Some proposed changes to the current query schema:

Related bugs

Trends

Funnels

Retention

Cleanup

Make it flippin' amazing

thmsobrmlr commented 8 months ago

Consolidation 1.3.2024

Step 1: Make it work

https://github.com/PostHog/posthog/pull/17295 & https://github.com/PostHog/posthog/pull/17414 & https://github.com/PostHog/posthog/pull/17440

To have a scaffold for other queries and to enable discussion on implementation details, we want to have the lifecycle query ported over to HogQL from frontend to the backend and back i.e.

Step 2. Make it good

Step 3. Make it complete

image

MarconLP commented 7 months ago

related issue: https://posthoghelp.zendesk.com/agent/tickets/12480

MarconLP commented 7 months ago

related issue: https://posthoghelp.zendesk.com/agent/tickets/12609

thmsobrmlr commented 5 months ago

Consolidated post-HogQL-insights points on 2024-04-23 & 2024-06-05

Trends

Known issues with the HogQL implementation of trends:

Known issues with the HogQL implementation of funnels:

Correlation

Lifecycle

Paths