Open thmsobrmlr opened 1 year ago
https://github.com/PostHog/posthog/pull/17295 & https://github.com/PostHog/posthog/pull/17414 & https://github.com/PostHog/posthog/pull/17440
To have a scaffold for other queries and to enable discussion on implementation details, we want to have the lifecycle query ported over to HogQL, from the frontend to the backend and back, i.e.
InsightActorsQueryOptions node that works similarly to InsightActorsQuery, but just returns the different fields and their options for the frontend actors modal)
PersonsQuery the default view for persons pages. (Waiting for 2024 before rollout)
to_persons_query
related issue: https://posthoghelp.zendesk.com/agent/tickets/12480
related issue: https://posthoghelp.zendesk.com/agent/tickets/12609
Known issues with the HogQL implementation of trends:
Known issues with the HogQL implementation of funnels:
There is no supertype for types Bool, String because some of them are String/FixedString and some of them are not (see comparison data)
Cannot convert string true to type Float64 (see comparison data)
HogQL Insights
Current state of the HogQL conversion for insights and moving from filters-based insights to query-based insights.
What are we doing and why?
We are rewriting all our insights in HogQL instead of raw ClickHouse SQL, which allows us to implement performance improvements and feature toggles (e.g. PoE modes) on this intermediate layer. This also allows us to expose the query to the end user, so they can debug issues themselves or adapt queries to less frequent use cases.
In addition to the changes on the SQL layer, we are also changing the way we store the insight configuration. Currently we have a mixin-based filters format (flat key-value structure) that has become hard to maintain and doesn't allow reuse of sub-parts. The new query format (nested JSON) should allow copy-pasting parts and nesting "sources" in other queries, so that results can be reused throughout PostHog.
High-level plan of remaining steps
filters from the frontend and use the frontend-side filterToQueryNode function to convert all API responses to the new query format (when fetching) and the queryNodeToFilter function to convert them back to filters for saving/duplicating/etc.
query-based-insights-saving, that then sends insights back with query, instead of filters, for saving/duplicating/etc.
filter_to_query function in the insights serializer to return a query from the backend (just this endpoint, as this is only for testing that the filter_to_query function works as expected).
filter_to_query function to replace filters with query).
Remove frontend side dependency on filters
We can get rid of filters on the frontend side first by using the backend-side filter_to_query function to return only queries from insights (and any other places that might return filters) and adapting the frontend so that it only handles queries. For saving insights we can use the frontend-side queryNodeToFilters function to finally send filters to the backend.
filter_to_query function to return only queries, not filters https://github.com/PostHog/posthog/pull/21945
ActionFilter and entityFilterLogic based on series, not actions and events
frontend/src/scenes/insights/sharedUtils.ts e.g. isTrendsFilter or isFilterWithDisplay
filtersToQueryNode, queryNodeToFilters, etc. in as many places as possible
cleanFilter function
getQueryBasedInsightModel
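To make the round trip between the two formats concrete, here is a minimal sketch of converting a flat filters object into a nested query node and back. Everything in it is a simplified assumption for illustration: the types, the field selection and the function names sketchFiltersToQuery and sketchQueryToFilters are hypothetical and are not the real conversion helpers, which cover many more fields and insight types. Only the names series, dateRange, date_from and date_to are taken from this issue.

```typescript
// Hypothetical, heavily simplified shapes; NOT the real PostHog schema.
interface LegacyFilters {
    insight: 'TRENDS'
    events: { id: string; math?: string }[]
    date_from?: string
    date_to?: string
}

interface EventsNodeSketch {
    kind: 'EventsNode'
    event: string
    math?: string
}

interface TrendsQuerySketch {
    kind: 'TrendsQuery'
    series: EventsNodeSketch[]
    dateRange?: { date_from?: string; date_to?: string }
}

// Flat key-value filters -> nested query node (what the frontend would work with).
function sketchFiltersToQuery(filters: LegacyFilters): TrendsQuerySketch {
    const query: TrendsQuerySketch = {
        kind: 'TrendsQuery',
        series: filters.events.map((e) => ({
            kind: 'EventsNode' as const,
            event: e.id,
            ...(e.math ? { math: e.math } : {}),
        })),
    }
    // Only emit dateRange when at least one bound is set.
    if (filters.date_from || filters.date_to) {
        query.dateRange = {
            ...(filters.date_from ? { date_from: filters.date_from } : {}),
            ...(filters.date_to ? { date_to: filters.date_to } : {}),
        }
    }
    return query
}

// Nested query node -> flat filters, e.g. for saving while the backend still stores filters.
function sketchQueryToFilters(query: TrendsQuerySketch): LegacyFilters {
    return {
        insight: 'TRENDS',
        events: query.series.map((s) => ({
            id: s.event,
            ...(s.math ? { math: s.math } : {}),
        })),
        ...(query.dateRange?.date_from ? { date_from: query.dateRange.date_from } : {}),
        ...(query.dateRange?.date_to ? { date_to: query.dateRange.date_to } : {}),
    }
}

// Usage: fetch returns filters, the frontend only sees the query, saving converts back.
const filters: LegacyFilters = {
    insight: 'TRENDS',
    events: [{ id: '$pageview', math: 'dau' }],
    date_from: '-7d',
}
const query = sketchFiltersToQuery(filters)
const roundTripped = sketchQueryToFilters(query)
```

In this sketch the round trip is lossless for the fields it covers. Any field that one direction of a real conversion does not handle would be dropped silently, which is one reason to remove the frontend-side conversions step by step rather than all at once.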
After migrating to backend side filters
For insights
InsightModel in subscriptionsLogic.test.ts
InsightModel in funnelDataLogic.test.ts
InsightModel in insightVizDataLogic.test.ts
InsightModel in trendsDataLogic.test.ts
For the activity log
For notebooks
Experiments backend
Experiments use the PA code backend side to generate trends/funnel results. We should swap out the legacy implementation for the HogQL one there as well.
Finalize query schema
At some point we want to run a migration to replace filters with queries. After that migration it will be harder to make changes to the query schema, so we should clean up the schema as well as we can now.
Unfortunately this got complicated by the fact that notebooks already save insights as queries and not filters. Thus they need additional handling in https://github.com/PostHog/posthog/blob/master/frontend/src/scenes/notebooks/Notebook/migrations/migrate.ts and we need to come up with a way to clean up tech debt there. The queries are stored both in the notebook nodes and in the activity log from which the user can go back in time.
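One way to think about upgrading already-stored queries is a small set of pure migration steps applied to every node, including nested ones. The sketch below is only an illustration under stated assumptions: it is not the logic of the notebook migrate.ts file, it assumes nested queries live under a source key, and the oldKey/newKey rename is a made-up placeholder rather than a proposed schema change.

```typescript
// Hypothetical stored shape: any JSON object, possibly nesting another query under `source`.
type StoredNode = { [key: string]: unknown }

// A migration step upgrades a single node and must not mutate its input.
type MigrationStep = (node: StoredNode) => StoredNode

function isNode(value: unknown): value is StoredNode {
    return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Apply all steps to a node, then recurse into its nested `source`, if there is one.
function migrateStoredQuery(node: StoredNode, steps: MigrationStep[]): StoredNode {
    const upgraded = steps.reduce<StoredNode>((current, step) => step(current), node)
    const source = upgraded['source']
    return isNode(source) ? { ...upgraded, source: migrateStoredQuery(source, steps) } : upgraded
}

// Example step with made-up keys: rename `oldKey` to `newKey`.
const renameOldKey: MigrationStep = (node) => {
    if (!('oldKey' in node)) {
        return node
    }
    const { oldKey, ...rest } = node
    return { ...rest, newKey: oldKey }
}

// Usage: the same steps could be run over notebook nodes and activity log entries.
const stored: StoredNode = { kind: 'Outer', oldKey: 1, source: { kind: 'Inner', oldKey: 2 } }
const migrated = migrateStoredQuery(stored, [renameOldKey])
```

Keeping each step pure and idempotent means the same list can be replayed safely over data that has already been partially upgraded.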
Some proposed changes to the current query schema:
dateRange properties date_from and date_to
breakdownFilter properties
hidden_legend_indexes / hidden_legend_keys
Currently the legend items can be hidden by index or by key depending on the insight and a bug prevents the hidden entries from being saved. We should agree on a single way to hide the entries and fix saving them with the query.
aggregation_group_type_index
breakdown_histogram_bin_count in trends filter properties
Related bugs
Trends
Funnels
Retention
Cleanup
Make it flippin' amazing