Closed tiina303 closed 3 years ago
Hey @tiina303! 👋 This pull request seems to contain no description. Please add useful context, rationale, and/or any other information that will help make sense of this change now and in the distant Mars-based future.
I rebased this on master, but the tests seem to have been ran (after a retry also on the older branch location).
The original intent was to greatly reduce the amount of data stored, since posthog-js sends... or used to send... a lot of user properties with every request. What will the impact of storage/cost be when enabling this?
Other than that question and/or if this has been considered LGTM.
I asked @fuziontech & he said storage is cheap.
Edit: works fine now after I created a new separate PR which triggered the check here to be rebased properly
For the failing test I don't understand this: in the check I see
> 2020 | expect(await hub.db.fetchDistinctIdValues(person)).toEqual(['distinct_id1'])
| ^
2021 | expect(person.properties).toEqual({ a_prop: 'test-set' })
but the code on this branch is different https://github.com/PostHog/plugin-server/blob/1284a415784dcf602f20650a2611019ad92b6226/tests/shared/process-event.ts#L2020 (i.e. somehow it's using where the branch originally was)
Changes
Currently we store the user properties we changed only on the events in ClickHouse.
If we store all the properties that were sent (set) with the event we'll have a much easier time debugging if we messed up somehow as we'll have all the data to be able to recompute things.
I believe this will also make it easier for users to understand what's happening. We store events in order of timestamps in the events page, but we might ingest them out of order. Imagine two events set a certain value, but we ingested the older event later, we would only see the set on the newer event as that's what was ingested first and changed the value.
This change was originally added in https://github.com/PostHog/plugin-server/pull/558
Checklist