PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com
Other
20.56k stars 1.22k forks source link

Person profiles not being merged, using recent versions of the JS sdk #23690

Open robbie-c opened 1 month ago

robbie-c commented 1 month ago

Bug Description

There seems to be an issue with person profile creation or merging when using recent versions of the JS SDK, and so when identify() is called, the events before and after the call get associated with different person profiles.

This has sometimes been reported as customers having many users with an initial referring domain as their own site, or a total increase in the number of users.

We're experiencing this on our own site: https://posthog.slack.com/archives/C02E3BKC78F/p1719000026425859 It has been reported by some customers: https://posthoghelp.zendesk.com/agent/tickets/14940 Some github issues:

@zlwaterfield provided some an example of 2 person profiles that should just be 1 person: https://us.posthog.com/project/2/person/b6MEa8yz1jDuKhs6WIx8ITzn5H0OaGkDb0AHfgMeHZ4#activeTab=events https://us.posthog.com/project/2/person/sasivardhangaddam22%40gmail.com#activeTab=events

Tagging some other people who've been involved: @thmsobrmlr @pauldambra @tiina303 @zlwaterfield @raquelmsmith

robbie-c commented 1 month ago

@tiina303 do you know why those users weren't merged?

raquelmsmith commented 1 month ago

Fixing this should be priority please @tiina303

tiina303 commented 1 month ago

For our case this is because we try to merge the same anonymous distinct_id twice. This is not allowed, as the first time we merge it a person is created and the id cannot be used as anonymous id anymore as it now is tied to an identified person. This is blocked and has been for a long time to avoid lots users accidentally being merged together.

see https://metabase.prod-us.posthog.dev/question/860-why-is-person-merge-blocked This is an implementation error in the usage of posthog SDK - identify shouldn't be used this way.

the way we are using it (which ever call is done later gets blocked):

$identify(email, $anon_distinct_id = anon_id)
$identify(db_id, $anon_distinct_id = anon_id)

How this should be done (any order of processing these will work, always merging into the same ID):

$identify(email, $anon_distinct_id = anon_id)
alias(email, db_id)

OR

$identify(db_id, $anon_distinct_id = anon_id)
alias(db_id, email)
tiina303 commented 1 month ago

For the ZD ticket I don't see how these are related, there just isn't any merge events at all between the first and second person https://metabase.prod-eu.posthog.dev/question/292-why-users-didnt-get-merged

robbie-c commented 1 month ago

I believe https://github.com/PostHog/posthog-js/pull/1314 should fix https://github.com/PostHog/posthog-js/issues/1159

This is good news, as it means that this is not a blocker to the personless events launch. Of the 3 issues raised, 1 is a posthog-js bug with a fix, and 2 are likely problems with SDK usage (though let's fix our one).

raquelmsmith commented 1 month ago

Cool @robbie-c are you going to fix the SDK implementation issue for us?