Closed dylan-kinsa closed 3 years ago
After more tests, when using a CSV as the input, it looks like it treats the first record as the duplicate to be removed. So in this configuration, id-2 would have been kept:
$distinct_id,$properties.$name,$properties.$email,$properties.$last_seen
id-1,undefined,undefined,2021-01-13T04:28:34
id-2,undefined,undefined,2021-01-13T04:28:48
This should probably be documented for the profiles param.
@dylan-kinsa I was not able to reproduce this behavior. I believe you were able to work through this issue with my colleague Sam on the support team, so I'm going to go ahead and close this, but feel free to re-open with a more detailed description of how exactly to reproduce if you continue to experience this problem.
Expected Behavior
Per this docstring, it is expected that deduplicate_people will keep the most recently seen record when deduplicating and delete the others.
Actual Behavior
The opposite behavior occurs: the record with the oldest $last_seen is preserved and the most recent is deleted. Using the following CSV data:
Note that the line item with $distinct_id: id-2 should have been preserved, but its properties were merged into id-1 and it was deleted.
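To make the expected behavior concrete, here is a minimal sketch of the selection rule described above, applied to the two CSV rows quoted in this thread (assuming they are the same data the report refers to). It does not call the library and is not its implementation; split_keep_and_delete is a hypothetical helper written only for illustration, and it assumes $last_seen is an ISO 8601 timestamp.

```python
import csv
import io
from datetime import datetime

# The two rows quoted in this thread, used here as sample input.
CSV_DATA = """\
$distinct_id,$properties.$name,$properties.$email,$properties.$last_seen
id-1,undefined,undefined,2021-01-13T04:28:34
id-2,undefined,undefined,2021-01-13T04:28:48
"""


def split_keep_and_delete(rows):
    """Hypothetical helper: keep the row with the most recent $last_seen.

    Returns (row_to_keep, rows_to_delete). The choice depends only on the
    timestamp, not on the order of the rows in the input.
    """
    def last_seen(row):
        return datetime.fromisoformat(row["$properties.$last_seen"])

    keep = max(rows, key=last_seen)
    delete = [row for row in rows if row is not keep]
    return keep, delete


rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
keep, delete = split_keep_and_delete(rows)
print(keep["$distinct_id"])                   # id-2
print([row["$distinct_id"] for row in delete])  # ['id-1']
```

Reversing the row order gives the same result under this rule, which is the property the report says is missing when the input comes from a CSV.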