tiki-archive / pool-kgraph

TIKI's Anonymous Contextual Knowledge Graph
MIT License

Determine unique occurrence ID pattern for emails #12

Closed: sfaria27 closed this issue 2 years ago

sfaria27 commented 2 years ago

Determine an identification pattern for unique occurrences of emails when adding vertices to the kgraph.

Acceptance criteria:

  1. When multiple users receive the same email, it should show up as a single vertex.
  2. When a single user receives an email, it should show up as a single vertex.
  3. When multiple users each receive a different email, the emails should show up as multiple vertices.

mike-audi commented 2 years ago

~Use the following parameters for unique occurrence detection:~

  - ~From (sender email)~
  - ~Received Date (use 72 hour window)~
  - ~Sanitized Subject (redact PII)~

~Note: allow the app to submit a flag to force uniqueness for things like emails with an order#~

mike-audi commented 2 years ago

~We can generate a new unique occurrence id (OID) for each email by hashing the message id with the receiver email address~

mike-audi commented 2 years ago

We can compute the occurrence id (OID) for an email by:

sha256(sender_email, sanitized_subject, date_range_number)

where date_range_number is the time since the Unix epoch in seconds, divided by the block size and rounded down.

For example:

current time (s): 1645827065
block size (s): 259200 [3 days]

date range number: floor(1645827065 / 259200) = 6349
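
A minimal Python sketch of the OID computation described above. The thread does not say how the three inputs are combined before hashing, so the JSON serialization, the sender normalization, and the function names `date_range_number` and `occurrence_id` are all assumptions made for illustration, not the project's actual implementation.

```python
import hashlib
import json

BLOCK_SIZE_S = 259200  # 3 days, the block size used in the example above


def date_range_number(timestamp_s: int, block_size_s: int = BLOCK_SIZE_S) -> int:
    """Seconds since the Unix epoch divided by the block size, rounded down."""
    return timestamp_s // block_size_s


def occurrence_id(sender_email: str, sanitized_subject: str, timestamp_s: int) -> str:
    """Hex SHA-256 digest over (sender_email, sanitized_subject, date_range_number).

    Assumption: the fields are serialized as a JSON array so that field
    boundaries stay unambiguous; the thread does not specify an encoding.
    Assumption: the sender address is trimmed and lowercased first.
    """
    fields = [
        sender_email.strip().lower(),
        sanitized_subject,
        date_range_number(timestamp_s),
    ]
    payload = json.dumps(fields).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Two deliveries of the same email, one minute apart, share one OID.
# The sender and subject here are made-up sample values.
first = occurrence_id("news@example.com", "Weekly digest", 1645827065)
second = occurrence_id("news@example.com", "Weekly digest", 1645827065 + 60)
print(date_range_number(1645827065), first == second)
```

Because the receiver is not part of the hash, the same email delivered to several users maps to one vertex, while emails that differ in sender or sanitized subject map to separate vertices. One caveat of fixed blocks: two deliveries of the same email that fall on either side of a block boundary get different OIDs, even if they arrive seconds apart.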