promotedai / ios-metrics-sdk

iOS client library for Promoted.ai metrics tracking.
MIT License
7 stars 1 forks source link

Impressions and Actions should omit ancestor IDs if not set #118

Closed yunapotamus closed 3 years ago

yunapotamus commented 3 years ago

Background

If the client does not set up a given ancestor ID in MetricsLogger, then we should not emit that ancestor ID in events going forward.

This is a slight departure from a previous requirement where certain RPCs required the initial set of pending ancestor IDs, prior to the events for those IDs being logged.

Motivation is a case where we are not tracking users/sessions or views. We should not log those IDs in Impression events.

Proposal

Change the behavior of pending ancestor ID generation as follows:

  1. If we log a View, Action, or Impression event prior to any ancestor ID being set, then the logged event omits any unset ancestor IDs.
  2. Once we set any ancestor ID, then all events logged from that point on use that ancestor ID until changed.

Interaction with existing pending ancestor ID/getSessionInfo() functionality:

  1. When logging starts up, we generate initial values for logUserID, sessionID, and viewID as before. These values will be available in memory.
  2. If the client (not the Promoted library itself) reads any of these ancestor IDs from the public fields in MetricsLogger, say logger.logUserID, then we return the aforementioned initial value. The allows the getSessionInfo() call in react-native-metrics to behave as before.

Here's a flowchart indicating which value is returned for an ancestor ID. Ancestor ID values The "Who is reading?" distinction is the most important.

Examples

Suppose we call logger.logImpression(content) without calling any other methods to set up ancestor IDs first. The expected logged batch event is:

{
  user_info: {
  },
  impression: {
    "timing": {
      "client_log_timestamp": 123000
    },
    "impression_id": "impression-id",
    "content_id": "content-id"
  }
}

The above example differs from previous behavior in that it does not include pending values for userInfo.logUserID, impression.sessionID, or impression.viewID.

Suppose that now we set logger.logUserID = "foobar" and logger.sessionID = "batman". Then we call logger.logImpression(content) again. The expected logged batch event is:

{
  user_info: {
    log_user_id: "foobar"  // Gets populated
  },
  impression: {
    "timing": {
      "client_log_timestamp": 124000
    },
    "impression_id": "impression-id",
    "content_id": "content-id",
    "session_id": "batman"  // Gets populated
  }
}

Now suppose we call logger.logView(viewController: robinViewController). Then we call logger.logImpression(content) again. The expected logged batch event is:

{
  user_info: {
    log_user_id: "foobar"
  },
  impression: {
    "timing": {
      "client_log_timestamp": 125000
    },
    "impression_id": "impression-id",
    "content_id": "content-id",
    "session_id": "batman",
    "view_id": "robin"  // Gets populated
  }
}
yunapotamus commented 3 years ago

@danbosnichill @jinyius @dtdennis28

jinyius commented 3 years ago

i guess i'd like to strongly point out/reference #124 such that this issue calls out that it sets at least one ancestor id for any non-user/session event. we should never be in a state where there's no ancestor context. the luid (ours or the platforms) or uid (theirs) should be there on all events as an invariant. without being able to at least map to a unifying user concept, there's lots of issues for event flattening.

yunapotamus commented 3 years ago

i guess i'd like to strongly point out/reference #124 such that this issue calls out that it sets at least one ancestor id for any non-user/session event. we should never be in a state where there's no ancestor context. the luid (ours or the platforms) or uid (theirs) should be there on all events as an invariant. without being able to at least map to a unifying user concept, there's lots of issues for event flattening.

Do you mean that if no ancestor IDs are specified when you try to log an event, and there's no insertionID, then it should be an error in the logging library? ie:

func logImpression(insertionID: String?) throws {
  if ancestorLogUserID == nil && ancestorSessionID == nil &&
     ancestorViewID == nil && insertionID == nil {
    throw Error("Need at least one ancestor ID")
  }
  // else log the event
}

// Same check at start of logAction
jinyius commented 3 years ago

tl;dr: net client sdk requirement:

non-user/session related log events should always populate some ancestor id. worst-case, it should be user/session related (either luid, uid, session) that is consistent with other log event sources. if the client app/sdk is not the source for the ancestor event and the client app does not provide the sdk with the ancestor id value, then the client sdk should omit the parent it. it should be an error if we try to log these non-user/session log events without any ancestor id set.

i guess i'd like to strongly point out/reference #124 such that this issue calls out that it sets at least one ancestor id for any non-user/session event. we should never be in a state where there's no ancestor context. the luid (ours or the platforms) or uid (theirs) should be there on all events as an invariant. without being able to at least map to a unifying user concept, there's lots of issues for event flattening.

yunapotamus commented 3 years ago

I think we're in agreement on what to do in the case where we don't have ancestor IDs. Could you look at the pseudocode in my comment above and see if it handles this case?

Do you prefer if the library doesn't log an event at all if missing ancestor IDs, or that we log a (potentially useless) event that is missing ancestor IDs?

jinyius commented 3 years ago

for the currently impacted platform, remove session id b/c it's not tied to any other event logging sources afaiu. generally applies as well since dan may want to move the session concept entirely out of the platform context and strictly in our backend.

i'll leave the question of what to do w/ these orphan events to @danbosnichill. imo, it should be handled in event api to dump into an orphan kafka topic so we can at least report on this as an issue via manager later.

yunapotamus commented 3 years ago

Gotcha. So the updated proposal is:

func logImpression(insertionID: String?) {
  if ancestorLogUserID == nil && ancestorViewID == nil && insertionID == nil {
    warning("Need at least one ancestor ID")
  }
  // Log the event
}

// Same check at start of logAction

Can change the warning to an error (that doesn't log the event) per your preferences.

I'll open a new issue/PR for this behavior. Thanks!

yunapotamus commented 3 years ago

Opened #126

jinyius commented 3 years ago
func logImpression(insertionID: String?) {
  if ancestorLogUserID == nil && ancestorViewID == nil && insertionID == nil {
    warning("Need at least one ancestor ID")
  }
  // Log the event
}

// Same check at start of logAction

@danbosnichill, are you ok omitting userId from this check? if we include it, we'll have to prioritize doing uid based joining to get a consistent luid (can also extend to session and client hints) if we accept uid as a join key. basically, the inferred user stuff at the limit. i'm fine omitting for now to avoid signing us up to prioritize this work.

yunapotamus commented 3 years ago

are you ok omitting userId from this check?

I moved this discussion to #126. :-)