promotedai / ios-metrics-sdk

iOS client library for Promoted.ai metrics tracking.
MIT License
7 stars 1 forks source link

Indicate when metrics library receives a null/empty ID #149

Closed yunapotamus closed 2 years ago

yunapotamus commented 3 years ago

We want to figure out where we are dropping IDs. Proposal is to create an enum to indicate the state of the ID when logged on the client.

// Indicates the value of the ID when logged from metrics client library.
enum ClientProvenance {
  UNKNOWN = 0;
  // Set to null by client code.
  NULL;
  // Set to the empty string by client code.
  EMPTY;
  // Not set by client code, and also not autogenerated by metrics library.
  UNSET;
  // Automatically generated by metrics library.
  AUTOGENERATED;
  // Set to a specific value by client code.
  CLIENT_SPECIFIED;
}

Then in Event protos:

message Impression {
  string impression_id = a;
  ClientProvenance impression_id_provenance = x;
  string content_id = b;
  ClientProvenance content_id_provenance = y;
  string insertion_id = c;
  ClientProvenance insertion_id_provenance = z;
}

For example:

logView()
  => view_id_provenance = AUTOGENERATED  // Internal ID

logImpression(contentID: "Foo", insertionID: "Bar")
  => content_id_provenance = CLIENT_SPECIFIED,
     insertion_id_provenance = CLIENT_SPECIFIED,
     impression_id_provenance = AUTOGENERATED,
     view_id_provenance = AUTOGENERATED  // Propagated from view

logAction(contentID: nil, insertionID: "") 
  => content_id_provenance == NULL,
     insertion_id_provenance == EMPTY,
     impression_id_provenance == AUTOGENERATED,
     action_id_provenance == AUTOGENERATED,
     view_id_provenance = AUTOGENERATED

logAction(contentID: "Foo") 
  => content_id_provenance == CLIENT_SPECIFIED,
     insertion_id_provenance == UNSET,
     impression_id_provenance == AUTOGENERATED,
     action_id_provenance == AUTOGENERATED,
     view_id_provenance = AUTOGENERATED

This behavior would be controlled by a client config field, say diagnosticsIDsIncludeProvenance. The default behavior when this field is false is the previous behavior, which does not include provenance.

jinyius commented 3 years ago

super-nit: can we prefix these strings so we know it's not just some python special thing? :)

__client_null__ __client_empty__ __client_unset__

i need to think through a bit if these would be acceptable to define an inference scope or not for inf. ref.

yunapotamus commented 3 years ago

Updated proposal is to create an enum to indicate the state of the ID when logged on the client.

// Indicates the value of the ID when logged from metrics client library.
enum ClientProvenance {
  UNKNOWN = 0;
  // Set to null by client code.
  NULL;
  // Set to the empty string by client code.
  EMPTY;
  // Not set by client code, and also not autogenerated by metrics library.
  UNSET;
  // Automatically generated by metrics library.
  AUTOGENERATED;
  // Set to a specific value by client code.
  CLIENT_SPECIFIED;
}

message Impression {
  string impression_id = a;
  ClientProvenance impression_id_provenance = x;
  string content_id = b;
  ClientProvenance content_id_provenance = y;
  string insertion_id = c;
  ClientProvenance insertion_id_provenance = z;
}
andrewdyates commented 3 years ago

This looks great thank you

jinyius commented 3 years ago

null/empty seems not like the others. it's client specified to these special values. maybe CLIENT_NULL and CLIENT_EMPTY?

wrt unset, when do ids get autogenerated vs remain unset in sdks? and does autogenerated cover just sdks, or does it also include event api id population as well?

yunapotamus commented 3 years ago

Do you think this enum will be useful for later steps in the pipeline? It's intended for client library only (hence NULL instead of CLIENT_NULL since Client is in the enum name :-) ) but if it would be useful to others we can extend it.

yunapotamus commented 3 years ago

Autogenerated gets propagated to later values, I'll add an example.

ie. If logView autogenerates view_id = abcd-1234, and logImpression uses that autogenerated value view_id = abcd-1234, then Impression.view_id_provenance = AUTOGENERATED.

prm-dan commented 3 years ago

Overall, looks great.

Are you planning on using ClientProvenance for non-ID cases? Otherwise, can we call this IdProvenance? I'm okay with the word Provenance for this. It'd be great to have a shorter name but I couldn't think of a better one.

I think we'll use this mostly for debugging. I don't think we'll use these provenance in most parts of metrics and the consumers of metrics. I wouldn't be shocked if we eventually strip this out of our flat records.

yunapotamus commented 3 years ago

I wasn't planning on using ClientProvenance for anything else. I'm also fine with IdProvenance. I'll send out a schema PR soon.

Yeah, this seems like a debugging tool for me. I'll hook it up to the diagnostics settings for client config.