segmentio / analytics-android

The hassle-free way to add analytics to your Android app.
https://segment.com/docs/sources/mobile/android/
MIT License
377 stars 214 forks source link

Duplicate Events #63

Closed f2prateek closed 10 years ago

f2prateek commented 10 years ago

For sending events to the server, we have an integrations key in each payload that contains string-boolean pairs, each entry denoting whether the service is enabled for that integration.

Any integrations that are bundled, we'll have that key set to false so that we don't send via the server.

Unfortunately, we're re-using this integrations map to check whether we should also post the event to the bundled integration.

This means if we set mixpanel to false, it won't be sent to Mixpanel from the server, and it won't be sent to the bundled integration either. If we set it to true, then it will be sent by both the bundled integration and the server integration - causing duplicate events.

For the future, we'll probably want to use some sort of triplean (backed by an enum) variable, so users can decide whether something needs to be sent via the server integration, the bundled integration, or neither (both can probably be eliminated).

f2prateek commented 10 years ago

For now, I'll put up a PR which avoids changing the API significantly.

We'll have two integrations keys - one under the payload object, which is processed by the server. We'll fill this automatically with all our bundled integrations, so events don't get sent from the server too. The second one will go under the context object, this will be controlled by the options object that the user passes. We'll use it to enable disable bundled integrations, and this will get merged into the payload's integrations as well so the integration is turned off server side.

ianstormtaylor commented 10 years ago

cc @travisjeffery do we have this same issue on iOS? or did we solve it there somehow?

travisjeffery commented 10 years ago

the dictionary that you fetch from the server telling you which integrations are enabled is used to filter events coming in for whatever integrations are enabled. https://github.com/segmentio/analytics-ios/blob/master/Analytics/SEGAnalytics.m#L296-L299

and the dictionary sent to the server is a dictionary with every bundled integration set to false. https://github.com/segmentio/analytics-ios/blob/master/Analytics/Integrations/Segmentio/SEGSegmentioIntegration.m#L288-L296

this way you don't get duplicates.

the title of this issue is a bit confusing, because currently it seems like you don't have duplicates either. the issue you're describing is if they want to send events from the server from an sdk that has their integration bundled - isn't it? there's no reason why that'd be case though.

f2prateek commented 10 years ago

Currently we do have duplicates with a regular track call.

Earlier the Android SDK was sending a payload in this format like this:

{
  type: String,
  context: {
    integrations: {
      mixpanel: false // since this is bundled
    }
    ...
  },
  ...
  // but no integrations Object under the top level that is required by the server
}

So the server wouldn't look under the context's integrations map we sent, and just send events to all integrations, which would lead to duplicates on Mixpanel (one the SDK sends from the bundled integration - one from the server).

First I just changed the map under the payload root:

{
  type: String,
  ...
  integrations: {
      mixpanel: false // since this is bundled
  }
}

So now the server won't send those events to Mixpanel. But on the device itself, we use this top-level integration to map to see if the bundled integration was disabled. Once we see Mixpanel is false, we don't send it to the bundled integration, leading to no events being sent.

The changed payload looks like this

{
  type: String,
context: {
    integrations: {
      // anything intergration specified to be disabled by the user for this event
      // it will automatically be merged into the top level one, so the server doesn't send it either.
      // This will be used by the SDK to skip calls to any bundled integrations
      // e.g. Analytics.identify('212', new Options().setIntegration('xyz', false));
      xyz: false
    }
    ...
  },.
  integrations: {
      mixpanel: false, // since this is bundled
      xyz: false // merged from the user provided options, won't be sent from the server either
  },
  ...
}

So now the server can lookup the top level one to see what needs to be sent. But our SDK looks at the context level integration map, to only disable integrations explcitly set by the user (and not by us itself).

travisjeffery commented 10 years ago

ok. for the top-level integrations obj you don't need to do any merging though - given the list of integrations that are bundled that obj is static. because if the integration is bundled than the calls should go through the bundled sdk. that obj can have every bundled integration in there always set to false.

f2prateek commented 10 years ago

And what if the user wants to specifically not send an event to Mixpanel? Neither via the bundled integration or the server?

travisjeffery commented 10 years ago

integrations covers the server and context.integrations covers the bundle

f2prateek commented 10 years ago

Analytics.identify('212', new Options().setIntegration('gauges', false));

Since Gauges is not a bundled integration, we'll want to merge it in with the top level one so that the server doesn't send it to.

The top level one is a combination of all bundled integrations set to false, and whatever the user might have added programatically.

travisjeffery commented 10 years ago

i.e. disabling a non-bundled sdk.

f2prateek commented 10 years ago

I just assumed we supported that from the docs I read. @ivolo can let me know.

travisjeffery commented 10 years ago

ya https://github.com/segmentio/analytics-ios/blob/master/Analytics/Integrations/Segmentio/SEGSegmentioIntegration.m#L288-L296 supports that in the ios lib. so what i said before was confusing about that obj not changing. ya that is a real use case that should be supported.