pentestfail / TA-Mixpanel

Provides Splunk modular inputs & framework to ingest formatted or raw data from Mixpanel APIs.
MIT License

Add-on for Mixpanel

Add-on to support ingestion of Mixpanel data into Splunk.

App Setup:

  1. Prior to installing, retrieve your Mixpanel project's API Key and Secret from your Mixpanel account (take care to protect them as you would passwords).

  2. Once retrieved, return to the add-on's "Configuration" page and "Mixpanel Project" tab to add new Mixpanel Project credentials.

  3. Once successfully completed, you may add inputs via the add-on's "Inputs" page, clicking "Create New Input", then clicking an input type you wish to create.

Export API Input:

Accesses Mixpanel's data export API, which provides raw events for a specified date range at daily granularity (returning all events by day). This input initially pulls the previous day's events, runs at a daily interval, and maintains a checkpoint of its last successful export. Depending on your project's event volume, exporting larger time spans could cause Splunk to index a very large amount of data. Because this API is limited to daily granularity, running the input more than once (1) per day would create duplicate records.

This input requires the Mixpanel project timezone to correctly set input checkpoints. Future updates may expand the use of this configuration to simplify setup, etc.
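The interaction between the daily checkpoint and the project timezone can be sketched as follows. This is a minimal illustration, not the add-on's actual code: the function name `days_to_export` and its parameters are hypothetical, and it assumes the checkpoint is stored as the last fully exported project-local day.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo
from typing import List, Optional


def days_to_export(now_utc: datetime, project_tz: tzinfo,
                   last_exported: Optional[date]) -> List[date]:
    """Return the complete project-local days that still need exporting."""
    # "Yesterday" is evaluated in the project's timezone, not the server's.
    today_local = now_utc.astimezone(project_tz).date()
    yesterday = today_local - timedelta(days=1)
    # First run: only the previous day. Later runs: resume after the checkpoint.
    if last_exported is None:
        start = yesterday
    else:
        start = last_exported + timedelta(days=1)
    days = []
    while start <= yesterday:
        days.append(start)
        start += timedelta(days=1)
    return days
```

A second run on the same local day returns an empty list, which is why a daily schedule avoids duplicates. Any `tzinfo` works for `project_tz`, for example `zoneinfo.ZoneInfo("America/Los_Angeles")` on Python 3.9+.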

Live API Input (UNOFFICIAL):

Accesses Mixpanel's "Live" API, which the Mixpanel UI uses to display events as they arrive. This input leverages this unofficial and undocumented API to provide visibility at a higher frequency than the data export API, at the cost of accuracy. Per Mixpanel documentation, some clients (such as mobile apps) can significantly delay sending events to Mixpanel, and those events will likely not be displayed via this input. To prevent duplication of events, this input maintains a checkpoint of its last successful execution and discards events with timestamps dated prior to that checkpoint.
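The checkpoint filter described above might look roughly like this. It is a sketch under assumptions: because the Live API is undocumented, the event shape is unknown, so the `"time"` key (epoch seconds) and the function name `filter_new_events` are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def filter_new_events(events: Iterable[Dict[str, Any]], checkpoint: float,
                      time_key: str = "time") -> List[Dict[str, Any]]:
    """Keep only events whose timestamp is at or after the checkpoint."""
    # time_key is an assumed field name holding an epoch timestamp.
    return [event for event in events if event[time_key] >= checkpoint]
```

This also shows where the accuracy tradeoff comes from: an event that a client sends late still carries its original timestamp, so it falls before the checkpoint and is discarded.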

"People" Engage API Input:

Accesses Mixpanel's "People" API (officially documented as "Engage API") which provides summarized/deduplicated user information often required for correlating or attributing user activity. The data returned from this input may be indexed but is often more useful when put into a kvstore collection for performing lookups. This input does not index data by default but creates a kvstore collection based on the name given to the input.

KVStore name format:

TA_Mixpanel_People_{Input Name}

The fields returned by this API will vary based on your project's configuration but could include information such as: browser type and version, geolocation information, more detailed user metadata (name, email, phone, etc.), and more. Due to probable performance impacts from the high variability of fields and types, this input does not create kvstore or lookup field configurations dynamically.

By default, this input provides only the base field schema for the records returned; to include additional fields, modify the "KVStore Fields" JSON object ({} or python dictionary) in the input's configuration. You must configure each field's name and type as a valid JSON object for kvstore and lookup configurations to be auto generated. By default, this input removes dollar symbols ("$") from field names due to incompatibility with the Splunk kvstore. When referencing field names, DO NOT use the dollar symbols.
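Before saving a "KVStore Fields" value, it can help to check it locally. The following is a minimal sketch, not part of the add-on: the function name `parse_kvstore_fields` is hypothetical, and it only checks the rules stated above (a JSON object mapping field names without "$" to type names), not whether each type is accepted by Splunk.

```python
import json
from typing import Dict


def parse_kvstore_fields(raw: str) -> Dict[str, str]:
    """Parse a "KVStore Fields" value and reject obviously invalid entries."""
    fields = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(fields, dict):
        raise ValueError("KVStore Fields must be a JSON object")
    for name, field_type in fields.items():
        if "$" in name:
            raise ValueError(f"remove '$' from field name: {name!r}")
        if not isinstance(field_type, str):
            raise ValueError(f"type for {name!r} must be a string")
    return fields
```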

Example record from Mixpanel:

{"$distinct_id": 4,
 "$properties": {"$created": "2008-12-12T11:20:47",
                 "$email": "example@mixpanel.com",
                 "$first_name": "Example",
                 "$last_name": "Name",
                 "$last_seen": "2008-06-09T23:08:40"}}

Example "KVStore Fields" configuration:

{"distinct_id":"number","properties.created":"string","properties.email":"string","properties.first_name":"string","properties.last_name":"string","properties.last_seen":"string"}
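To see how the dotted names in this configuration relate to the example record, the sketch below strips the "$" symbols and joins nested keys with dots. It is illustrative only: `flatten_record` is a hypothetical helper, and the add-on may derive these names differently internally.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Strip "$" from keys and flatten nested objects into dotted names."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = prefix + key.replace("$", "")
        if isinstance(value, dict):
            # Recurse so {"$properties": {"$email": ...}} becomes "properties.email".
            flat.update(flatten_record(value, name + "."))
        else:
            flat[name] = value
    return flat
```

Applied to the example record from Mixpanel, the resulting keys are exactly the field names listed in the example configuration.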

Default "KVStore Fields" configuration (base schema):

{"distinct_id":"string","properties":"array"}

For detail on kvstore fields and types, see Splunk kvstore documentation.

Timezone Configuration:

For the Mixpanel Export API, the timezone is based on your Mixpanel project's configuration. Configure the props.conf "TZ" setting per Splunk's documentation to match your project's configuration. If ingesting multiple projects with different timezone configurations, you should configure additional sourcetypes or remove timestamp parsing from props.conf. See the Mixpanel API documentation for more information on timestamps and the Export API.

Release Notes:

v1.0.4 Initial release

Initial release with basic documentation.

Submit issues or requests via Github:

TA-Mixpanel Github Repo: https://github.com/pentestfail/TA-Mixpanel