backtracks / open-podcast-analytics

Open podcast analytics specification
Apache License 2.0

Open Podcast Analytics (OPA)

The Open Podcast Analytics specification is a unified spec that provides an open and common interface for sending podcast analytics related events. OPA is based on the Open Audio Analytics specification with extensions specifically for podcasting analytics. Apps, clients (e.g. podcast discovery and listening software), and serverside media hosting software transmit data in the format defined in this specification. Analytics providers with specific domain knowledge of audio and/or podcasting can then provide insights that are not available from data in a generic, non-domain-specific format.

How does it work?

The protocol defines a common data interchange format and a set of behaviors that allow a variety of mobile and desktop applications to emit/publish podcast analytics related events over the internet. Applications/clients may send one or more events in one call, which enables queuing and batching. The data sent in the protocol is not sensitive (sending it over an HTTPS connection is still recommended), yet it is sufficient for analytics services that conform to and consume the protocol to provide analytics based on it.

Example Data

[
  {
    "name":"media.download",
    "client":"Example App",
    "client_version":"1.0.0",
    "author":"Series Author or Series Title",
    "title":"Episode Title",
    "publisher": "Example Podcast Publisher",
    "publisher_url": "http://example.com",
    "playbackRate":1,
    "volume":1,
    "muted":false,
    "paused":false,
    "currentTime":0,
    "duration":3599.574513,
    "explicit":false,
    "user_id":"4300f781-0044-4d69-a8cc-4b0be0fcad0f",
    "src":"https://www.example.com/media/example.mp3",
    "media_ids": [
       {
          "id": "624fb990cdd94623b77f41fef0aa0e1d",
          "type": "uuid"
       },
       {
          "id": "Episode 25 of Example Podcast",
          "type": "guid"
       }
    ],
    "categories": [
       {
          "label":"Society & Culture",
          "label_encoded":true,
          "categories": [
             {
                "label":"History"
             }
          ]
       },
       {
          "label":"Music"
       }
    ],
    "tags": [
       "hot",
       "new",
       "usa"
    ],
    "series": {
          "label": "Example Podcast",
          "type": "series"
    },
    "season": {
          "label": "Season 1",
          "number": 1,
          "type": "season"
    },
    "episode": {
          "label": "Episode 25",
          "number": 25,
          "type": "episode"
    },
    "time":"2016-10-16T00:23:13.411Z"
  },
  {
    "name":"media.play",
    "client":"Example App",
    "client_version":"1.0.0",
    "author":"Series Author or Series Title",
    "title":"Episode Title",
    "publisher": "Example Podcast Publisher",
    "publisher_url": "http://example.com",
    "playbackRate":1,
    "volume":1,
    "muted":false,
    "paused":false,
    "currentTime":0,
    "user_id":"4300f781-0044-4d69-a8cc-4b0be0fcad0f",
    "duration":3599.574513,
    "time":"2016-10-16T00:23:13.411Z"
  }
]

Example Data (Using Payload Reduction / Memo Feature)

[
  {
    "name":"media.download",
    "playbackRate":1,
    "volume":1,
    "muted":false,
    "paused":false,
    "currentTime":0,
    "user_id":"4300f781-0044-4d69-a8cc-4b0be0fcad0f",
    "memo": "abcd12",
    "time":"2016-10-16T00:23:13.411Z"
  },
  {
    "name":"media.play",
    "playbackRate":1,
    "volume":1,
    "muted":false,
    "paused":false,
    "currentTime":0,
    "user_id":"4300f781-0044-4d69-a8cc-4b0be0fcad0f",
    "memo": "efgh33",
    "time":"2016-10-16T00:23:13.411Z"
  }
]

Design Considerations

The protocol/specification is designed to be used efficiently in both clientside and serverside scenarios and to leverage pervasive, well-known technologies and standards. Some of the efficiency comes from aggregate network traffic at scale, a limited code footprint, and familiarity with concepts originating in specs like HTML, HTML5, etc. Because the protocol/specification allows queuing and batching of events, systems can react to a variety of scenarios, including intermittent internet connections.
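
One way to picture queuing and batching is the minimal sketch below. The `EventQueue` class, its method names, and the `send` callback are hypothetical and not part of the specification; the sketch only assumes that a client keeps events in memory and retries a batch later when a send fails (for example, while offline).

```python
import json
from collections import deque


class EventQueue:
    """Hypothetical in-memory queue that holds events until they are sent."""

    def __init__(self, max_batch_size=20):
        self.max_batch_size = max_batch_size
        self._pending = deque()

    def add(self, event):
        """Queue one event dictionary for a later send."""
        self._pending.append(event)

    def next_batch(self):
        """Return up to max_batch_size queued events without removing them."""
        return list(self._pending)[: self.max_batch_size]

    def flush(self, send):
        """Send batches until the queue is empty or a send fails.

        `send` is a caller-supplied function that receives the batch as a
        JSON array string and returns True on success. A failed batch stays
        in the queue so it can be retried on the next flush.
        Returns the number of events that were sent.
        """
        sent = 0
        while self._pending:
            batch = self.next_batch()
            if not send(json.dumps(batch)):
                break
            for _ in batch:
                self._pending.popleft()
            sent += len(batch)
        return sent
```

A client might call `flush` on a timer or when connectivity returns; events added while offline simply accumulate until then.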

Casing of property names is also something that was taken into account. Properties like currentTime are in lower camel case since their origin is likely from a language or variable that is already in lower camel case such as HTML5 Media/Audio resulting in less "variable name/value translation overhead." For custom variables snake case may be more appropriate. The structure of the data payload for core scenarios is also minimally nested by design. The protocol uses existing platform agnostic standards like ISO 8601 formatted dates vs. Unix/Epoch Time when appropriate and property names and values like playbackRate mirror HTML5 Audio/Media properties.

Protocol Event Object Properties

Top Level Properties

| Property Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | string(255) | Yes | Name of the event; generally takes the form major_topic.sub_topic |
| nonce | string(255) | No | Check value on uniqueness of an event. When specified, the value shall be an arbitrary string and functions as a nonce. The property is also useful in retry scenarios. |
| time | date/string | Yes | An ISO 8601 compatible date/time stamp in UTC of the time the event occurred |
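
As an illustration of the three core properties, the sketch below builds an event with a UTC ISO 8601 timestamp and a random nonce. The helper names `utc_timestamp` and `make_event` are hypothetical, and using a UUID as the nonce is only one possible choice, since the specification allows any arbitrary string.

```python
import uuid
from datetime import datetime, timezone


def utc_timestamp(moment=None):
    """Format a timezone-aware datetime as ISO 8601 in UTC with a trailing Z."""
    moment = moment or datetime.now(timezone.utc)
    text = moment.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def make_event(name, nonce=None, **properties):
    """Build an event dictionary with the core name, nonce, and time properties.

    Extra keyword arguments (e.g. currentTime, duration) are copied as-is.
    """
    event = dict(properties)
    event["name"] = name
    # Assumption: a random UUID hex string is an acceptable arbitrary nonce.
    event["nonce"] = nonce if nonce is not None else uuid.uuid4().hex
    event["time"] = utc_timestamp()
    return event
```

When retrying a failed submission, a client would presumably resend the stored event unchanged, so the nonce stays the same and the receiver can detect the duplicate.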

Top Level Extended Properties

In the table below, the Mirrors HTML5 column indicates whether the property name and value mirror an HTML5 Media and/or Web Audio API property of the same name. Custom properties are allowed. Some properties have a default value that applies when they are omitted and/or not applicable in a particular event scenario (this can also reduce payload size, since a property does not need to be included in the payload when its default value should be used).

| Property Name | Type | Required | Default (If Omitted) | Description | Mirrors HTML5 |
| --- | --- | --- | --- | --- | --- |
| author | string(255) | No | N/A | Name of the author or artist of the media/work | No |
| client | string(255) | No | N/A | Unique identifier or name of the client or media player performing the action related to the request | No |
| currentTime | number/double | Yes | | Current time of the client/user in media playback in seconds. | Yes |
| duration | number/double | Yes | | Length of the media in seconds. | Yes |
| explicit | boolean | No | False | If true, the media contains explicit content. | Yes |
| loop | boolean | No | False | Indicates whether the media is set to loop at the end of playback. | Yes |
| media_ids | array(Media Id Type) | No | | Array of Media Id Type. See Media Id Type for a type definition. | No |
| muted | boolean | Yes | False | Indicates whether the media is muted. For example, the usual volume setting of the media may be at 1 (100%), however the client has the media muted. | Yes |
| networkState | integer/short | No | | An integer in the set of 0-3 that indicates the current network state. | Yes |
| paused | boolean | No | False | Indicates whether the media is paused. | Yes |
| playbackRate | number/double | No | 1 | A number like 1 or 1.5 that indicates the relative speed of playback of the media, where 1 = normal speed and values above or below 1 indicate a speed/rate change in relation to the normal value of 1. Zero (0) is not a valid value. | Yes |
| publisher | string(255) | No | | Name of the publisher of the media/work related to the request (this may be different than the author) | No |
| readyState | integer/short | No | | An integer in the set of 0-4 that indicates the current media readiness state for playback. | Yes |
| src | string(255) | No | | URL of the media | Yes |
| title | string(255) | No | src | Title of the media/work | No |
| user_id | string(255) | No | | Unique identifier for the user performing the action related to the request. This identifier is typically unique to an application or organization. | No |
| volume | number/double | No | 1 | A number between 0 and 1 (where 0 = 0% and 1 = 100%) that indicates the volume setting of the media. Example: .75 = 75% volume | Yes |
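
The defaults and value ranges in the table could be applied on the receiving side roughly as sketched below. The names `DEFAULTS`, `with_defaults`, and `range_problems` are hypothetical; only the default values and ranges themselves come from the table, and the `title` default of `src` is left out because how it should be resolved is not spelled out here.

```python
# Defaults copied from the "Default (If Omitted)" column of the table.
DEFAULTS = {
    "explicit": False,
    "loop": False,
    "muted": False,
    "paused": False,
    "playbackRate": 1,
    "volume": 1,
}


def with_defaults(event):
    """Return a copy of the event with omitted properties set to defaults."""
    merged = dict(DEFAULTS)
    merged.update(event)
    return merged


def range_problems(event):
    """Return a list of messages for values outside the documented ranges."""
    problems = []
    if event.get("playbackRate", 1) == 0:
        problems.append("playbackRate must not be 0")
    if not 0 <= event.get("volume", 1) <= 1:
        problems.append("volume must be between 0 and 1")
    if "networkState" in event and event["networkState"] not in range(0, 4):
        problems.append("networkState must be an integer from 0 to 3")
    if "readyState" in event and event["readyState"] not in range(0, 5):
        problems.append("readyState must be an integer from 0 to 4")
    return problems
```

Because omitted properties fall back to their defaults, a sender can leave out values such as `"volume": 1` to keep payloads small.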

Media Id Type

| Property Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | string(255) | Yes | Unique id of the media |
| type | string(255) | Yes | Type of media id, see Known Media Id Types below for known values. This is an open-ended property and not a restricted enum so the type value may be any valid string. |

Known Media Id Types

| Type Name | Description |
| --- | --- |
| barcode | Unique, generally commercially registered, identifier where the value can be a UPC, EAN, etc. (UPC and EAN are really just both versions of the same underlying barcode structure) |
| uuid | "Arbitrary" universally unique identifier. This is sometimes called a GUID or "Globally Unique Identifier". |
| isbn | International Standard Book Number |
| issn | International Standard Serial Number |
| isrc | International Standard Recording Code |
| iswc | International Standard Musical Work Code |
| org_uuid | Organization, application, or publisher identifier such as a release number or an internal id that is unique to an organization |

Event Submission Workflow Example

  1. Construct the JSON formatted data payload representing one or more events
  2. Convert the data payload to an array of bytes
  3. Convert the array of bytes to a Base 64 encoded string (with padding)
  4. Transmit the HTTP request as a GET or POST request
  5. Receive the HTTP response
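
Steps 1 through 3 above can be sketched in a few lines. The function name `encode_events` is hypothetical, and UTF-8 is an assumption here, since the steps only say "array of bytes" without naming a character encoding.

```python
import base64
import json


def encode_events(events):
    """Turn a list of event dictionaries into a padded Base 64 string."""
    # Step 1: construct the JSON payload (compact separators keep it small).
    payload = json.dumps(events, separators=(",", ":"))
    # Step 2: convert the payload to bytes (UTF-8 is assumed).
    raw = payload.encode("utf-8")
    # Step 3: Base 64 encode; b64encode always emits "=" padding when needed.
    return base64.b64encode(raw).decode("ascii")
```

The standard Base 64 alphabet includes `+`, `/`, and `=`, so the resulting string still needs URL encoding before it is placed in a querystring or form body.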

When sending a HTTP request, standard HTTP headers such as User-Agent should be included. User-Agent shall also be populated for requests originating from within mobile applications.

When sending a HTTP GET request, populate the querystring parameter d with the Base 64 encoded data. When sending a HTTP POST request, populate the body of the request with key-value pairs of parameters where the d parameter is the Base 64 encoded data, and set the Content-Type header of the HTTP request to application/x-www-form-urlencoded. When data is sent using a HTTP POST with the Content-Type of application/json, the data (e.g. value of the d parameter) shall not be Base 64 encoded and will be in the clear.
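
The two POST variants might be built as in the sketch below, which constructs the requests without sending them. The function names and the `ExampleApp/1.0.0` User-Agent are hypothetical. For the application/json case, the sketch assumes the request body is the JSON array itself; the text above only states that the data is not Base 64 encoded, so treat that as one reading rather than a confirmed rule.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request

USER_AGENT = "ExampleApp/1.0.0"  # hypothetical client identifier


def form_post(endpoint, events):
    """Build a form-encoded POST whose d field is the Base 64 payload."""
    d = base64.b64encode(json.dumps(events).encode("utf-8")).decode("ascii")
    body = urlencode({"d": d}).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "User-Agent": USER_AGENT,
    }
    return Request(endpoint, data=body, headers=headers, method="POST")


def json_post(endpoint, events):
    """Build a JSON POST that carries the events unencoded.

    Assumption: the body is the JSON array itself, not a d=... pair.
    """
    body = json.dumps(events).encode("utf-8")
    headers = {"Content-Type": "application/json", "User-Agent": USER_AGENT}
    return Request(endpoint, data=body, headers=headers, method="POST")
```

Either request object can be passed to `urllib.request.urlopen` to transmit it.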

If/when receiving a HTTP response, typical HTTP status codes apply: a status code in the 200 range indicates success, and a 400 or 500 series status code indicates a failure.
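
A client could map the status code of a response to an outcome as in this small sketch. The function name `classify_status` and the `"other"` label for codes outside the ranges named above are hypothetical.

```python
def classify_status(status_code):
    """Map an HTTP status code to "success", "failure", or "other"."""
    if 200 <= status_code <= 299:
        return "success"
    if 400 <= status_code <= 599:
        return "failure"
    # 1xx and 3xx codes are not covered by the rule above.
    return "other"
```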

Example Event Submission

Here is an example of an event submission using the protocol, shown as raw HTTP requests with their headers.

Actual Requests

GET /?d=W3sibmFtZSI6Im1lZGlhLnBsYXkiLCJhdXRob3IiOiJKb25hdGhhbiBHaWxsIiwidGl0bGUiOiJFeGFtcGxlIFRpdGxlIiwicGxheWJhY2tSYXRlIjoxLCJ2b2x1bWUiOjEsIm11dGVkIjpmYWxzZSwicGF1c2VkIjpmYWxzZSwiY3VycmVudFRpbWUiOjIuOTk3NzI5LCJkdXJhdGlvbiI6MzU5OS41NzQ1MTMsImN1c3RvbV9wcm9wZXJ0eTEiOiJBbnl0aGluZyB5b3Ugd2FudCIsImN1c3RvbV9wcm9wZXJ0eTIiOjMzLjMzLCJ0aW1lIjoiMjAxNi0xMC0xNlQwMDoyMzoxMy40MTFaIn0seyJuYW1lIjoibWVkaWEudGltZXVwZGF0ZSIsImF1dGhvciI6IkpvbmF0aGFuIEdpbGwiLCJ0aXRsZSI6IkV4YW1wbGUgVGl0bGUiLCJwbGF5YmFja1JhdGUiOjEsInZvbHVtZSI6MSwibmV0d29ya1N0YXRlIjoxLCJyZWFkeVN0YXRlIjo0LCJtdXRlZCI6ZmFsc2UsInBhdXNlZCI6ZmFsc2UsImN1cnJlbnRUaW1lIjozLjE1NDQzNCwiZHVyYXRpb24iOjM1OTkuNTc0NTEzLCJ0aW1lIjoiMjAxNi0xMC0xNlQwMDoyMzoxMy40MTFaIn1d HTTP/1.1
Host: example.com
Connection: keep-alive
User-Agent: Mozilla/5.0 (NeXTStep 3.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36
Accept-Language: en-US,en;q=0.8

Unencoded Raw Data

  [
    {
      "name":"media.play",
      "author":"Jonathan Gill",
      "title":"Example Title",
      "playbackRate":1,
      "volume":1,
      "muted":false,
      "paused":false,
      "currentTime":2.997729,
      "duration":3599.574513,
      "custom_property1": "Anything you want",
      "custom_property2": 33.33,
      "time":"2016-10-16T00:23:13.411Z"
    },
    {
      "name":"media.timeupdate",
      "author":"Jonathan Gill",
      "title":"Example Title",
      "playbackRate":1,
      "volume":1,
      "networkState":1,
      "readyState":4,
      "muted":false,
      "paused":false,
      "currentTime":3.154434,
      "duration":3599.574513,
      "time":"2016-10-16T00:23:13.411Z"
    }
  ]

POST / HTTP/1.1
Host: demo.backtracks.io
Connection: keep-alive
User-Agent: Mozilla/5.0 (NeXTStep 3.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept-Language: en-US,en;q=0.8

d=W3sibmFtZSI6Im1lZGlhLnBsYXkiLCJhdXRob3IiOiJKb25hdGhhbiBHaWxsIiwidGl0bGUiOiJFeGFtcGxlIFRpdGxlIiwicGxheWJhY2tSYXRlIjoxLCJ2b2x1bWUiOjEsIm11dGVkIjpmYWxzZSwicGF1c2VkIjpmYWxzZSwiY3VycmVudFRpbWUiOjIuOTk3NzI5LCJkdXJhdGlvbiI6MzU5OS41NzQ1MTMsImN1c3RvbV9wcm9wZXJ0eTEiOiJBbnl0aGluZyB5b3Ugd2FudCIsImN1c3RvbV9wcm9wZXJ0eTIiOjMzLjMzLCJ0aW1lIjoiMjAxNi0xMC0xNlQwMDoyMzoxMy40MTFaIn0seyJuYW1lIjoibWVkaWEudGltZXVwZGF0ZSIsImF1dGhvciI6IkpvbmF0aGFuIEdpbGwiLCJ0aXRsZSI6IkV4YW1wbGUgVGl0bGUiLCJwbGF5YmFja1JhdGUiOjEsInZvbHVtZSI6MSwibmV0d29ya1N0YXRlIjoxLCJyZWFkeVN0YXRlIjo0LCJtdXRlZCI6ZmFsc2UsInBhdXNlZCI6ZmFsc2UsImN1cnJlbnRUaW1lIjozLjE1NDQzNCwiZHVyYXRpb24iOjM1OTkuNTc0NTEzLCJ0aW1lIjoiMjAxNi0xMC0xNlQwMDoyMzoxMy40MTFaIn1d
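
To check a request like the ones above, the d value can be decoded back into events. This is a minimal sketch; the function name `decode_events` is hypothetical and UTF-8 is assumed for the decoded bytes.

```python
import base64
import json


def decode_events(d):
    """Decode a padded Base 64 d value back into a list of events."""
    # validate=True rejects characters outside the Base 64 alphabet.
    raw = base64.b64decode(d, validate=True)
    return json.loads(raw.decode("utf-8"))
```

If the value was taken from a URL or form body, it has to be URL-decoded first (for example with `urllib.parse.unquote`) before it is passed to this function.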

Parameters

The following parameters shall be supported. The parameter callback can only be used successfully on GET requests due to the limitations of JSONP.

| Parameter Name | Type | Required | Description | HTTP Methods Supported |
| --- | --- | --- | --- | --- |
| callback | string | No | JavaScript function name to call on the return of the request's response. When utilized the Content-Type of the response will be application/javascript. This follows JSONP conventions. | GET |
| d | string | GET Only | Base64 encoded (with padding) version of the data payload | GET, POST |
| k | string(255) | No | An access key, public API key, project key, id, etc. that may be utilized by analytics providers to know how to route the data being passed. This field may also serve as a method of authentication in particular scenarios. | GET, POST |

Examples

Placeholders

https://example.com/?k=<key>&d=<data>&callback=<callback>

Substituted

https://example.com/?k=abc123&d=W3sibmFtZSI6Im1lZGlhLnBsYXkiLCJhdXRob3IiOiJKb25hdGhhbiBHaWxsIiwidGl0bGUiOiJFeGFtcGxlIFRpdGxlIiwicGxheWJhY2tSYXRlIjoxLCJ2b2x1bWUiOjEsIm11dGVkIjpmYWxzZSwicGF1c2VkIjpmYWxzZSwiY3VycmVudFRpbWUiOjIuOTk3NzI5LCJkdXJhdGlvbiI6MzU5OS41NzQ1MTMsImN1c3RvbV9wcm9wZXJ0eTEiOiJBbnl0aGluZyB5b3Ugd2FudCIsImN1c3RvbV9wcm9wZXJ0eTIiOjMzLjMzLCJ0aW1lIjoiMjAxNi0xMC0xNlQwMDoyMzoxMy40MTFaIn0seyJuYW1lIjoibWVkaWEudGltZXVwZGF0ZSIsImF1dGhvciI6IkpvbmF0aGFuIEdpbGwiLCJ0aXRsZSI6IkV4YW1wbGUgVGl0bGUiLCJwbGF5YmFja1JhdGUiOjEsInZvbHVtZSI6MSwibmV0d29ya1N0YXRlIjoxLCJyZWFkeVN0YXRlIjo0LCJtdXRlZCI6ZmFsc2UsInBhdXNlZCI6ZmFsc2UsImN1cnJlbnRUaW1lIjozLjE1NDQzNCwiZHVyYXRpb24iOjM1OTkuNTc0NTEzLCJ0aW1lIjoiMjAxNi0xMC0xNlQwMDoyMzoxMy40MTFaIn1d&callback=customJavaScriptFunction
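
A URL of this shape could be assembled as sketched below. The function name `build_url` is hypothetical; the parameter names k, d, and callback are the ones from the table above.

```python
from urllib.parse import urlencode


def build_url(endpoint, d, key=None, callback=None):
    """Build a GET URL carrying the k, d, and callback parameters."""
    params = {}
    if key is not None:
        params["k"] = key
    params["d"] = d
    if callback is not None:
        params["callback"] = callback
    # urlencode percent-encodes "+", "/", and "=" from the Base 64 alphabet.
    return endpoint + "?" + urlencode(params)
```

Percent-encoding matters for payloads whose Base 64 form contains `+`, `/`, or trailing `=` padding; the sample payload above happens to contain none of these.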

Event Examples

Sample events in their unencoded format are below:

Event Name
media.play (Basic/Sample)
media.play (Detailed/Extended)
media.pause
media.download
media.subscribe
media.unsubscribe
media.ended
media.ratechange
media.seeked
media.timeupdate
media.volumechange
ui.action

Tips and FAQs

The specification described on this page or document is available under the Apache 2.0 License. This is a derivative work of Open Audio Analytics. This has been another Backtracks, Inc. & Friends joint.
