opawg / user-agents

An open, platform-agnostic list of user-agent and referrer regexes for use in podcast analytics services
MIT License

THIS IS GOING AWAY

While this list is being kept updated, you should now be using user-agents v2. It's more performant, more regularly updated, and better for everyone.


User agent list

A list of apps, services and bots that consume podcast audio. This data is used by a number of podcast hosts to assist with their analytics.

One public example is this page at Podnews which uses this data alongside the RSS UA. We're aware that this data is used by a number of large podcast hosts and private podcasters too.

This page runs the regexes in this data against 1,000 entries in OP3.

Contributing to the list

The simplest way is to add to the file at src/user-agents.json.

Each app, service or bot should have its own entry. The user_agents patterns should be as specific as possible, so that a given user agent does not match more than one entry.

Each entry must contain the following properties:

Be careful about ensuring the file is correctly escaped.
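As a rough illustration of the escaping concern, the sketch below serialises a regex with Python's json module instead of typing the escapes by hand. The pattern and the ExampleApp name are hypothetical; only the user_agents property comes from the text above.

```python
import json
import re

# Hypothetical pattern: \d and \. each contain a backslash in regex syntax.
pattern = r"^ExampleApp/\d+\.\d+"

# json.dumps doubles every backslash so the file remains valid JSON.
entry = {"user_agents": [pattern]}
encoded = json.dumps(entry)
print(encoded)  # {"user_agents": ["^ExampleApp/\\d+\\.\\d+"]}

# Reading the JSON back yields the original regex, which still matches.
decoded = json.loads(encoded)
assert decoded["user_agents"][0] == pattern
assert re.search(decoded["user_agents"][0], "ExampleApp/1.2 (iPhone)")
```

In other words, a single backslash in the regex appears as two backslashes in the JSON file.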

Each entry can contain one of the following properties:

Slugs

A slug is a lowercase alphanumeric (ASCII) representation of a string, consisting only of numbers, letters and, in our case, underscores. It's up to apps that implement the list to display this information however they see fit, and using a slug is better for disambiguation.
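One possible way to derive a slug of this shape from a display name is sketched below. The slugify helper and the example names are hypothetical and not part of the list or its tooling.

```python
import re


def slugify(name):
    """Lowercase the name and turn each run of other characters into one underscore."""
    lowered = name.lower()
    # Only ASCII a-z and 0-9 survive; everything else acts as a separator.
    slug = re.sub(r"[^a-z0-9]+", "_", lowered)
    # Drop underscores left over at either end.
    return slug.strip("_")


print(slugify("Example Player"))         # example_player
print(slugify("My Podcast App (Beta)"))  # my_podcast_app_beta
```

Note that this sketch replaces non-ASCII letters with underscores rather than transliterating them.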

Unknowns

It is proposed that we only specify a property above when it is known (not assumed). For example, it's often difficult to know whether an Android app is running on a phone or a tablet. We can assume that since Android tablets are rarer, almost all requests will be via Android phones, but we can't know that.

Parsing order

Multiple matches should ideally not happen for anything that has an app name, so parsing order shouldn't matter. For devices and OS, you may discover that multiple matches give you more accurate data, but you should hopefully only see one app name.
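A minimal sketch of order-independent matching: every entry is checked, and more than one app match is treated as a sign that a pattern is too broad. The entries and the app property are assumptions made for illustration; only user_agents is named in this document.

```python
import re

# Hypothetical entries; the real file may use different property names.
ENTRIES = [
    {"app": "example_player", "user_agents": [r"^ExamplePlayer/"]},
    {"app": "sample_catcher", "user_agents": [r"^SampleCatcher/\d"]},
]


def matching_apps(user_agent, entries=ENTRIES):
    """Return every app whose patterns match, without stopping at the first hit."""
    return [
        entry["app"]
        for entry in entries
        if any(re.search(pattern, user_agent) for pattern in entry["user_agents"])
    ]


matches = matching_apps("ExamplePlayer/3.1 (Android 14)")
print(matches)  # ['example_player']
if len(matches) > 1:
    # Several app names for one user agent suggests overlapping patterns.
    print("ambiguous patterns:", matches)
```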

Testing

The /src folder contains a subfolder /tests with unit tests for each programming language. Unit tests should try to compile all the regular expressions. If any fail to compile, the problematic regular expressions should be fixed before pushing the changes.

python

# Running tests with pytest
pytest