HTTPArchive / custom-metrics

Custom metrics to use with WebPageTest agents
Apache License 2.0

IAB ads.txt + sellers.json metadata #91

Closed (max-ostapenko closed this 1 year ago)

max-ostapenko commented 1 year ago

I'd like to expand support for IAB standards to enable research on privacy and monetisation topics. This PR adds support for Authorized Digital Sellers (ads.txt). By making the digital advertising supply chain more transparent, the standard may help mitigate the risk of users’ data being misused.

The idea is to look into the adoption of these standards and how it correlates with the tracking and compliance technologies in use.

I started with some basic information. Here is an example for https://www.msn.com/:

{
  "ads": {
    "account_count": 1124,
    "account_types": {
      "direct": {
        "domains": [
          "appnexus.com",
          "teads.tv",
          "google.com",
          ...
          "adpushup.com",
          "nativo.com",
          "snigelweb.com"
        ],
        "domains_count": 288
      },
      "reseller": {
        "domains": [
          "google.com",
          "indexexchange.com",
          "media.net",
          ...
          "amxrtb.com",
          "connectad.io",
          "mobfox.com"
        ],
        "domains_count": 836
      }
    },
    "line_count": 1268,
    "present": true,
    "redirected": false,
    "status": 200,
    "variables": [
      "inventorypartnerdomain",
      "managerdomain",
      "ownerdomain",
      "subdomain"
    ],
    "variable_count": 4
  },
  "app_ads": { ... } // similar structure to "ads"
}
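To illustrate how counts like these could be derived from an ads.txt file, here is a minimal Python sketch. It is not the code in this PR (the custom metric itself runs in the page), and the function name `summarize_ads_txt` is hypothetical. It assumes that `account_count` counts valid records, that `domains_count` counts distinct ad system domains per relationship, and that any other `name=value` line is a variable.

```python
from collections import defaultdict


def summarize_ads_txt(text: str) -> dict:
    """Summarise ads.txt content into counts (hypothetical helper)."""
    domains = defaultdict(set)  # relationship -> distinct ad system domains
    variables = set()
    account_count = 0
    lines = text.splitlines()

    for raw in lines:
        # Everything after '#' is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue

        # Data records: <domain>, <account id>, <DIRECT|RESELLER>[, <cert id>]
        fields = [f.strip() for f in line.split(",")]
        if len(fields) >= 3 and fields[2].lower() in ("direct", "reseller"):
            domains[fields[2].lower()].add(fields[0].lower())
            account_count += 1  # assumption: one account per valid record
            continue

        # Variable declarations: <name>=<value>
        if "=" in line:
            variables.add(line.split("=", 1)[0].strip().lower())

    return {
        "account_count": account_count,
        "account_types": {
            rel: {"domains_count": len(domains[rel])}
            for rel in ("direct", "reseller")
        },
        "line_count": len(lines),
        "variables": sorted(variables),
        "variable_count": len(variables),
    }
```

Keeping only the counts, rather than the domain sets themselves, keeps the resulting object small regardless of how long the file is.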

Update 1: the domain lists have been removed from the object in this PR. Arrays like $.account_types.direct.domains can be quite large, so I'd appreciate your comments on whether it's feasible and efficient to keep them for research on advertising supply chains.

Update 2: added sellers.json, which clarifies which accounts transact in the advertising system.

{
  "ads": { ... },
  "app_ads": { ... },
  "sellers": {
    "present": true,
    "redirected": false,
    "status": 200,
    "seller_count": 0,
    "seller_types": {
      "publisher": {
        "seller_count": 0
      },
      "intermediary": {
        "seller_count": 0
      },
      "both": {
        "seller_count": 0
      }
    },
    "passthrough_count": 0
  }
}
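As a rough illustration of how the "sellers" counts could be computed, here is a minimal Python sketch. It is not the code in this PR, and the function name `summarize_sellers_json` is hypothetical. It assumes the file is a JSON object with a top-level `sellers` array whose entries carry `seller_type` (PUBLISHER, INTERMEDIARY or BOTH) and an optional `is_passthrough` flag, as described in the IAB sellers.json specification.

```python
import json


def summarize_sellers_json(text: str) -> dict:
    """Count sellers by type in a sellers.json document (hypothetical helper)."""
    data = json.loads(text)
    sellers = data.get("sellers", [])

    seller_types = {
        t: {"seller_count": 0} for t in ("publisher", "intermediary", "both")
    }
    passthrough_count = 0

    for seller in sellers:
        # seller_type is compared case-insensitively; unknown values are skipped.
        seller_type = str(seller.get("seller_type", "")).lower()
        if seller_type in seller_types:
            seller_types[seller_type]["seller_count"] += 1
        # is_passthrough is expected to be 0 or 1; a missing flag counts as 0.
        if seller.get("is_passthrough") == 1:
            passthrough_count += 1

    return {
        "seller_count": len(sellers),
        "seller_types": seller_types,
        "passthrough_count": passthrough_count,
    }
```

Fields such as "present", "redirected" and "status" describe the HTTP fetch rather than the file contents, so they are left out of this sketch.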

P.S. While working on this, I also found that IAB has developed its own data explorer, though I couldn't get access to that data.

rviscomi commented 1 year ago

It looks like this PR includes all of the other workflow code from #89. Should we put this in Draft mode until that gets merged? To keep things moving we could set this PR to merge into wpt-test-action so we can get a diff of changes only for the custom metric.

max-ostapenko commented 1 year ago

@rviscomi Sure, I've reset this branch so that it contains only the narrowly scoped changes. I had mistakenly expected the other PR to be merged by now.

I've also removed the lists of domains for now.