mozilla / ensemble

The platform that powers the Firefox Public Data Report :violin: :trumpet: :musical_keyboard:
https://data.firefox.com/
Mozilla Public License 2.0

Send an alert when data is stale #303

Open openjck opened 5 years ago

openjck commented 5 years ago

An alert should be sent when the site is showing stale data. See #297 as an example of when this has happened.

It's unclear to me if this should be done here, in ensemble-transposer, or in Fx_Usage_Report. Perhaps more than one.

pdehaan commented 5 years ago

Not sure where to put it (here vs transposer), but I did have a rough script that reports how stale the YAU data is as a value such as "-6d" (6 days old): https://github.com/pdehaan/ensemble-data-test

Per https://github.com/mozilla-services/Dockerflow, services should have a /__heartbeat__ endpoint (similar to /__version__ which tells us which SHA is deployed):

Respond to /__heartbeat__ with a HTTP 200 or 5xx on error. This should check backing services like a database for connectivity and may respond with the status of backing services and application components as a JSON payload.

So, maybe we just check a couple of choice endpoints to see what the latest date in the dataset is, and return a 500 error if the data is more than 7 days old. Then we'd need to make sure Ops is monitoring that heartbeat endpoint, and maybe they ping us if the data is stale. I'm not sure how that would work with their monitoring tools. I would hate to think that somebody on PagerDuty gets paged at 3am on a Sunday because the data is 8 days old.
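
A minimal sketch of that heartbeat idea, assuming a hypothetical `getLatestDate` function that resolves to the newest date in a dataset (how it would look that date up is not settled in this thread). The names `heartbeatStatus` and `createHeartbeatHandler`, and the JSON body shape, are illustrative only:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Decide the HTTP status for /__heartbeat__ from the newest date in a dataset.
// Returns 200 when the data is fresh enough, 500 when it is stale or unparseable.
function heartbeatStatus(latestDate, now = Date.now(), maxAgeDays = 7) {
  const latest = new Date(latestDate).getTime();
  if (Number.isNaN(latest)) {
    return { status: 500, ageDays: null };
  }
  const ageMs = now - latest;
  return {
    status: ageMs > maxAgeDays * DAY_MS ? 500 : 200,
    ageDays: Math.floor(ageMs / DAY_MS),
  };
}

// Build a Node http-style request handler.
// getLatestDate is a hypothetical async function supplied by the caller.
function createHeartbeatHandler(getLatestDate) {
  return async (req, res) => {
    let body;
    try {
      body = heartbeatStatus(await getLatestDate());
    } catch (err) {
      // A failed lookup counts as an unhealthy backing service.
      body = { status: 500, ageDays: null };
    }
    res.writeHead(body.status, { "Content-Type": "application/json" });
    res.end(JSON.stringify(body));
  };
}

// Usage (not started here; myLookup is a placeholder):
// require("http").createServer(createHeartbeatHandler(myLookup)).listen(8000);
```

Keeping the staleness decision in a pure function makes the 7-day threshold easy to test and to tune separately from whatever alerting policy Ops applies to the endpoint.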

pdehaan commented 5 years ago

Here's an example of scraping the https://data.firefox.com/dashboard/hardware dashboard and grabbing the select#date-selector option:first-child text using a headless puppeteer:

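const ms = require("ms");
const puppeteer = require("puppeteer");

// Data older than this is considered stale.
const MAX_AGE_MS = ms("7d");

async function main() {
  // $eval reads the first element matching this selector.
  const sel = "select#date-selector option";

  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto("https://data.firefox.com/dashboard/hardware", {waitUntil: "networkidle2"});
    await page.waitForSelector(sel);
    const lastModified = await page.$eval(sel, el => el.textContent);
    const ageMs = Date.now() - new Date(lastModified);

    // Print the age as a negative offset, e.g. "[-12d] February 3, 2019".
    console.log(`[${ms(-ageMs)}] ${lastModified}`);

    // Compare raw milliseconds: parsing the formatted string would read "-10h" as -10.
    if (ageMs > MAX_AGE_MS) {
      process.exitCode = 1;
    }
  } finally {
    // Always shut the browser down, even if the page fails to load.
    await browser.close();
  }
}

main().catch(err => {
  console.error(err);
  process.exitCode = 1;
});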

It isn't especially speedy, though, since it takes about 5s to launch a headless browser and wait for the page to load and render:

$ time node check-hardware-dashboard.js
[-12d] February 3, 2019

node check-hardware-dashboard.js  0.44s user 0.19s system 12% cpu 5.020 total

openjck commented 5 years ago

Excellent. Thank you, @pdehaan! I'll look into this.

openjck commented 5 years ago

Note to self: the code in #297 is also worth looking at.

openjck commented 5 years ago

@pdehaan launched a site which reports on the freshness of all data. This could be a great thing for us to leverage. :smiley:

Site: https://ensemble-last-modified.now.sh/

Repo: https://github.com/pdehaan/ensemble-last-modified

pdehaan commented 5 years ago

https://ensemble-last-modified.now.sh/ currently says the dashboard data is 9-10 days old:

{
  "source": "https://github.com/mozilla/ensemble",
  "version": "1.0.0",
  "commit": "5753d4021c792b3af31174a8cb473c10549f82ae",
  "dashboads": {
    "/datasets/desktop/user-activity": "-10d",
    "/datasets/desktop/usage-behavior": "-10d",
    "/datasets/desktop/hardware": "-9d"
  },
  "homepage": "https://github.com/pdehaan/ensemble-last-modified"
}
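
A payload like this could be checked automatically. The following is only a sketch: the function names and the 7-day default are assumptions, the sample object is copied from the response above (with the key spelled as the service returns it), and only the day-based age strings seen there are handled:

```javascript
// Sample payload copied from the response above.
const payload = {
  dashboads: {
    "/datasets/desktop/user-activity": "-10d",
    "/datasets/desktop/usage-behavior": "-10d",
    "/datasets/desktop/hardware": "-9d",
  },
};

// Convert an age string such as "-10d" into a positive number of days.
// Only the day unit seen in the payload is handled; anything else yields NaN.
function ageInDays(age) {
  const match = /^-?(\d+)d$/.exec(age);
  return match ? Number(match[1]) : NaN;
}

// Return the dataset paths whose age exceeds maxAgeDays.
// NaN never compares greater, so unrecognized ages are not reported as stale.
function staleDashboards(dashboards, maxAgeDays = 7) {
  return Object.entries(dashboards)
    .filter(([, age]) => ageInDays(age) > maxAgeDays)
    .map(([path]) => path);
}

console.log(staleDashboards(payload.dashboads));
```

With the sample above, all three dataset paths are reported, since each is more than 7 days old. If the service can also emit other units, `ageInDays` would need to be extended before relying on this.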

openjck commented 5 years ago

Ah! Thanks for the heads up!