simonw / covidsewage-bot

The @covidsewage bot

They redesigned the page, you have to click to get the chart now #7

Closed simonw closed 1 month ago

simonw commented 1 month ago

https://fedi.simonwillison.net/@covidsewage/113090984038985201

image

You have to click that "COVID" button now to get the chart.

simonw commented 1 month ago

This was SO hard to figure out! Power BI is a nightmare to script.

Eventually I found this recipe worked for clicking that button (which has a click handler on an SVG path). First find the <visual-modern> wrapper element that represents the whole button:

var el = Array.from(document.querySelectorAll(".ui-role-button-text"))
    .filter((el) => el.textContent.trim() == "COVID")[0]
    .closest("visual-modern");
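
The lookup above throws a TypeError if no button with that label exists yet, for example while the report is still loading. A minimal sketch of a more defensive version follows. `findButtonWrapper` is a hypothetical helper name, not part of the bot, and it assumes the same `.ui-role-button-text` class and `visual-modern` wrapper described above:

```javascript
// Hypothetical helper: returns the <visual-modern> wrapper for the button
// whose visible text equals `label`, or null if no such button is found.
// `doc` is anything with querySelectorAll(), normally `document`.
function findButtonWrapper(doc, label) {
  const match = Array.from(
    doc.querySelectorAll(".ui-role-button-text"),
  ).find((el) => el.textContent.trim() === label);
  // Return null instead of throwing when the button is not there yet
  return match ? match.closest("visual-modern") : null;
}
```

In the page this would be called as `findButtonWrapper(document, "COVID")`, and the caller can check for `null` before clicking anything.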

Now we need to click whichever <path> has a click handler. Eventually I figured out that just clicking every <path> inside that <visual-modern> element did the trick:

function clickAllMatchingDescendants(rootElement, selector) {
  // Find all matching descendants
  const matchingElements = rootElement.querySelectorAll(selector);
  // Dispatch click event to each matching element
  matchingElements.forEach((element) => {
    element.dispatchEvent(
      new MouseEvent("click", {
        bubbles: true,
        cancelable: true,
        view: window,
      }),
    );
  });
}

clickAllMatchingDescendants(el, "path");

simonw commented 1 month ago

There are various animations and loading states going on here. I found that waiting 3 seconds for the first load, then another 2 seconds after clicking the <path> elements, got me the result I wanted:

shot-scraper 'https://app.powerbigov.us/view?r=eyJrIjoiNDRiZjI5MmUtNDUxNC00YzQ1LTg0ZjktMzg2ZDA3Y2M4NjJlIiwidCI6IjBhYzMyMDJmLWMzZTktNGY1Ni04MzBkLTAxN2QwOWQxNmIzZiJ9' --javascript '
new Promise((takeShot) => {
  setTimeout(() => {
    var el = Array.from(document.querySelectorAll(".ui-role-button-text"))
      .filter((el) => el.textContent.trim() == "COVID")[0]
      .closest("visual-modern");

    function clickAllMatchingDescendants(rootElement, selector) {
      // Find all matching descendants
      const matchingElements = rootElement.querySelectorAll(selector);
      // Dispatch click event to each matching element
      matchingElements.forEach((element) => {
        element.dispatchEvent(
          new MouseEvent("click", {
            bubbles: true,
            cancelable: true,
            view: window,
          }),
        );
      });
    }

    clickAllMatchingDescendants(el, "path");

    setTimeout(() => {
      // Resolving the promise takes the shot
      takeShot();
    }, 2000);
  }, 3000);
});
'

Produces:

app-powerbigov-us-view
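
The fixed delays in that command are a guess at how long Power BI takes to load. One possible alternative, sketched here rather than taken from the bot, is to poll until the button exists. `waitFor` and its `interval` and `timeout` options are hypothetical names introduced for illustration:

```javascript
// Hypothetical polling helper: calls check() repeatedly until it returns a
// truthy value, then resolves with that value. Rejects after `timeout` ms.
function waitFor(check, { interval = 250, timeout = 15000 } = {}) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    function poll() {
      let result = null;
      try {
        result = check();
      } catch (err) {
        // Treat an exception (e.g. element not in the DOM yet) as "not ready"
        result = null;
      }
      if (result) {
        resolve(result);
        return;
      }
      if (Date.now() - started >= timeout) {
        reject(new Error("Timed out waiting for condition"));
        return;
      }
      setTimeout(poll, interval);
    }
    poll();
  });
}

// Sketch of use inside the page (commented out because it needs a browser DOM):
// waitFor(() =>
//   Array.from(document.querySelectorAll(".ui-role-button-text"))
//     .filter((el) => el.textContent.trim() == "COVID")[0]
//     .closest("visual-modern"),
// ).then((el) => { /* click the <path> elements, then wait for the chart */ });
```

This only replaces the first wait. Knowing when the chart has finished rendering after the click would need its own condition, which the thread does not identify, so a short fixed delay may still be the practical choice there.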

simonw commented 1 month ago

OK, ran it and it worked! https://fedi.simonwillison.net/@covidsewage/113093498199378926

image

1ec5 commented 1 month ago

In case it helps, the script I’ve been using to scrape the conventional case and death counts for Wikimedia Commons since the start of the pandemic calls essentially the same undocumented Power BI APIs that the client does, based on network-inspecting the dashboard. This approach works for any dashboard, though it requires a bit of hair-pulling reverse engineering.

The scraper has been stable for the past year, even with the recent redesign, since it isn’t dependent on the rendered output. However, about once a week I’ve had to load the dashboard manually so that the various cached queries will continue to load. Other Power BI dashboards suffer from the same issue, especially Alameda County’s. I’m sure there’s a way around that, but I never bothered to investigate the issue more deeply.

fasiha commented 2 weeks ago

Sorry to be a pest, looks like the bot hasn't posted since this was marked as completed (September 6)?