citp / news-disinformation-study

A research project on how web users consume, are exposed to, and share news online.

Figure out why webRequest.onHeadersReceived and webRequest.onBeforeRedirect events don't fire on Fetch API requests with manual redirect following #42

Closed jonathanmayer closed 4 years ago

jonathanmayer commented 4 years ago

This looks like it might be an issue in Firefox. See: https://github.com/citp/web-science/issues/36#issuecomment-563371568

jonathanmayer commented 4 years ago

Looked into this more. Here's what I think is going on... If we use the Fetch API with redirect set to manual, Firefox applies a security policy to the HTTP channel that prohibits redirects. If the Fetch API is then told to follow the opaque redirect, Firefox sets up a new channel to handle the redirect. The result is that observers on the original channel listening for a redirect (including via browser.webRequest.onBeforeRedirect) don't fire.

That's the bad news. Here's the good news: Firefox's approach to Fetch redirects shouldn't affect webRequest.onHeadersReceived, since the HTTP channel still processes the response as usual. I tested the following snippet in both Release and Nightly, and the results are as expected: the webRequest.onHeadersReceived listener fires and the webRequest.onBeforeRedirect listener doesn't.

// Checking for the onHeadersReceived event
browser.webRequest.onHeadersReceived.addListener(details => {
  console.log("onHeadersReceived: " + JSON.stringify(details));
}, { urls: [ "https://t.co/*" ] }, [ "responseHeaders" ]);

// Checking for the onBeforeRedirect event
// (logged under its own label so the two listeners can be told apart)
browser.webRequest.onBeforeRedirect.addListener(details => {
  console.log("onBeforeRedirect: " + JSON.stringify(details));
}, { urls: [ "https://t.co/*" ] }, [ "responseHeaders" ]);

// Using the Fetch API to resolve a t.co shortened URL
// Note that Twitter has started to use JS-based redirects for browsers,
// so I've set the User-Agent to empty to make sure there's a Location
// response header
window.fetch("https://t.co/acH8UR5A4w",
  {
    redirect: "manual",
    headers: new Headers({
      "User-Agent": ""
    })
  });

The output looks like:

onHeadersReceived: {
    "requestId": "18",
    "url": "https://t.co/acH8UR5A4w",
    . . .
    "responseHeaders": [ . . . {
        "name": "location",
        "value": "https://nyti.ms/36DDUKr"
    } . . .  ],
    "statusCode": 301,
    "statusLine": "HTTP/2.0 301 Moved Permanently",
    . . .
}
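
To make the output above concrete, here is a minimal sketch of pulling the redirect target out of the `details` object. The helper name `getRedirectTarget` is hypothetical and not part of the study's code; it assumes the listener was registered with the "responseHeaders" option, as in the snippet above.

```javascript
// Hypothetical helper (not from the study's codebase): extract the redirect
// target from the `details` object passed to an onHeadersReceived listener.
function getRedirectTarget(details) {
  // Only 3xx responses can carry a redirect target.
  if (details.statusCode < 300 || details.statusCode >= 400) {
    return null;
  }
  const headers = details.responseHeaders || [];
  // Header names are case-insensitive, so normalize before comparing.
  const location = headers.find(h => h.name.toLowerCase() === "location");
  if (!location || !location.value) {
    return null;
  }
  // A Location value may be relative; resolve it against the request URL.
  try {
    return new URL(location.value, details.url).href;
  } catch (e) {
    return null;
  }
}

// Wiring sketch: this part only runs inside a WebExtension background page.
if (typeof browser !== "undefined" && browser.webRequest) {
  browser.webRequest.onHeadersReceived.addListener(details => {
    const target = getRedirectTarget(details);
    if (target !== null) {
      console.log(details.url + " -> " + target);
    }
  }, { urls: [ "https://t.co/*" ] }, [ "responseHeaders" ]);
}
```

The header lookup is kept separate from the listener so it can be exercised without a browser.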

So we have what we need. Note that we'd have to resolve this particular URL a couple more times, since it goes from t.co to nyti.ms to trib.al and then to www.nytimes.com.
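
The multi-hop resolution could be sketched roughly as follows. Both `resolveChain` and the `lookupNext` callback are hypothetical names introduced for illustration; `lookupNext` is assumed to issue one Fetch request with redirect set to manual and resolve to the Location value seen by the onHeadersReceived listener, or to null when the response isn't a redirect. The hop limit of 10 is an arbitrary assumption.

```javascript
// Hypothetical driver (not from the study's codebase): follow a redirect
// chain one hop at a time.
// `lookupNext(url)` is an assumed async callback that resolves to the next
// URL in the chain, or null when `url` does not redirect.
async function resolveChain(startUrl, lookupNext, maxHops = 10) {
  const chain = [ startUrl ];
  let current = startUrl;
  for (let hop = 0; hop < maxHops; hop++) {
    const next = await lookupNext(current);
    // Stop at the end of the chain, or if a URL repeats (redirect loop).
    if (next === null || chain.includes(next)) {
      break;
    }
    chain.push(next);
    current = next;
  }
  return chain;
}
```

The last element of the returned array is the final destination, and the intermediate elements record each shortener along the way.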

@PranayAnchuri, you reported that webRequest.onHeadersReceived wasn't working with Fetch requests when redirect is set to manual. That's why we came up with the awkward ping-pong: issuing Fetch requests with redirect following, watching for the redirect with browser.webRequest.onBeforeRedirect, and then canceling the redirected request with a blocking listener on browser.webRequest.onBeforeRequest. Am I missing something, or can we just use webRequest.onHeadersReceived?

jonathanmayer commented 4 years ago

Confirmed with @PranayAnchuri that the webRequest.onHeadersReceived approach should work. Closing the issue.