w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)
Other
599 stars 56 forks source link

Proposal: Add filter for DNR modifyHeaders #439

Open carlosjeurissen opened 1 year ago

carlosjeurissen commented 1 year ago

Problem

Currently conditionally modifying headers is generally only possible with webRequest. This proposal is there to make sure dnr can replace as many webRequest usecases.

The proposal would fix: https://crbug.com/1254637 and address some of the use cases described in https://github.com/w3c/webextensions/issues/169

Proposal

The proposal is to add a regexFilter property to ModifyHeaderInfo which only apply the operation if and only if the regex matches.

"action": {
  "type": "modifyHeaders",
  "responseHeaders": [{
    "operation": "set",
    "header": "set-cookie",
    "regexFilter": ";\\s*SameSite=(Lax|Strict)|$",
    "value": "; SameSite=None; Secure"
  }]
},

In addition, a valueFilter could be introduced to do a simple string match for performance reasons. As proposed by @tophf, both valueFliter and regexFilter could be specified, in which case valueFilter is tested first. If it fails, no regexMatch is required thus improving performance.

"action": {
  "type": "modifyHeaders",
  "responseHeaders": [{
    "operation": "set",
    "header": "Set-Cookie",
    "valueFilter": "SameSite=Lax",
    "value": "; SameSite=None; Secure"
  }]
},

Since some operations require a global match (see use case 2). So we need a way to specify them. Something like this could work:

"action": {
  "type": "modifyHeaders",
  "responseHeaders": [{
    "operation": "set",
    "header": "Content-Security-Policy",
    "regexFilter": "(^|,\\s?)",
    "regexFlags": "g",
    "value": "$1img-src 'self'; "
  }]
},

Use case 1, Set-Cookie SameSite flag

See https://bugs.chromium.org/p/chromium/issues/detail?id=1254637 Set SameSite flag of Set-Cookie header to 'None' if it originally was 'Lax' or 'Strict'.

Blocking WebRequest

function responseListener (details) {
  const responseHeaders = details.responseHeaders;
  responseHeaders.forEach((header) => {=
    if (header.name === 'Set-Cookie'){
        header.value = header.value.replace(/;\s*SameSite=(?:Lax|Strict)|$/, '; SameSite=None; Secure');
    }
  });
  return { responseHeaders: responseHeaders };
}

chrome.webRequest.onHeadersReceived.addListener(
  responseListener,
  { urls: ['*://*/*'] },
  ['blocking', 'responseHeaders', 'extraHeaders'],
);

DNR proposal

"action": {
  "type": "modifyHeaders",
  "responseHeaders": [{
    "operation": "set",
    "header": "set-cookie",
    "regexFilter": ";\\s*SameSite=(Lax|Strict)|$",
    "value": "; SameSite=None; Secure"
  }]
},

Use case 2, CSP changes, set directive value

Change img-src value of the 'Content-Security-Policy' header. Keep in mind this does not remove existing img-src properties. It will simply inject another in front. The old one will be ignored by the browser as this is how CSP is designed. For removing an existing directive value, see use case 3. It does however handle multiple CSP set in the same header using a comma. This will require the global search regex flag.

Example CSP is: default-src 'self'; img-src 'none'; frame-src 'none', default-src 'none'; frame-src 'self', script-src 'none'; default-src 'none'

Blocking WebRequest

function responseListener (details) {
  const responseHeaders = details.responseHeaders;
  responseHeaders.forEach((header) => {
    if (header.name === 'Content-Security-Policy') {
      header.value = header.value.replace(/(^|,\\s?)/, "$1img-src 'self'; ");
  }
  });
  return { responseHeaders: responseHeaders };
}

chrome.webRequest.onHeadersReceived.addListener(
  responseListener,
  { urls: ['*://*/*'] },
  ['blocking', 'responseHeaders', 'extraHeaders'],
);

DNR proposal

"action": {
  "type": "modifyHeaders",
  "responseHeaders": [{
    "operation": "set",
    "header": "Content-Security-Policy",
    "regexFilter": "(^|,\\s?)",
    "regexFlags": "g",
    "value": "$1img-src 'self'; "
  }]
},

Use case 3, CSP changes, remove directive

Change img-src value of the 'Content-Security-Policy' header.

Blocking WebRequest

function responseListener (details) {
  const responseHeaders = details.responseHeaders;
  responseHeaders.forEach((header) => {
    if (header.name === 'Content-Security-Policy') {
      header.value = header.value.replace(/([ ;,^])img-src.*?(;|,|$)/g, "$1$2");
  }
  });
  return { responseHeaders: responseHeaders };
}

chrome.webRequest.onHeadersReceived.addListener(
  responseListener,
  { urls: ['*://*/*'] },
  ['blocking', 'responseHeaders', 'extraHeaders'],
);

DNR proposal

"action": {
  "type": "modifyHeaders",
  "responseHeaders": [{
    "operation": "set",
    "header": "Content-Security-Policy",
    "regexFilter": "([ ;,^])img-src.*?(;|,|$)",
    "regexFlags": "g",
    "value": "$1$2"
  }]
},
tophf commented 1 year ago

only one of regexFilter and valueFilter can be set

or allow both and use valueFilter as a fast pre-check (and do the same for urlFilter/regexFilter)

carlosjeurissen commented 1 year ago

Updated the post with several use cases; Implemented the suggestion of @tophf and added regexFlags.

ameshkov commented 1 year ago

Just wanted to mention that we would also like to see this implemented.

Currently, there's no easy way to modify headers and this is often required. The most important use case for us is modifying cookies, i.e. changing max time, samesite, etc.

In TPAC minutes it was mentioned by @zombie that regular expressions is not the right tool, and while I agree that it's not ideally suited for the job, I don't know any other declarative alternative for modifying a string value. One can argue that headers often have a dictionary structure internally, but even in this case you'll need something like a regular expression to make the change to any dictionary value.

Rob--W commented 7 months ago

FYI: Safari and Firefox are likely going to use a regular expression engine similar to the JS implementation, whereas Chrome will use RE2.

Safari will intentionally not restrict the regex syntax to the regexFilter as in #344.

Celsius273 commented 4 months ago

Hello,

this is next up on my plate.

As referenced in issue 460, we may consider adding regex filters for response header values inside the HeaderCondition.

This means that regexFilter inside ModifyHeaderInfo may be redundant, but could still be useful towards compressing multiple rules into one.

I think the use cases highlighted in @carlosjeurissen 's comment that requires the regexFilter in ModifyHeaderInfo are the second and third ones.

Exploratory spec

Currently, DNR uses RE2 as the regex syntax for regex based url filters. For now, I propose that regex filters for headers will still use RE2 for consistency.

Note: valueFilter is omitted here since it is already defined in HeaderInfo

dictionary ModifyHeaderInfo {
  // The name of the header to be modified.
  DOMString header;

  // The operation to be performed on a header.
  HeaderOperation operation;

  // The new value for the header. Must be specified for "append".
  // One of this or regexSubstitution must be specified for"set".
  DOMString? value;

  // Add the following new fields:

  // A regular expression to match against the header value. This follows the
  // RE2 syntax.
  DOMString? regexFilter;

  // Substitution pattern for the response header. Only one of value or regexSubstitution
  // may be specified. This field is only valid for the "set" operation. The syntax for this
  // field is the same as the regexSubstitution field for redirects.
  DOMString? regexSubstitution;

  // Whether the substitution should be applied for all matches. Equivalent to the "g"
  // global flag... Note that RE2 does not use this flag but it does contain a
  // "FindAndConsume" method [1] which will match multiple times.
  // Default false.
  bool? match_all;
};

Perhaps match_all can be extended to a regex_flags struct for future-proofing.

This is just an exploratory spec based on the current regex capabilities of RE2 and DNR, but early feedback is welcome. I hope this will fulfill the use cases highlighted but do let me know if I'm missing anything.

Thanks, Kelvin