googlearchive / caja

Caja is a tool for safely embedding third party HTML, CSS and JavaScript in your website.
Apache License 2.0
1.13k stars 113 forks source link

URI.prototype.getQuery may unexpectedly decode "%2B" to SPACE due to order of operations #2039

Open Andrew-Cottrell opened 5 years ago

Andrew-Cottrell commented 5 years ago

https://github.com/google/caja/blob/d4635c9c014cd3d30c7e36f1d92c950d55a34916/src/com/google/caja/plugin/uri.js#L429

https://github.com/google/caja/blob/d4635c9c014cd3d30c7e36f1d92c950d55a34916/src/com/google/caja/plugin/uri.js#L501

https://github.com/google/caja/blob/d4635c9c014cd3d30c7e36f1d92c950d55a34916/src/com/google/caja/plugin/uri.js#L502

The current implementation will decode both "+" and "%2B" to SPACE. The replace operation could be performed before decodeURIComponent operation so that "+" is decoded to SPACE and "%2B" is decoded to "+".

Examples

> decodeURIComponent("x%2By+z").replace(/\+/g, ' ')
< "x y z"

> decodeURIComponent("x%2By+z".replace(/\+/g, ' '))
< "x+y z"

The reference https://www.w3.org/Addressing/URL/4_URI_Recommentations.html states

Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded. This method was used to make query URIs easier to pass in systems which did not allow spaces.

This indicates the expectation that "%2B" should be decoded to the plus sign.


The reference https://url.spec.whatwg.org/#concept-urlencoded-parser states

  1. Replace any 0x2B (+) in name and value with 0x20 (SP).
  2. Let nameString and valueString be the result of running UTF-8 decode without BOM on the percent decoding of name and value, respectively.

This indicates the replace operation should be performed before the decodeURIComponent operation.

kpreid commented 5 years ago

Sounds good to me. Can you open a pull request with the change?