Kong / insomnia

The open-source, cross-platform API client for GraphQL, REST, WebSockets, SSE and gRPC. With Cloud, Local and Git storage.
https://insomnia.rest
Apache License 2.0

plus sign (+) in query string incorrectly escaped to %2B #5113

Open superhawk610 opened 2 years ago

superhawk610 commented 2 years ago

Expected Behavior

Sending a GET request to http://localhost:3000/some/path?foo=bar+baz should send the following HTTP request to the server running at port 3000:

GET /some/path?foo=bar+baz HTTP/1.1

Specifically, the query string ?foo=bar+baz should be passed through unchanged, as + belongs to the set of URI reserved characters as defined in RFC 3986, Section 2.2 (emphasis mine):

2.2. Reserved Characters

URIs include components and subcomponents that are delimited by characters in the "reserved" set. These characters are called "reserved" because they may (or may not) be defined as delimiters by the generic syntax, by each scheme-specific syntax, or by the implementation-specific syntax of a URI's dereferencing algorithm. If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

 reserved    = gen-delims / sub-delims

 gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

 sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
             / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.

Here's the relevant portion:

Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications.
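To make "protected from normalization" concrete, here is a minimal Python sketch of percent-encoding normalization in the spirit of RFC 3986, Section 6.2.2. The function name `normalize_percent_encoding` is made up for illustration and is not part of Insomnia or any library. It decodes only escapes for unreserved characters and leaves reserved ones such as `%2B` alone:

```python
import re
import string

# Unreserved characters per RFC 3986, Section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize_percent_encoding(component: str) -> str:
    """Decode escapes for unreserved characters; keep all other escapes.

    Hypothetical helper for illustration only.
    """
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char  # safe to decode, the URI stays equivalent
        return "%" + match.group(1).upper()  # keep the escape, upper-case hex

    return re.sub(r"%([0-9A-Fa-f]{2})", replace, component)


# %62 is "b" (unreserved), so it is decoded; the literal + is left alone.
print(normalize_percent_encoding("foo=%62ar+baz"))  # foo=bar+baz
# %2b is "+" (reserved), so it stays percent-encoded.
print(normalize_percent_encoding("foo=bar%2bbaz"))  # foo=bar%2Bbaz
```

In this sketch a literal `+` and `%2B` never collapse into one another, which is the distinction the quoted passage relies on.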

Actual Behavior

This request is sent to the server running on port 3000:

GET /some/path?foo=bar%2Bbaz HTTP/1.1

%2B is the correct percent-encoding of a literal plus sign +, but a bare + in a query string is a reserved character that query-string parsers conventionally decode as a space, and thus it should not be percent-encoded by the client. Here are examples from the URI parsing libraries of a few popular languages (Elixir, Node, Python) that illustrate why this is problematic:

Erlang/OTP 25 [erts-13.0.3] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Interactive Elixir (1.13.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> URI.decode_query("foo=bar+baz")
%{"foo" => "bar baz"}
iex(2)> URI.decode_query("foo=bar%2Bbaz")
%{"foo" => "bar+baz"}

Welcome to Node.js v18.8.0.
Type ".help" for more information.
> const qs = require('node:querystring')
> qs.decode('foo=bar+baz')
{ foo: 'bar baz' }
> qs.decode('foo=bar%2Bbaz')
{ foo: 'bar+baz' }

Python 3.9.12 (main, Mar 26 2022, 15:52:10)
[Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import parse_qs
>>> parse_qs('foo=bar+baz')
{'foo': ['bar baz']}
>>> parse_qs('foo=bar%2Bbaz')
{'foo': ['bar+baz']}
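The sessions above show the decoding side. As a rough sketch of the sending side, the Python below contrasts passing the query string through untouched with re-escaping it. The `quote(..., safe="=&")` call is only a stand-in that reproduces the reported symptom; it is an assumption and not Insomnia's actual code path:

```python
from urllib.parse import parse_qs, quote, urlsplit

url = "http://localhost:3000/some/path?foo=bar+baz"
query = urlsplit(url).query  # 'foo=bar+baz'

# Expected: the client forwards the query string exactly as typed.
passthrough = query

# Reported symptom, imitated here by escaping everything except "=" and "&".
# This is an assumed stand-in, not the client's real implementation.
reencoded = quote(query, safe="=&")

print(passthrough, parse_qs(passthrough))  # foo=bar+baz {'foo': ['bar baz']}
print(reencoded, parse_qs(reencoded))      # foo=bar%2Bbaz {'foo': ['bar+baz']}
```

The server ends up with a different value for `foo` depending on which form reaches it.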

Reproduction Steps

No response
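The report gives no reproduction steps. One possible way to observe the raw request line is a throwaway local server like the Python sketch below. The helper name `observe_request_target` is hypothetical, and the sketch binds to any free port rather than port 3000. Python's `http.client` stands in for the API client here because it sends the request target as given:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def observe_request_target(target: str) -> str:
    """Send GET `target` to a one-shot local server and return the raw
    request target that arrived on the wire (hypothetical helper)."""
    seen = []

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            seen.append(self.path)  # raw, undecoded path plus query string
            self.send_response(204)
            self.end_headers()

        def log_message(self, *args):
            pass  # keep the console quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    worker = threading.Thread(target=server.handle_request)  # serve one request
    worker.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", target)
        conn.getresponse().read()
        conn.close()
    finally:
        worker.join()
        server.server_close()
    return seen[0]


print(observe_request_target("/some/path?foo=bar+baz"))
```

To check another client, one could instead run a similar handler with `serve_forever()` on a fixed port, send the request from that client, and compare what the handler records.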

Additional Information

I'm opening a new issue to follow up on #1712. That issue referenced a specific URL that showed why this percent-encoding was problematic: the user expected one or more search results but got none because spaces in the query were replaced with +.

Insomnia Version

2022.5.1

What operating system are you using?

macOS

Operating System Version

macOS Monterey 12.3.1

Installation method

download from insomnia.rest

Last Known Working Insomnia version

n/a

superhawk610 commented 2 years ago

cc @wongstein

myplacedk commented 1 year ago

This bug seems very easy to verify.

A '+' in the query string must not be encoded; it is already an encoded space.

Go to any browser and try, for example, https://www.google.com/search?q=hello+world, and it will search for "hello world". Use the same URL in Insomnia, and it will search for "hello+world". This is a bug; it should search for "hello world".

You can also just go to Google and search for "hello world", and notice that the url contains "q=hello+world".
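The same round trip can be sketched with Python's standard library, which follows the form-style convention by default. This only illustrates the convention and says nothing about how Google or Insomnia build URLs internally:

```python
from urllib.parse import parse_qs, quote, urlencode

# Form-style encoding: a space becomes "+", a literal plus becomes "%2B".
print(urlencode({"q": "hello world"}))  # q=hello+world
print(urlencode({"q": "hello+world"}))  # q=hello%2Bworld

# Percent-style encoding of the space is also valid and decodes identically.
print(urlencode({"q": "hello world"}, quote_via=quote))  # q=hello%20world

print(parse_qs("q=hello+world"))    # {'q': ['hello world']}
print(parse_qs("q=hello%20world"))  # {'q': ['hello world']}
print(parse_qs("q=hello%2Bworld"))  # {'q': ['hello+world']}
```

Escaping an already-encoded `+` therefore turns a search for "hello world" into a search for "hello+world".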