elm / url

Build and parse URLs. Useful for HTTP and "routing" in single-page apps (SPAs)
https://package.elm-lang.org/packages/elm/url/latest/
BSD 3-Clause "New" or "Revised" License

Should "+" in query parameters be parsed as space? #32

Closed (EvenAR closed this issue 3 years ago)

EvenAR commented 5 years ago

Our front page has a standard HTML search form. If the user types a search string, e.g. "hello world", and submits the form, the user is taken to /search?query=hello+world, where the Elm application is located. When parsing the query parameter I expect the output to be "hello world"; however, the actual output is "hello+world". Is this intentional, or is it something that should be fixed? I haven't found specific documentation on this, but it seems to be common practice to treat + in query parameters as a space.


Example (21 Mar 2021):

let
    queryParser = 
        Url.Parser.s "search" <?> Url.Parser.Query.string "q"
in
Url.fromString "https://www.example.com/search?q=how+much+is+1%2B1%3F"
    |> Maybe.andThen (Url.Parser.parse queryParser)

-- Result:    Just (Just "how+much+is+1+1?")
-- Expected:  Just (Just "how much is 1+1?")

rlefevre commented 5 years ago

In the query part, it exactly means a space: https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1

malaire commented 4 years ago

> In the query part, it exactly means a space: https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1

That refers to forms, not to URLs. The parsing section of the URL standard does not mention that + should be handled specially in the query.

And if you look at URL equivalence, it's clear that + and space are not equivalent and must be handled as different characters.

So this issue should be closed.
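
A minimal sketch of that distinction, assuming elm/url's `Url.percentDecode` (which `Url.Parser.Query` relies on for decoding). The module and value names are made up for illustration:

```elm
module PlusDecoding exposing (escapedPlus, escapedSpace, rawPlus)

import Url


-- A raw "+" is not a percent-escape, so decoding leaves it untouched.
rawPlus : Maybe String
rawPlus =
    Url.percentDecode "hello+world"


-- "%20" is the percent-escape for a space.
escapedSpace : Maybe String
escapedSpace =
    Url.percentDecode "hello%20world"


-- "%2B" is the percent-escape for a literal plus sign.
escapedPlus : Maybe String
escapedPlus =
    Url.percentDecode "1%2B1"
```

Here `rawPlus` is `Just "hello+world"`, `escapedSpace` is `Just "hello world"`, and `escapedPlus` is `Just "1+1"`.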

emarthinsen commented 4 years ago

Standard or no standard, if you create a vanilla HTML form and use it to submit some text (via GET) with a space in it, then you'll get a + in your URL. If you go to Google and enter a search string that has a space, you'll also see the +. We can debate whether the standard allows a plus or whether it should be handled differently, but the default behavior of the most popular browsers is to encode a space as a plus in a querystring. It would be nice if the Url library took that into account and parsed querystrings accordingly. I think the main reason it does not is that the underlying JS function that powers this doesn't decode a + into a space, but this feels like a good opportunity to add this feature and create something more useful and consistent.

malaire commented 4 years ago

When submitting a form with the built-in <form action="...">, a space is encoded as +.

But that is just one way to use query strings, not the only way. Query strings can also be used for other things which have nothing to do with submitting forms, and in those cases the URL standard says that a space is not equivalent to +.

Also, forms can be submitted without using <form action="...">, and then a space is likewise not equivalent to +.
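
As an illustration of a query string built without a form, here is a small sketch using elm/url's own `Url.Builder`, which percent-encodes a space as %20 rather than +. The module name and the example path are assumptions for illustration:

```elm
module BuildQuery exposing (searchUrl)

import Url.Builder


-- Hypothetical example URL; only Url.Builder itself comes from elm/url.
-- Url.Builder.string percent-encodes the value, so the space becomes %20.
searchUrl : String
searchUrl =
    Url.Builder.absolute [ "search" ] [ Url.Builder.string "query" "hello world" ]
```

`searchUrl` evaluates to "/search?query=hello%20world", so a parser that turned every + into a space would not be needed for URLs produced this way.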

malaire commented 4 years ago

> We can debate whether the standard allows a plus or whether it should be handled differently, but the default behavior of the most popular browsers is to encode a space as a plus in a querystring.

COMPLETELY WRONG. There isn't a single browser which encodes a space as a plus in the query string when not submitting a form. Also, the standard does not allow that; it specifically forbids it when not submitting a form.

CSDUMMI commented 3 years ago

How is it problematic or breaking? I mean, it is not very hard to write a program that implements compatibility with this behavior.

toForm : String -> String
toForm query = String.replace "+" " " query

EvenAR commented 3 years ago

> I mean, it is not very hard to write a program that implements compatibility with this behavior.
>
> toForm : String -> String
> toForm query = String.replace "+" " " query

Note that this should be done before Url.Parser.parse, while the query string hasn't been "URL-decoded" yet. Otherwise you will also replace "+" signs that are supposed to be there.

Url.Parser.parse queryParser { url | query = Maybe.map (String.replace "+" "%20") url.query }

CSDUMMI commented 3 years ago

Nonetheless, this does not seem to be an issue with elm/url.

CSDUMMI commented 3 years ago

@EvenAR is this still an issue? Do you have the problem still?

EvenAR commented 3 years ago

I think this issue can be closed. If this way of encoding spaces is specific to Html forms and not part of standard URL encoding, I agree this is not an issue with elm/url. It's not too hard to handle it explicitly when needed.
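
A minimal sketch of handling it explicitly, assuming the "search" route and "q" parameter from the example at the top of the thread. The module name and the helper names `plusAsSpace` and `parseSearch` are hypothetical, not part of elm/url:

```elm
module FormQuery exposing (parseSearch, plusAsSpace)

import Url exposing (Url)
import Url.Parser exposing ((<?>), Parser)
import Url.Parser.Query as Query


-- Hypothetical helper: rewrite raw "+" to "%20" while the query is still
-- percent-encoded, so an escaped plus ("%2B") is left alone.
plusAsSpace : Url -> Url
plusAsSpace url =
    { url | query = Maybe.map (String.replace "+" "%20") url.query }


-- Same parser as in the original example: /search?q=...
searchParser : Parser (Maybe String -> a) a
searchParser =
    Url.Parser.s "search" <?> Query.string "q"


-- Parse a URL string, treating "+" in the query as a form-encoded space.
parseSearch : String -> Maybe (Maybe String)
parseSearch raw =
    Url.fromString raw
        |> Maybe.map plusAsSpace
        |> Maybe.andThen (Url.Parser.parse searchParser)
```

With this, `parseSearch "https://www.example.com/search?q=how+much+is+1%2B1%3F"` gives `Just (Just "how much is 1+1?")`. The rewrite only makes sense for URLs known to come from a form submission; for other URLs a raw + would be changed incorrectly.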

maxime-didier commented 2 years ago

This issue should be reopened.

The browser is not the only software that implements the URL spec rather than the raw URI RFC. In my case, the URLs that point to my Elm app are generated by a Java web server with the standard javax.ws.rs.core.UriBuilder class.

I also cannot correctly work around this on the Elm side since: