I think a part of a previous PR should be rolled back.
There are quite a few sources of information and discussion about the plus sign, which is the main difference between the urlencode and rawurlencode functions, along with their counterparts:
However, searching for "a b" currently results in the page heading showing "Search results for a+b". So I rethought the whole encoding issue and split the problem in two:
Interpreting URLs
Generating URLs
At first I thought about following the standard in both cases, but now I don't think that makes much sense.
Approach when interpreting URLs: Even though I don't like it, Q2A needs to process these two URLs as the same valid URL: https://site.com/user/one+two and https://site.com/user/one%20two. The main issue here is forms (as shown in the links above). In short, forms turn spaces into plus signs by default. So even if it does not follow the standard, we need to accept both spellings.
Furthermore, the values in the $_GET superglobal have already been processed by urldecode.
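To illustrate the decoding side, here is a minimal sketch that uses only PHP's built-in functions. The two strings are the user name segments from the example URLs above; the variable names are made up for illustration and are not taken from the Q2A code.

```php
<?php
// The two spellings of the same user name from the example URLs.
$plusForm = 'one+two';
$percentForm = 'one%20two';

// urldecode treats both "+" and "%20" as a space, so both forms match.
var_dump(urldecode($plusForm) === urldecode($percentForm)); // bool(true)

// rawurldecode leaves "+" untouched, so the two forms end up different.
var_dump(rawurldecode($plusForm) === rawurldecode($percentForm)); // bool(false)
```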
Approach when generating URLs: I don't think there is any need to deviate from the standard when generating URLs. For example, if there is a space in a query string, such as in a user profile, it should turn into a %20 rather than a +.
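As a rough illustration of the generating side, the snippet below builds the same profile URL with both built-in encoders. The `$handle` value and the site.com prefix are only the example values used in this description, not real Q2A data.

```php
<?php
// Hypothetical user handle containing a space.
$handle = 'one two';

// rawurlencode follows the standard and emits %20 for the space.
$standardUrl = 'https://site.com/user/' . rawurlencode($handle);

// urlencode emits the form-style "+" instead.
$formStyleUrl = 'https://site.com/user/' . urlencode($handle);

echo $standardUrl, "\n";  // https://site.com/user/one%20two
echo $formStyleUrl, "\n"; // https://site.com/user/one+two
```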
Turning this into concrete changes, I'd say that when creating URLs we should keep the rawurlencode function calls. When interpreting URLs, which happens in the index.php file, we should change the current rawurldecode calls to urldecode.
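A small sketch of how the two halves fit together, assuming nothing about the actual Q2A code: the two helper functions below are hypothetical names invented for this example, wrapping the built-in calls proposed above.

```php
<?php
// Hypothetical helper for generating a URL segment (not a Q2A function).
function example_encode_segment(string $value): string
{
    return rawurlencode($value); // space becomes %20, a literal "+" becomes %2B
}

// Hypothetical helper for interpreting an incoming segment (not a Q2A function).
function example_decode_segment(string $segment): string
{
    return urldecode($segment); // accepts both "+" and %20 as a space
}

// Values generated with rawurlencode survive the round trip,
// including one that contains a literal plus sign.
foreach (['one two', 'a+b'] as $value) {
    $encoded = example_encode_segment($value);
    var_dump(example_decode_segment($encoded) === $value); // bool(true) for both
}
```

Because rawurlencode escapes a literal plus as %2B, URLs generated this way stay unambiguous even though the decoding side also accepts the form-style plus.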