slimphp / Slim

Slim is a PHP micro framework that helps you quickly write simple yet powerful web applications and APIs.
http://slimframework.com
MIT License
11.94k stars 1.95k forks source link

Route arguments get decoded twice #3208

Closed assertio-dani closed 4 months ago

assertio-dani commented 2 years ago

Hi!

The arguments get decoded twice, in:

That's a problem if the argument contains a "decodable" combination of characters.

Let's say I have an article with id 9%65. The route to retrieve it is /articles/{id}.

If I rawurlencode the article id (i.e. /articles/9%2565), since it gets decoded twice, the argument returned is 9e instead of 9%65.

In order to get the proper argument, I'd need to rawurlencode the article id twice (i.e. /articles/9%252565)

Please advise

odan commented 2 years ago

I have tried to reproduce your posted issue. Here are my test results using postman:

GET /articles/123 Response: 123

GET /articles/9%65 Response: 9e Single encoding: 9e (same)

GET /articles/9%2565 Response: 9e Single encoding should be: 9%2565 (not the same)

GET /articles/9%252565 Response: 9%65 Single encoding should be: 9%2565 (not the same)

Minimal code example:

<?php

use Psr\Http\Message\ResponseInterface as Response;
use Psr\Http\Message\ServerRequestInterface as Request;
use Slim\Factory\AppFactory;

require __DIR__ . '/../vendor/autoload.php';

$app = AppFactory::create();

$app->get('/articles/{id}', function (Request $request, Response $response, $args) {
    $response->getBody()->write($args['id']);
    return $response;
});

$app->run();

Generally, I would recommend to use only characters in a range from [a-zA-Z0-9\-] for route arguments. For complex arguments, a query-string might be a better option.

assertio-dani commented 2 years ago

Hi!

Thank you for taking the time to investigate it.

The problem I see with double decoding is that the article ID is 9%65, I mean, that's literally the id stored in the DB.

Then, how can we refer to it in the URL?

I guess the approach should be to url-encode it (e.g. GET /articles/9%2565), but since it's decoded twice, the ID received in the controller is 9e, not 9%65.

For sure it's an edge case. It's definitely not a good idea to use % characters in the id, but... that's how my client articles are stored, and I don't manage that data.

I don't know the intricacies of Slim, but as an outsider opinion, shouldn't it be enough decoding the URL once in \Slim\Routing\RouteResolver::computeRoutingResults ? Doing so, both URL arguments and query params would be decoded correctly, allowing both GET /articles/9%2565 and GET /articles?id=9%2565.

odan commented 2 years ago

The first url-encoding happens before that actual routing. See RouteResolver. This is needed for the FastRouter dispatcher and a correct behavior.

So this: /articles/9%2565 becomes this /articles/9%65

Then, when Slim reads the route arguments, the values will be decoded a second time here:

https://github.com/slimphp/Slim/blob/4.x/Slim/Routing/RoutingResults.php#L99

I don't know the reasons behind it, but maybe this is a bug?

As a quick workaround, you could pass a - instead of % and replace - to % within a middleware.

Example URI path: /articles/9-65. Then use $id = str_replace('-', '%', $args['id']);

l0gicgate commented 2 years ago

We have to do a rawurldecode() before dispatching the URI to the router otherwise it would break when encountering special chars.

If you use getRouteArguments() you can optionally pass false the the method so rawurldecode() isn't called a second time.

I have to say I'm not entirely sure why that's there in the first place, this is probably an artifact from refactoring the routing system from 3.x to 4.x.

I don't know what the correct solution is here. @akrabat thoughts?

akrabat commented 4 months ago

I think that passing false to getRouteArguments() is the correct solution for this specific case as urldecoding the route parameters dates back to 3.0.0 in https://github.com/slimphp/Slim/commit/a52054e28c89e36b24c9c7610a84112e3eb15927#diff-b32b557fdb363965da2c6ac3c0904d63358bb3aa0db9cfab2f2d6dccedfac3f9. The reasoning behind it is lost to time though.

We can't change that in 4.x without breaking BC, so it'd have to be a 5.x thing,