dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.23k stars 9.95k forks source link

Inconsistent URL decoding of HttpRequest.Path with percent-encoded characters #30655

Open loop-evgeny opened 3 years ago

loop-evgeny commented 3 years ago

Describe the bug

HttpContext.Request.Path seems to decode percent-encoded URLs sometimes, but not all the time. The documentation says nothing about encoding. I expect it to never decode the path, but testing on ASP.NET Core 3.1 I get results like this:

Request URL context.Request.Path Percent decoded?
http://localhost:5000/test%24 /test$ Yes
http://localhost:5000/test%24aa /test$aa Yes
http://localhost:5000/test%25a /test%25a No
http://localhost:5000/test%25aa /test%aa Yes
http://localhost:5000/test%20aa /test%20aa No

To Reproduce

Create an empty ASP.NET Core project and change the Startup class to

    public class Startup
    {
        public void ConfigureServices(IServiceCollection services)
        {
        }

        public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
        {
            app.Use(async (context, next) =>
            {
                Console.WriteLine("Request path = " + context.Request.Path);
                await context.Response.WriteAsync("Request path = " + context.Request.Path);
            });
        }
    }

Program class unchanged from the template:

    public class Program
    {
        public static void Main(string[] args)
        {
            CreateHostBuilder(args).Build().Run();
        }

        public static IHostBuilder CreateHostBuilder(string[] args) => Host.CreateDefaultBuilder(args).ConfigureWebHostDefaults(webBuilder => { webBuilder.UseStartup<Startup>(); });
    }

Then send the above GET requests to it from a web browser or curl.

Further technical details

dotnet --info
.NET SDK (reflecting any global.json):
 Version:   5.0.200-preview.21079.7
 Commit:    580b1c3770

Runtime Environment:
 OS Name:     Windows
 OS Version:  6.1.7601
 OS Platform: Windows
 RID:         win7-x64
 Base Path:   C:\Program Files\dotnet\sdk\5.0.200-preview.21079.7\

Host (useful for support):
  Version: 5.0.3
  Commit:  c636bbdc8a

.NET SDKs installed:
  3.0.100 [C:\Program Files\dotnet\sdk]
  5.0.102 [C:\Program Files\dotnet\sdk]
  5.0.103 [C:\Program Files\dotnet\sdk]
  5.0.200-preview.21079.7 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.All 2.1.25 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.25 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 5.0.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.25 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.12 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 5.0.3 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Tratcher commented 3 years ago

Which server are you using (Kestrel, HttpSys, IIS in-proc, IIS out-of-proc)? The servers handle un-escaping the incoming path.

This part of your example is misleading, PathString.ToString will re-encode anything not valid in an encoded url.

Console.WriteLine("Request path = " + context.Request.Path);

To see what the server generated it should be:

Console.WriteLine("Request path = " + context.Request.Path.Value);

Run your tests again and see if you get the expected values now.

Tratcher commented 3 years ago

FYI the docs for the various PathString APIs give more details about the expected encodings.

loop-evgeny commented 3 years ago

Thanks, you're right that context.Request.Path.Value consistently returns the URL-decoded text. I'll use that!

So it looks to me like PathString.ToString() is doing the wrong thing here. new PathString("/%aa").ToString() returns "/%aa", but I would expect it to return "/%25aa", since the value passed to the constructor contains an unescaped '%' and ToString() is supposed to escape it.

loop-evgeny commented 3 years ago

Further, if I construct a PathString with encoded slashes, there seems to be no way to get those back out:

            var pathStr = PathString.FromUriComponent("/one/two%2Fwith%2Fslashes/three");
            Console.WriteLine(pathStr.Value); // Prints "one/two/with/slashes/three", as expeced
            Console.WriteLine(pathStr.ToString()); // Also prints "one/two/with/slashes/three"!
Tratcher commented 3 years ago

Those are certainly more interesting cases to investigate. I know the servers don't use FromUriComponent, they have their own decoder for various reasons, and they avoid un-escaping %2F because of the ambiguity it would create.

The %aa case I'd have to look at, but it's likely due the the fact that %aa resembles a valid escape sequence (but isn't).

ghost commented 3 years ago

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to have better expectation regarding different types of issues you can read our Triage Process.

loop-evgeny commented 3 years ago

@Tratcher OK, but how am I supposed to get the correct request path from HttpContext.Request ? If the client requested "/one/two%2Fwith%2Fslashes/three" how do I get it, either encoded or decoded? I cannot use Path.Value here, because that doesn't distinguish between "/one/two%2Fwith%2Fslashes/three" and "/one/two/with/slashes/three" and I cannot use Path.ToString() either!

Tratcher commented 3 years ago

HttpContext.Request.Path.Value should be correct when working with an actual server (except maybe IIS?). Which servers have you tried?

loop-evgeny commented 3 years ago

Just the built-in one (Kestrel?) - which is what we use in production, too, where the application runs on Linux, so no IIS or http.sys there.

What even is the correct value of Request.Path.Value in this case? If it's supposed to be decoded then "/one/two/with/slashes/three" seems technically correct - but useless. Useful outputs would be:

Tratcher commented 3 years ago

Kestrel is supposed to decode all valid escape sequences except %2F because of the ambiguity it creates. %2F can only be decoded after splitting the path into segments like you've shown.

loop-evgeny commented 3 years ago

Oh, I see, if I actually make an HTTP request then yes, it does what you say:

Request URL: localhost:5100/one/two%2Fwith%2Fslashes%20and%25percent/three Path.ToString(): "/one/two%2Fwith%2Fslashes%20and%25percent/three" Path.Value = "/one/two%2Fwith%2Fslashes and%percent/three"

So then I'm supposed to do something like context.Request.Path.Value.Split('/').Select(p => p.Replace("%2F", "/")) to get the fully-decoded parts? That should work, but is this "almost decoding" behaviour documented somewhere?

Tratcher commented 3 years ago

I doubt that's well documented. I'm not sure where to document that to make it discoverable. The PathString API docs are likely the best option.

loop-evgeny commented 3 years ago

The documentation for HttpRequest.Path is where I was looking - it would certainly be helpful to have it there.

Anyway, thank you for your help - I'll try using the code snippet above.

schmitch commented 2 years ago

btw. it would be great for a lot of stuff to have the undecoded path (even for your own stuff, like yarp)

ghost commented 2 years ago

Thanks for contacting us. We're moving this issue to the .NET 7 Planning milestone for future evaluation / consideration. Because it's not immediately obvious that this is a bug in our framework, we would like to keep this around to collect more feedback, which can later help us determine the impact of it. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

GREsau commented 2 years ago

It is actually possible to get the raw original path with httpContext.Features.Get<IHttpRequestFeature>()?.RawTarget (RawTarget). However this does also include the full query string, and as the docs say, the target may not always be a path (e.g. for server-wide OPTIONS requests).

Examples from a Kestrel app: Request URL Request.Path.Value Request.Path.ToString() IHttpRequestFeature.RawTarget
http://localhost:5000/a/b%20c /a/b c /a/b%20c /a/b%20c
http://localhost:5000/a%2fb%20c /a%2fb c /a%2fb%20c /a%2fb%20c
http://localhost:5000/a%252fb%20c /a%2fb c /a%2fb%20c /a%252fb%20c
Edit: you'd also need to be careful around dots in the path, as this can lead to path traversal attacks: Request URL Request.Path.Value Request.Path.ToString() IHttpRequestFeature.RawTarget
http://localhost:5000/a/./../... /... /... /a/./../...
http://localhost:5000/a/%2e/.%2e/..%2e /... /... /a/%2e/.%2e/..%2e
Tratcher commented 1 year ago

https://github.com/dotnet/aspnetcore/pull/44386 covers the docs part of this. Removing the docs tag since we'll eventually need feature work to fix this.

mbrenn commented 1 year ago

Is the issue related to the following stackoverflow question? https://stackoverflow.com/questions/75085889/asp-net-onget-url-encoded-parameters-only-some-parameters-are-decoded-by-asp

I have major struggle to have a url-encoded string with '=' as part of a Querystring.

Tratcher commented 1 year ago

@mbrenn this issue is specific to the path and %2F. Please open a new issue with the details for your query issue.