ThreeMammals / Ocelot

.NET API Gateway
https://www.nuget.org/packages/Ocelot
MIT License
8.24k stars 1.62k forks source link

`Regex` exception when route template placeholder contains invalid character ')' #2116

Open wenneberg6 opened 1 week ago

wenneberg6 commented 1 week ago

Expected Behavior

Given a route parameter including a ) character, then a 4** status code is expected.

Earlier versions before 19.0.4 returned status code 404.

Actual Behavior

Exception which causes a response with status code 500.

StackTrace: Exception caught in global error handler, exception message: Invalid pattern '\b*path=debug)\b' at offset 14. Too many )'s., exception stack: at System.Text.RegularExpressions.RegexParser.ScanRegex() at System.Text.RegularExpressions.RegexParser.Parse(String pattern, RegexOptions options, CultureInfo culture) at System.Text.RegularExpressions.Regex..ctor(String pattern, CultureInfo culture) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.RemoveQueryStringParametersThatHaveBeenUsedInTemplate(DownstreamRequest downstreamRequest, List1 templatePlaceholderNameAndValues) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.Invoke(HttpContext httpContext) at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext) at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext) at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext) at Ocelot.RateLimit.Middleware.ClientRateLimitMiddleware.Invoke(HttpContext httpContext) at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext) at Ocelot.Multiplexer.MultiplexingMiddleware.Fire(HttpContext httpContext, RequestDelegate next) at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext) RequestId: 0HN4SP4CA0UMM:00000001, exception: System.Text.RegularExpressions.RegexParseException: Invalid pattern '\bpath=debug)\b' at offset 14. Too many )'s. at System.Text.RegularExpressions.RegexParser.ScanRegex() at System.Text.RegularExpressions.RegexParser.Parse(String pattern, RegexOptions options, CultureInfo culture) at System.Text.RegularExpressions.Regex..ctor(String pattern, CultureInfo culture) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.RemoveQueryStringParametersThatHaveBeenUsedInTemplate(DownstreamRequest downstreamRequest, List`1 templatePlaceholderNameAndValues) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.Invoke(HttpContext httpContext) at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext) at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext) at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext) at Ocelot.RateLimit.Middleware.ClientRateLimitMiddleware.Invoke(HttpContext httpContext) at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext) at Ocelot.Multiplexer.MultiplexingMiddleware.Fire(HttpContext httpContext, RequestDelegate next) at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext) System.Text.RegularExpressions.RegexParseException: Invalid pattern '\bpath=debug)\b' at offset 14. Too many )'s. at System.Text.RegularExpressions.RegexParser.ScanRegex() at System.Text.RegularExpressions.RegexParser.Parse(String pattern, RegexOptions options, CultureInfo culture) at System.Text.RegularExpressions.Regex..ctor(String pattern, CultureInfo culture) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.RemoveQueryStringParametersThatHaveBeenUsedInTemplate(DownstreamRequest downstreamRequest, List1 templatePlaceholderNameAndValues) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.Invoke(HttpContext httpContext) at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext) at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext) at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext) at Ocelot.RateLimit.Middleware.ClientRateLimitMiddleware.Invoke(HttpContext httpContext) at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext) at Ocelot.Multiplexer.MultiplexingMiddleware.Fire(HttpContext httpContext, RequestDelegate next) at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext)

Steps to Reproduce the Problem

  1. Simple Route configuration { "DownstreamPathTemplate": "/api/{path}", "DownstreamScheme": "http", "DownstreamHostAndPorts": [{...}], "UpstreamPathTemplate": "/api/{path}", "UpstreamHttpMethod": [ "Get" ] }
  2. GET /api/debug)
  3. Or with urlencoded parameter: GET /api/debug%29

Specifications

raman-m commented 1 week ago

Hello Jimmi!

"UpstreamPathTemplate": "/api/{*path}", DownstreamPathTemplate": "/api/{*path}",

The templates are invalid due to the inclusion of the * character in the placeholder name. What is the reason for adding an asterisk before the placeholder name?

GET /api/debug)

Indeed, using an invalid character in the URL string is not advisable. It's hard to see its purpose. Perhaps you're a QA engineer who enjoys testing software? 😉

Or with url encoded parameter: GET /api/debug%29

This URL is valid; however, due to special regex characters, the System.Text.RegularExpressions.RegexParser.ScanRegex() identifies it during the route matching phase, and the internally constructed regex string likely has unbalanced parentheses. It appears that any special RegEx characters in a URL could potentially cause this exception.

exception message: Invalid pattern '\b*path=debug)\b' at offset 14. Too many )'s.

Firstly, please remove the * from the placeholder name and test the URL again. Secondly, the rationale behind the upstream /api/{path} to downstream /api/{path} conversion is unclear. Could you explain why this is necessary?

I'm requesting a test of this route definition with a single Catch All route:

{
  "DownstreamPathTemplate": "/{path}",
  "UpstreamPathTemplate": "/{path}"
}

This route template should handle all paths, whether they include special characters encoded or not. Please provide feedback.

raman-m commented 1 week ago

Expected Behavior

Given a route parameter including a ) character, then a 4** status code is expected.

Earlier versions before 19.0.4 returned status code 404.

Actual Behavior

Exception which causes a response with status code 500.

I'm curious... How does this affect your client? Both statuses indicate failure. What is the reaction of your downstream service to this URL /api/debug%29, and what status does it return?

wenneberg6 commented 1 week ago

Thanks for replying. Nope, not QA :) But since our latest update from v.18.0.0 to v.23.2.2 we noticed these Regex exceptions in our logs. Ocelot has changed behavior, it used to accept routes with unknown characters. I tested the latest versions and found that in v.19.0.4 Ocelot started to throw an exception.

I have removed the * and tested again, no difference, an exception is still thrown.

The Downstream service is responsible for returning content and providing the correct status code when nothing is found(404). In our case, an invalid route should return a 404, which is not considered a failed response.

Ocelot prevents this from happening, as the Regex exception causes a 500 server error to be returned upstream. Adding a parenthesis to the url is not intended by our users, or bots, but it is a public endpoint, stuff happens.

This problem can be "handled" using other techniques. But it is a change in behavior, that i was not aware of reading the change logs, and therefor must consider a bug.

raman-m commented 6 days ago

@wenneberg6 commented on July 9

But since our latest update from v.18.0.0 to v.23.2.2 we noticed these Regex exceptions in our logs. Ocelot has changed behavior, it used to accept routes with unknown characters. I tested the latest versions and found that in v.19.0.4 Ocelot started to throw an exception.

Upon reviewing the code, I found that the Regex causing the exception was introduced in version 8.0.1, as seen at line 82 in the commit https://github.com/ThreeMammals/Ocelot/commit/8f4ae03290ce4621821396c0baf4d9c227fc8c38. Migrating from version 18.0.0 to 23.2.2 carries significant risks due to the extensive changes. Although we strive for backward compatibility, behavior may vary between major versions.

I tested the latest versions and found that in v.19.0.4 Ocelot started to throw an exception.

How did you determine there was an issue with version 19.0.4? As mentioned, the RegEx was introduced in version 8.0.1. How were you able to test version 19.0.4 if you had migrated from 18.x to 23.x?

The Downstream service is responsible for returning content and providing the correct status code when nothing is found(404). In our case, an invalid route should return a 404, which is not considered a failed response. Ocelot prevents this from happening, as the Regex exception causes a 500 server error to be returned upstream.

The downstream service returned a 404 status, indicating that Ocelot successfully forwarded the request without encountering internal 500 errors. We will address the issue.

This problem can be "handled" using other techniques. But it is a change in behavior, that i was not aware of reading the change logs, and therefor must consider a bug.

We are taken aback, yet we accept it. This is the bug.

wenneberg6 commented 6 days ago

Thank you. Yes updating was a risk, but reading through the release notes, we found no issues for our configuration.

How were you able to test version 19.0.4 if you had migrated from 18.x to 23.x?

I rolled back from 23.3.2 to 1.8.0. I confirmed that in v.18 a status code 404 was returned for my use case. I updated the Ocelot Nuget package one version at a time, upon reaching v.19.0.4 the status code changed to 500.

Perhaps my testing is flawed, but i found it to fail from v.19.0.4.

raman-m commented 6 days ago

Jimmi, I apologize for the confusion, but there is no 19.0.4 release. Our releases include 19.0.3 and the subsequent 20.0.0, which you can view on page 2. However, there is a 19.0.4 package available. It was issued as a technical release. Perhaps you are referring to version 20.0.0, correct? And release 20.0.0 was pretty big. Changes: 19.0.3...20.0.0 Possible root cause: https://github.com/ThreeMammals/Ocelot/commit/ec85b132a06b239faa499777820a9cc0143bb1d0https://github.com/ThreeMammals/Ocelot/commit/ec85b132a06b239faa499777820a9cc0143bb1d0#diff-f5a2c3a941a7f201a4cead9eb4f489229f0740cf4e1166d0b8ff1670e2ce30a5R100-R117 More details in bug fixing PR #1335