Pylons / pyramid

Pyramid - A Python web framework
https://trypyramid.com/

URLDecodeError on accessing not existing website path #3726

Closed DoINeedThis closed 8 months ago

DoINeedThis commented 1 year ago

Bug Report

Describe the bug When accessing "/%c0%ae/%c0%ae/WEB-INF/web.xml" I expect a 404 error, but instead I get an Unhandled Exception whose underlying cause is "pyramid.exceptions.URLDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 1: invalid start byte".
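
The decoding failure itself can be reproduced without Pyramid. The following is a minimal standard-library sketch, not Pyramid's own code path; the helper name `decode_path` is hypothetical. It shows that `%c0%ae` percent-decodes to the bytes 0xC0 0xAE, an overlong two-byte encoding of `.` that a strict UTF-8 decoder rejects.

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw_path: str) -> str:
    """Percent-decode a URL path, then decode the bytes as strict UTF-8.

    Hypothetical helper for illustration only.
    """
    return unquote_to_bytes(raw_path).decode("utf-8")


if __name__ == "__main__":
    try:
        decode_path("/%c0%ae/%c0%ae/WEB-INF/web.xml")
    except UnicodeDecodeError as exc:
        # 0xC0 can never start a valid UTF-8 sequence, so decoding
        # fails at the byte right after the leading slash.
        print(exc)
```

Pyramid wraps this kind of failure on the request path in `URLDecodeError`, which is why the traceback carries the same codec message.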

To Reproduce Steps to reproduce the behavior:

  1. Run the Docker container from the attached archive, Issue.zip
  2. Open localhost:8080/%c0%ae/%c0%ae/WEB-INF/web.xml in your browser

Expected behavior The web browser should show an 'Unhandled Exception', and a URLDecodeError should be logged in '/var/log/apache2/error.log'.

Screenshots I expected this: [screenshot: expected] But got this: [screenshot: unexpected]

luhn commented 1 year ago

This looks intentional; there's even a unit test validating the behavior.

https://github.com/Pylons/pyramid/blob/d3fc4f97e8d5b8553c2a40da535a4b82d41d8ea1/tests/test_urldispatch.py#L132-L144

I think the idea behind it is that an incorrectly encoded URL cannot be checked against the defined routes, so you can't say it's Not Found.

If you really want a 404, you can add an exception view.

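```python
from pyramid.exceptions import URLDecodeError
from pyramid.view import view_config


# Exception view: render the 404 template whenever the path fails to decode.
@view_config(context=URLDecodeError, renderer="404.html")
def url_decode_error_view(context, request):
    request.response.status_code = 404
    return {}
```

mmerickel commented 8 months ago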

I think Pyramid should install a default exception view for this exception. I just can't think of a downside. That being said, the default exception view would return a 400 Bad Request, not a 404.

mmerickel commented 8 months ago

So the problem with Pyramid doing anything right now is that it can't be done completely. URLDecodeError is currently raised by Pyramid only for issues with the path, and specifically only for environ['PATH_INFO']. There is no standard exception for other cases. For example:

There have been open tickets forever in webob to try to standardize this better, but it's just not there yet. So if Pyramid started handling URLDecodeError I'm worried it'd be a little deceptive. It'd properly handle a bad path, but a later failure in request.GET would not raise a URLDecodeError. Until you actually try to parse the query string, you do not know what encoding it is, as there's no standard. And to my knowledge there's no requirement that every part of the URL is UTF-8. In webob the encoding is defined by Request.url_encoding, which of course defaults to utf-8, but you can change it if you define a custom IRequestFactory.

In my apps I'm comfortable adding an exception view for UnicodeDecodeError to capture all of the issues, but this isn't a generally applicable solution at this point. To do it correctly you really want to know which part of the URL is invalid, whether it's the script_name, path_info, or query_string. A general UnicodeDecodeError would inadvertently capture issues decoding the body as well, or headers - all of which likely should be handled differently by an app.
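
As a rough illustration of telling the URL parts apart, here is a standard-library sketch that reports which WSGI environ key fails to decode. It is not Pyramid or webob API: the function name `find_invalid_url_parts` is hypothetical, and it assumes the usual WSGI convention (PEP 3333) that SCRIPT_NAME and PATH_INFO arrive percent-decoded as latin-1 "bytes-as-text" while QUERY_STRING is still percent-encoded.

```python
from urllib.parse import parse_qsl


def find_invalid_url_parts(environ: dict, encoding: str = "utf-8") -> list:
    """Return the WSGI environ keys whose URL component fails to decode.

    Hypothetical helper; assumes a PEP 3333 style environ.
    """
    invalid = []
    # SCRIPT_NAME and PATH_INFO: recover the raw bytes via latin-1,
    # then try the real URL encoding.
    for key in ("SCRIPT_NAME", "PATH_INFO"):
        try:
            environ.get(key, "").encode("latin-1").decode(encoding)
        except UnicodeDecodeError:
            invalid.append(key)
    # QUERY_STRING: still percent-encoded, so let parse_qsl decode it
    # strictly instead of silently replacing bad bytes.
    try:
        parse_qsl(
            environ.get("QUERY_STRING", ""),
            keep_blank_values=True,
            encoding=encoding,
            errors="strict",
        )
    except UnicodeDecodeError:
        invalid.append("QUERY_STRING")
    return invalid
```

An app could call a check like this from its own exception view or tween to decide how to respond per component. Body and header decoding errors are deliberately out of scope here, since they likely need different handling.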

mmerickel commented 8 months ago

I'm closing this as a duplicate of https://github.com/Pylons/pyramid/issues/1374.