thomasw / djproxy

djproxy is a class-based generic view reverse HTTP proxy for Django.
MIT License

reverse_urls help #28

Closed vdveldet closed 5 years ago

vdveldet commented 5 years ago

I'm trying to use ReverseProxy on Django 2.1, but it does not add the Location headers. I later saw that the project is only tested on Django 1.x. Will this project get a release that supports Django 2?

thomasw commented 5 years ago

I'll look into it today. I know of at least one consumer using HttpProxy with 2.0.6, seemingly without issue. However, they are not using the reverse_urls functionality. Can you post your proxy config while I work on reconfiguring the tests to use modern Django versions?

thomasw commented 5 years ago

Just a quick update: After making some minor fixes to the test suite itself, all tests are passing under Django 2.1 and 2.0. That doesn't necessarily mean that the functionality you're trying to use works as intended, but it's a good sign that nothing is fundamentally broken.

thomasw commented 5 years ago

I can confirm that this functionality is working in Django 2.x. If you post your Proxy configuration, I can help you debug it.

vdveldet commented 5 years ago

The proxy works for the initial page, but the requests for CSS and JS are not handled.

urls.py


from django.contrib import admin
from django.urls import path
from django.conf.urls import include, url

from djproxy.urls import generate_routes
from djproxy.views import HttpProxy

class ReverseProxy(HttpProxy):
    base_url = 'http://192.168.0.11/'
    reverse_urls = [
        ('/proxy/', 'http://192.168.0.11/')
    ]

urlpatterns = [
    path('admin/', admin.site.urls),
    url(r'^proxy/(?P<url>.*)$', ReverseProxy.as_view(), name='gproxy')
]

thomasw commented 5 years ago

Can you provide the request/response headers for one of the CSS or JS assets?

thomasw commented 5 years ago

Also, if you're expecting content rewriting, https://github.com/thomasw/djproxy/issues/18#issuecomment-163078001 might be helpful.

vdveldet commented 5 years ago

The site is a basic Netgear router, so nothing fancy. CSS is referenced as in the example below.

The problem is that the ReverseProxy does not add the Location header when the response is sent to the client, so you see the asset requests hitting the console as shown below ...

web_1  | [03/Dec/2018 17:24:39] "GET /proxy/ HTTP/1.1" 200 14657
web_1  | Not Found: /open_auth/logon.css
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/logon.css HTTP/1.1" 404 2135
web_1  | Not Found: /open_auth/icon_close.png
web_1  | Not Found: /open_auth/back-top-left.png
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/icon_close.png HTTP/1.1" 404 2150
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/back-top-left.png HTTP/1.1" 404 2159
web_1  | Not Found: /open_auth/alert-bottom.png
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/alert-bottom.png HTTP/1.1" 404 2156
web_1  | Not Found: /open_auth/help.png
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/help.png HTTP/1.1" 404 2132
web_1  | Not Found: /open_auth/bottom-right.png
web_1  | Not Found: /open_auth/logo.png
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/bottom-right.png HTTP/1.1" 404 2156
web_1  | [03/Dec/2018 17:24:39] "GET /open_auth/logo.png HTTP/1.1" 404 2132
web_1  | Not Found: /favicon.ico
web_1  | [03/Dec/2018 17:24:39] "GET /favicon.ico HTTP/1.1" 404 2111

thomasw commented 5 years ago

reverse_urls doesn't result in headers being added, just the modification of existing headers. If there is no header to begin with, nothing is changed. The use case for this functionality is, primarily, fixing broken redirects which might otherwise result in a client breaking out of its proxy. It doesn't do what you're expecting it to do. See https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypassreverse.
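
To illustrate the idea (this is a standalone sketch, not djproxy's actual implementation), the behavior amounts to rewriting a Location header only when the upstream response already carries one. The function name `rewrite_location` and the sample headers are hypothetical; the `(proxy_prefix, upstream_prefix)` pairs mirror the shape of `reverse_urls` in the config posted above.

```python
def rewrite_location(headers, reverse_urls):
    """Return a copy of headers with an existing Location header rewritten.

    reverse_urls is a list of (proxy_prefix, upstream_prefix) pairs.
    """
    rewritten = dict(headers)
    location = rewritten.get('Location')
    if location is None:
        # No header to begin with, so nothing is changed.
        return rewritten
    for proxy_prefix, upstream_prefix in reverse_urls:
        if location.startswith(upstream_prefix):
            # Swap the upstream prefix for the proxy prefix.
            rewritten['Location'] = proxy_prefix + location[len(upstream_prefix):]
            break
    return rewritten


REVERSE_URLS = [('/proxy/', 'http://192.168.0.11/')]

# An upstream redirect is pulled back inside the proxy.
redirect = rewrite_location({'Location': 'http://192.168.0.11/login'}, REVERSE_URLS)

# A normal 200 response has no Location header and is left alone.
plain = rewrite_location({'Content-Type': 'text/html'}, REVERSE_URLS)
```

Under these assumptions, a plain HTML page that references `/open_auth/logon.css` passes through untouched, which matches the 404s in the log.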

Based on those logs, the content you are proxying is not proxy-friendly (it doesn't use relative asset references in its content). In those cases, you need to either modify your proxy configuration (the url pattern) such that it captures all content-referenced paths and routes them upstream or use the proxy middleware functionality to rewrite the content so that the asset references are prefixed with /proxy/. This is discussed in #18.
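
A rough sketch of the first option, using only the `re` module so it runs without Django: an extra route whose regex captures the asset paths seen in the log. The pattern below and the helper `upstream_path` are hypothetical. In a real `urls.py` the same regex would be a second route to the proxy view, on the assumption that the view appends the captured `url` group to `base_url`.

```python
import re

# Hypothetical extra route: capture anything under open_auth/.
# Django matches URL patterns against the path without its leading slash.
ASSET_PATTERN = re.compile(r'^(?P<url>open_auth/.*)$')


def upstream_path(request_path):
    """Return the path that would be forwarded upstream, or None if unmatched."""
    match = ASSET_PATTERN.match(request_path.lstrip('/'))
    return match.group('url') if match else None


css = upstream_path('/open_auth/logon.css')
favicon = upstream_path('/favicon.ico')
```

Here `css` is captured and `favicon` is not, so `/favicon.ico` would need its own route. The trade-off is that every upstream path prefix has to be claimed in the Django URL space, which is why content rewriting is often the more practical choice.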

vdveldet commented 5 years ago

The problem was indeed that the HTML needed rewriting. For Django 2.1.4 on Python 3, I used the code below as a middleware class. It is readable, but not great Python ;-)

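import re

from django.urls import reverse


class RewriteHTML(object):
    """djproxy middleware that prefixes root-relative src/href URLs with the proxy path."""

    def process_response(self, proxy, request, upstream_response, response):
        """Modify the HttpResponse object before sending it downstream."""
        # Change only HTML responses; a missing Content-Type counts as non-HTML.
        if 'text/html' not in response.get('Content-Type', ''):
            return response

        html = response.content.decode('utf8')
        # Path the proxy view is mounted at, e.g. '/proxy/'.
        proxy_prefix = reverse(request.resolver_match.url_name, kwargs={'url': ''})

        # Prefix root-relative src/href values. The leading '/' is consumed so the
        # result is '/proxy/open_auth/...' and not '/proxy//open_auth/...';
        # protocol-relative URLs ('//host/...') are skipped.
        response.content = re.sub(r'''((?:src|href)=['"]?)/(?!/)''',
                                  lambda m: m.group(1) + proxy_prefix,
                                  html, flags=re.IGNORECASE)

        return response

thomasw commented 5 years ago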

Awesome, I'm glad it worked out.