graygnuorg / pound

Light-weight reverse proxy, load balancer and HTTPS front-end for Web servers.
GNU General Public License v3.0

Different behaviour in URL parsing between versions of libpcre? #8

Closed abaldoni closed 1 year ago

abaldoni commented 1 year ago

Hi, we are experiencing a difference in behaviour between Pound 2.8 and Pound 4 for one backend, which we use to wrap an internal HTTP server in HTTPS. The URL is written as: URL ".*/AlboOnline.*"

With Pound 2.8, if we go to https://our.site/AlboOnline (no slash at the end), the internal backend server generates a "302 Redirect" to https://our.site/AlboOnline/ then a "302 Redirect" to https://our.site/AlboOnline/ricercaAlbo (no slash at the end) which is the desired page.

With Pound 4, if we go to https://our.site/AlboOnline (no slash at the end), the internal backend server generates a "302 Redirect" to http://our.site/AlboOnline which does not work. Instead, if we go to https://our.site/AlboOnline/ (with slash at the end), the internal backend server generates a "302 Redirect" to http://our.site/AlboOnline/ricercaAlbo which, of course, does not work.

Could this behaviour be related to the different versions of libpcre?

nagyrobi commented 1 year ago

Why do you do the redirects on the backend? Wouldn't it be more efficient to do all these within pound?

graygnuorg commented 1 year ago

For what it's worth, URL "./AlboOnline." will not match "https://our.site/AlboOnline", no matter what regex flavor you use (unless the regex library is buggy, of course). That being said, I'd like to note that in the configuration file you supplied there is no such statement. Instead, there are two occurrences of URL ".*/AlboOnline.*", which will match the above URL.
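To illustrate the difference between the two patterns, here is a minimal sketch that uses Python's `re` module as a stand-in. That is an assumption: Pound uses its own POSIX or PCRE matching, but patterns this simple should behave the same way in any of them. The list of test strings is also just illustrative.

```python
import re

# The pattern as it appeared with the asterisks lost, and as intended.
without_stars = r"./AlboOnline."
with_stars = r".*/AlboOnline.*"

# Illustrative inputs: the full URL and two request paths.
targets = ["https://our.site/AlboOnline", "/AlboOnline", "/AlboOnline/"]


def matches(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return re.search(pattern, text) is not None


results = {
    (p, t): matches(p, t)
    for p in (without_stars, with_stars)
    for t in targets
}

for (p, t), ok in results.items():
    print(f"{p!r:22} {t!r:32} {ok}")
```

Without the asterisks, each dot requires exactly one character, so the pattern needs a character both before `/AlboOnline` and after it; none of the inputs above provides both.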

abaldoni commented 1 year ago

@graygnuorg

Sorry, I didn't enclose the asterisk in the "code" tag. The statement is the one you copied.

abaldoni commented 1 year ago

@nagyrobi

> Why do you do the redirects on the backend? Wouldn't it be more efficient to do all these within pound?

Good point: the backend is a black box and cannot be modified...

graygnuorg commented 1 year ago

Let me summarize, just to make sure I understand the picture and we're on the same page:

(With pound 2.8)

  1. You make a request to https://our.site/AlboOnline.
  2. The request goes to pound.
  3. Pound catches it using the service with URL ".*/AlboOnline.*" and forwards it to the backend defined in that service.
  4. The backend redirects it to https://our.site/AlboOnline/.
  5. That request goes to pound again.
  6. There it matches another URL ".*/AlboOnline.*", in another service, defined within a ListenHTTPS and gets forwarded to the backend defined in that service.
  7. The backend redirects it to https://our.site/AlboOnline/ricercaAlbo.
  8. It goes to pound again and is forwarded to the same backend as in point 6, which finally serves it.

(With pound 4.6.90)

  1. You make a request to https://our.site/AlboOnline.
  2. The request goes to pound.
  3. Pound catches it using the service with URL ".*/AlboOnline.*" and forwards it to the backend defined in that service.
  4. The backend redirects it to http://our.site/AlboOnline. (the rest seems to be irrelevant)

Do I get it right? Then the first question is: is the service (and, correspondingly, the backend) chosen in point 3 the same as the one selected in point 3 with pound 2.8? If so, what makes the backend redirect the request to another location?

abaldoni commented 1 year ago

@graygnuorg It's almost right. The service which catches URL ".*/AlboOnline.*" is always the same and it is in a ListenHTTPS section. It looks like the backend server behaves differently between Pound 2.8 and Pound 4. Please note that Pound 2.8 is running on CentOS 6 while Pound 4 is running on RH 9. The backend is a Tomcat server. I'll try to capture the packets at the backend server to see what it is receiving from the two different versions of Pound.

graygnuorg commented 1 year ago

There is a subtle difference in the requests forwarded to the backend between pound 2.8 and 4.6.x. Pound 2.8 added only the X-Forwarded-For header to the request forwarded to the backend. In contrast, pound 4.6 adds X-Forwarded-For, X-Forwarded-Proto and X-Forwarded-Port. It could be that your backend analyzes one of these (X-Forwarded-Proto seems to be the main suspect) and behaves differently depending on its value.
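The difference can be pictured with a small Python sketch. It only models the set of headers described in this comment; the function name, the version labels and the sample values are hypothetical and are not taken from pound's source.

```python
def forwarded_headers(pound_version, client_ip, scheme, port):
    """Model the X-Forwarded-* headers added to the forwarded request.

    pound_version is a hypothetical label: "2.8" adds only
    X-Forwarded-For, anything else is treated like 4.6.
    """
    headers = {"X-Forwarded-For": client_ip}
    if pound_version != "2.8":
        headers["X-Forwarded-Proto"] = scheme
        headers["X-Forwarded-Port"] = str(port)
    return headers


old = forwarded_headers("2.8", "203.0.113.5", "https", 443)
new = forwarded_headers("4.6", "203.0.113.5", "https", 443)

# Headers the backend sees only behind the newer version.
print(sorted(set(new) - set(old)))
```

Comparing the two dictionaries this way mirrors what a packet capture on the backend side would show: the same request, plus two extra headers.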

graygnuorg commented 1 year ago

(See https://github.com/graygnuorg/pound#request-modification and the manpage for details)

abaldoni commented 1 year ago

@graygnuorg You're right! I'm browsing through the packets as I write and it's as you say. I'll look up the documentation and figure out how to handle it. Thanks!

graygnuorg commented 1 year ago

You can use HeaderOption to disable X-Forwarded- headers altogether, or DeleteHeader, to remove only selected headers.

As an unrelated note, I should say that the backend's behavior is pretty strange: while redirecting to https:// in absence of X-Forwarded-Proto may be regarded as consistent, redirecting to plain http:// in presence of X-Forwarded-Proto: https seems completely illogical.

abaldoni commented 1 year ago

@graygnuorg We found out that the Tomcat hosted in the backend is not properly configured and is not honoring the X-Forwarded-Proto header. We are evaluating the change at either the Pound level or the Tomcat level but this definitely clarifies the issue. Many thanks for your help!
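For readers wondering what "honoring X-Forwarded-Proto" would look like, here is a minimal Python sketch of the idea. It is not Tomcat's implementation; the function and parameter names are hypothetical, and it assumes the header name arrives with exactly this capitalization.

```python
def redirect_location(request_headers, host, path, connection_scheme="http"):
    """Build an absolute Location for a redirect (hypothetical backend logic).

    If the proxy supplied a usable X-Forwarded-Proto, use it; otherwise
    fall back to the scheme of the backend's own connection.
    """
    proto = request_headers.get("X-Forwarded-Proto", "").strip().lower()
    scheme = proto if proto in ("http", "https") else connection_scheme
    return f"{scheme}://{host}{path}"


# Behind an HTTPS front-end that sets the header:
print(redirect_location({"X-Forwarded-Proto": "https"}, "our.site", "/AlboOnline/"))
# Without the header, the backend only knows its own plain-HTTP connection:
print(redirect_location({}, "our.site", "/AlboOnline/"))
```

A backend that ignores the header always takes the fallback branch, so its redirects point at http:// even though the client arrived over HTTPS.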

johndoe7000 commented 1 month ago

Hello, I'm also hitting this problem.

With Pound 2.8 (compiled with libpcre3-dev) "curl -I https://test.example.com/test" results in:

    HTTP/1.1 301 Moved Permanently
    Date: Mon, 19 Aug 2024 16:25:57 GMT
    Server: Apache
    Location: https://test.example.com/test/
    Content-Type: text/html; charset=iso-8859-1

Pound 2.8 config:

    ListenHTTPS
        Address x.x.x.x
        Port 443
        xHTTP 2
        Cert "xxxxxxxxxxxxxxxx"
        Disable TLSv1_1
        SSLHonorCipherOrder 1
        Ciphers "xxxxxxxxxxxxxxxxx"

        Service
            Url "/test"
            HeadRequire "Host:.*test.example.com.*"
            BackEnd
                Address x.x.x.x
                Port    80
                TimeOut 60
            End
            Session
                Type IP
                TTL 300
            End
        End

....

With Pound 4.12 (commit d657f059c68f27e06ea741d045e7b656191d6ac9, compiled with libpcre2-dev) "curl -I https://test.example.com/test" results in:

    HTTP/1.1 301 Moved Permanently
    Date: Mon, 19 Aug 2024 16:57:14 GMT
    Server: Apache
    Location: http://test.example.com/test/
    Content-Type: text/html; charset=iso-8859-1

Pound 4.12 config:

    ListenHTTPS
        Address x.x.x.x
        Port 443
        xHTTP 2
        Cert "xxxxxxxxxxxxx"
        Disable TLSv1_1
        SSLHonorCipherOrder 1
        Ciphers "xxxxxxxxxxxxxx"

        Service
            Url "/test"
            Header "Host:.*test.example.com.*"
            BackEnd
                Address x.x.x.x
                Port    80
                TimeOut 60
            End
            Session
                Type IP
                TTL 300
            End
        End

....

As you can see, the configs are the same; the only difference is that Header is used instead of the deprecated HeadRequire.

The difference between the two systems is: Pound 2.8 is installed on Debian 11 with libpcre3. Pound 4.12 is installed on Debian 12 with libpcre2.

I have 4 backends that misbehave with Pound 4.12 but work flawlessly with Pound 2.8: 2 with Apache and 2 with IIS. None of these backends has a ListenHTTP redirection. The only solution to this problem is adding a ListenHTTP service that simply redirects http to https for these backends.