matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.3k stars 2.59k forks source link

IP::getNonProxyIpFromHeader retrieves final proxy instead of client #7060

Open jcracknell opened 9 years ago

jcracknell commented 9 years ago

IP::getNonProxyIpFromHeader attempts to retrieve the client IP address from headers configured in proxy_client_headers[]. This calls IP::getLastIpFromList, excluding proxies configured via proxy_ips[].

What I do not understand is why by default this returns the last IP, whereas the format for X-Forwarded-For is client, proxy1, proxy2, ...: http://en.wikipedia.org/wiki/X-Forwarded-For#Format

This only becomes an issue when running Piwik behind multiple proxies; for example the configuration in question is:

[Enterprise Appliance] => [IIS ARR] => [Piwik]

So Piwik sees:

X-Forwarded-For: <client>, <enterprise_appliance>

Basically the current behavior would seem to select the IP of the last proxy by default. This would be problematic in a scenario with variable proxy IPs.

tsteur commented 9 years ago

Thx for the report!

jcracknell commented 9 years ago

I thought about this some more, and the current behavior of proxy_ips[] works something like nginx's notion of trusted proxies I ran into while flailing around with this issue. The problem is that it is not documented as such, and the behavior falls apart because it accepts an X-Forwarded-For address regardless of whether or not it was provided by a member of proxy_ips[].

I think the behavior should perhaps check if the direct client ($_SERVER['REMOTE_ADDR']) is in proxy_ips[] and only pull ips from proxy_client_headers[] if this is the case, taking the last non-proxy IP in the list. This lets you accept both proxied and unproxied requests.

The downside is this would make proxy_ips[] required configuration for all proxied installs. You could of course put this behavior behind a configuration flag, e.g. proxy_trust_required.

djschoone commented 9 years ago

I think you would not need the proxy_trust_required (e.d.) flag, because it's possible to check if the array of proxy_ips has values.

As seen in function getNonProxyIpFromHeader() in core/IP.php the $default (which is in most cases $_SERVER['REMOTE_ADDR'] is added to the list of proxy_ips so that way it should never be returned by getLastIpFromList

How can I help, to get this issue resolved? I'm currently implementing Piwik behind multiple proxies so we are in the blind of the users real IP at this moment.

mattab commented 9 years ago

How can I help, to get this issue resolved?

I didn't check the details of this issue, but it would help if you manage to create a Pull request and explain in this PR what you do, why, what problem it solves, etc. Cheers

robocoder commented 6 years ago

To clarify, this works as expected/designed.

Each successive proxy appends to the Forwarded-For header. As the OP observes:

X-Forwarded-For: <client>, <enterprise_appliance>

proxie_ips should contain the IPs for the proxy and enterprise_appliance, so, IP::getLastIpFromList filters it out, and gets the client IP.

The reason why we traverse the list of IPs in X-Forwarded-For from right-to-left is because we don't automatically trust anything at the beginning of the header (e.g., it could be forged).

mattab commented 6 years ago

Thanks for the details @robocoder Therefore it seems we could close this issue with "Wontfix" / Works as expected. Unless one can prove there is really a problem/limitation with the current logic?

djschoone commented 6 years ago

Thanks for the follow up. I cannot recall how we we're able to resolve the issue. We somehow managed it, probably by changing the request header done by the proxy server.