gforcada / haproxy_log_analysis

HAProxy log analyzer
https://pypi.org/project/haproxy_log_analysis
GNU General Public License v3.0

Script is not getting IP address info out of log, just the captured Host header #15

Open elyograg opened 8 years ago

elyograg commented 8 years ago

Version is 2.0b0.

I have haproxy 1.5.12 capturing and logging the Host header, but I have not changed the httplog format at all. This is the added command to capture that header:

capture request header host len 32

So far I've only tried a few commands, such as ip_counter and top_ips. These commands do not report any IP addresses. Instead, they report the info captured from the Host header.

Here's a log line:

May 9 06:54:42 localhost haproxy[47441]: 119.75.230.230:28364 [09/May/2016:06:54:42.377] fe-services-ai-443~ be-services-ai-search-8443/fiesta 261/0/2/346/610 200 6397 - - ---- 60/1/0/0/0 0/0 {services.ai.REDACTED.com} "GET /services/search?set=no&extMeta=no&i=0-100&so=p&fq=sensitive_flag:(0)%20AND%20publish:(1)&s=(%22HISchronologyJ_002_1953%22)&user=REDACTED&password=REDACTED HTTP/1.1"

In the info above, three pieces of information have been replaced with REDACTED -- the Host header info, the username, and the password.

gforcada commented 8 years ago

@elyograg thanks for reporting! Excellent report btw.

In the log line you posted I don't see any IP that could be used. Currently haproxy_log_analysis looks for IPs exactly where you have captured your Host header, but that's a host name, not an IP.

The ip_counter description in the README already explains how to capture the IP:

capture request header X-Forwarded-For len 20

elyograg commented 8 years ago

The IP address is 119.75.230.230 ... it's on the line fairly early.

There is not going to be an X-Forwarded-For header in the request. This machine has public IP addresses (though it is behind the firewall) and receives requests directly from the Internet.
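
To illustrate where the client address sits, here is a minimal sketch that pulls it from the start of the line with a regular expression. It is not code from haproxy_log_analysis: `CLIENT_RE` and `client_address` are hypothetical names, only IPv4 is handled, and the sample line is just the leading part of the log line quoted above.

```python
import re

# Matches the "haproxy[pid]: client_ip:client_port [" prefix that starts
# the HTTP log line after the syslog header. IPv4 only in this sketch.
CLIENT_RE = re.compile(
    r"haproxy\[\d+\]: (?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+) \["
)


def client_address(line):
    """Return (ip, port) from an HAProxy log line, or None if not found."""
    match = CLIENT_RE.search(line)
    if match is None:
        return None
    return match.group("ip"), int(match.group("port"))


# Leading part of the log line from this report.
line = (
    "May 9 06:54:42 localhost haproxy[47441]: 119.75.230.230:28364 "
    "[09/May/2016:06:54:42.377] fe-services-ai-443~"
)
print(client_address(line))  # ('119.75.230.230', 28364)
```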

gforcada commented 8 years ago

@elyograg Oh I see, I missed that. So far haproxy_log_analysis expects IPs to be in the captured headers section, which relies on an HTTP server sitting in front of HAProxy. Sorry, your use case is not covered right now.

Pull requests are, of course, more than welcome.
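
For anyone considering such a pull request, one possible approach is to keep using a captured header when it holds a valid IP and otherwise fall back to the client address at the start of the line. The sketch below is only an assumption about how that could look, not the project's actual implementation: `pick_ip`, `is_ip`, `CLIENT_RE` and `CAPTURED_RE` are hypothetical names, and the sample lines use the placeholder host `services.ai.example.com` and shortened frontend/backend names.

```python
import ipaddress
import re

# Client address right after the "haproxy[pid]: " prefix.
CLIENT_RE = re.compile(r"haproxy\[\d+\]: (?P<ip>\S+):\d+ \[")
# First {...} block, assumed to be the captured request headers.
CAPTURED_RE = re.compile(r"\{([^}]*)\}")


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def pick_ip(line):
    """Prefer an IP found in the captured headers, else the client IP."""
    captured = CAPTURED_RE.search(line)
    if captured:
        # HAProxy separates multiple captured headers with "|".
        for field in captured.group(1).split("|"):
            if is_ip(field.strip()):
                return field.strip()
    client = CLIENT_RE.search(line)
    return client.group("ip") if client else None


prefix = (
    "May 9 06:54:42 localhost haproxy[47441]: 119.75.230.230:28364 "
    "[09/May/2016:06:54:42.377] fe~ be/srv 261/0/2/346/610 200 6397 "
    "- - ---- 60/1/0/0/0 0/0 "
)
host_line = prefix + '{services.ai.example.com} "GET / HTTP/1.1"'
xff_line = prefix + '{10.1.2.3} "GET / HTTP/1.1"'

print(pick_ip(host_line))  # 119.75.230.230 (captured value is a host name)
print(pick_ip(xff_line))   # 10.1.2.3 (captured value is an IP)
```

A captured X-Forwarded-For value can also hold a comma-separated list of addresses; this sketch does not split those.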

vixns commented 7 years ago

see #22