technicalguru / httpd-exporter

Prometheus Metrics exporter for HTTP daemons (Apache, nginx, ...)
Apache License 2.0

Multiple log files in the same instance #3

Open ykyuen opened 5 years ago

ykyuen commented 5 years ago

Previously I had the following configuration:

[General]
metricsFile=/var/www/html/httpd-exporter/metrics
addLabels=
addStatusGroupLabel=status
collectBytesTransferred=bytes_sent
retentionSeconds=3600
enableDeadLabels=0
deadLabels=

[LogFormats]
# LC ATG Apache log format
%{NOTSPACE:forward} %{IP:clientip} %{NOTSPACE:rlogname} %{NOTSPACE:user} \[%{HTTPDATE:timestamp}\] %{HOSTNAME:virtualHost} %{NOTSPACE:sslprotocol} %{NOTSPACE:sslcipher} "%{REQUEST_LINE}" %{INT:status} %{INT:bytes_sent} (%{QS:referrer}|-) (%{QS:agent}|-) (%{QS:jsessonid}|-) %{INT:timespent}

[/var/log/httpd/row_prod_estore_access_log]
type=httpd
labels={instance_ip="${HOSTIP}",instance_hostname="${HOSTNAME}"}

[/var/log/httpd/hk_prod_estore_access_log]
type=httpd
labels={instance_ip="${HOSTIP}",instance_hostname="${HOSTNAME}"}

[/var/log/httpd/cn_prod_estore_access_log]
type=httpd
labels={instance_ip="${HOSTIP}",instance_hostname="${HOSTNAME}"}

The resulting metrics file looked like this:

# TYPE http_sent_bytes counter
# HELP http_sent_bytes Number of bytes transferred as logged by HTTP daemon
http_sent_bytes{status="5xx"} 75314
http_sent_bytes{status="4xx"} 54023835
http_sent_bytes{status="3xx"} 6284093
http_sent_bytes{status="2xx"} 656063532

# TYPE http_requests_total counter
# HELP http_requests_total Counts the requests that were logged by HTTP daemon
http_requests_total{status="2xx"} 30093
http_requests_total{status="3xx"} 25483
http_requests_total{status="4xx"} 973
http_requests_total{status="5xx"} 8

I assumed it was working fine and that each count was the sum across all 3 log files.

Recently I wanted to add a label to separate the counts of the 3 log files, so I added the virtualHost label to "addLabels" in the general config.

...
addLabels=virtualHost
...

I expected the resulting metrics file to look like this:

# TYPE http_requests_total counter
# HELP http_requests_total Counts the requests that were logged by HTTP daemon
http_requests_total{status="2xx", virtualHost="Host1"} 10034
http_requests_total{status="2xx", virtualHost="Host2"} 12345
http_requests_total{status="2xx", virtualHost="Host3"} 9434
http_requests_total{status="3xx", virtualHost="Host1"} 25483
http_requests_total{status="3xx", virtualHost="Host2"} 27455
http_requests_total{status="3xx", virtualHost="Host3"} 17641
http_requests_total{status="4xx", virtualHost="Host1"} 973
http_requests_total{status="4xx", virtualHost="Host2"} 1123
http_requests_total{status="4xx", virtualHost="Host3"} 598
http_requests_total{status="5xx", virtualHost="Host1"} 8
http_requests_total{status="5xx", virtualHost="Host2"} 12
http_requests_total{status="5xx", virtualHost="Host3"} 9
...

But it turns out that only the Host3 entries appear:

# TYPE http_requests_total counter
# HELP http_requests_total Counts the requests that were logged by HTTP daemon
http_requests_total{status="2xx", virtualHost="Host3"} xxxxx
http_requests_total{status="3xx", virtualHost="Host3"} xxxxx
http_requests_total{status="4xx", virtualHost="Host3"} xxxxx
http_requests_total{status="5xx", virtualHost="Host3"} xxxxx

I am not sure whether the count values are the sum over all 3 hosts or cover only Host3.

Any ideas? Thanks.
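
One way to settle that question is to tally the log lines independently and compare the totals with the metrics file. The following is only a minimal sketch: `FIELDS` and `count_by_host` are hypothetical names, the regular expression assumes the field order of the LogFormats entry above (timestamp, virtual host, SSL protocol, SSL cipher, quoted request, status), and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern: locates the virtual host and the status code relative to
# the closing bracket of the timestamp and the quoted request line.
FIELDS = re.compile(r'\] (?P<host>\S+) \S+ \S+ "[^"]*" (?P<status>\d{3}) ')

def count_by_host(lines):
    """Count requests per (status group, virtual host) pair."""
    counts = Counter()
    for line in lines:
        found = FIELDS.search(line)
        if found is None:
            continue  # skip lines that do not fit the assumed layout
        group = found.group("status")[0] + "xx"  # e.g. "404" becomes "4xx"
        counts[(group, found.group("host"))] += 1
    return counts

# Made-up sample lines, not taken from a real server.
sample = [
    '10.0.0.1 10.0.0.1 - - [19/Mar/2019:16:31:10 +0800] Host1 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET / HTTP/1.1" 200 512 "-" "-" "-" 100',
    '10.0.0.2 10.0.0.2 - - [19/Mar/2019:16:31:11 +0800] Host1 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /a HTTP/1.1" 200 256 "-" "-" "-" 90',
    '10.0.0.3 10.0.0.3 - - [19/Mar/2019:16:31:12 +0800] Host2 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /b HTTP/1.1" 404 128 "-" "-" "-" 80',
]

counts = count_by_host(sample)
for (group, host), total in sorted(counts.items()):
    print(f'http_requests_total{{status="{group}", virtualHost="{host}"}} {total}')
```

To run it against a real file, pass an open file handle to `count_by_host` instead of `sample`. Note that this counts every line that fits the layout, regardless of the exporter's retention window.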

technicalguru commented 5 years ago

Hi Yuen,

Can you post an example of your HTTP log file so I can run some tests? Ideally from all log files, as I suspect the log format is the issue and matches in only one case.

ykyuen commented 5 years ago

Here is part of the logs from one of the webservers.

80.187.84.95, 104.121.77.28, 88.221.214.94 80.187.84.95 - - [19/Mar/2019:16:31:10 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /apple-touch-icon-precomposed.png HTTP/1.1" 404 179404 "-" "MobileSafari/604.1 CFNetwork/976 Darwin/18.2.0" "fH2UNfx3UEmg8FjtGgCiCZ8u.www-host2-com-4-slave-a" 1049507
196.16.218.241, 104.72.70.147, 60.254.143.137, 23.200.142.45 196.16.218.241 - - [19/Mar/2019:16:31:11 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /product/123?_country=AU HTTP/1.1" 200 45995 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36" "-" 856696
205.185.223.39, 184.27.179.159, 23.55.46.79 205.185.223.39 - - [19/Mar/2019:16:31:10 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /men/123/ HTTP/1.1" 200 56720 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36" "-" 1407253
196.16.217.243, 104.72.70.147, 23.55.47.68 196.16.217.243 - - [19/Mar/2019:16:31:12 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /product/456?_country=AU HTTP/1.1" 200 45743 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36" "-" 522176
80.187.84.95, 104.121.77.28, 88.221.214.94 80.187.84.95 - - [19/Mar/2019:16:31:13 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /apple-touch-icon.png HTTP/1.1" 404 179392 "-" "MobileSafari/604.1 CFNetwork/976 Darwin/18.2.0" "fH2UNfx3UEmg8FjtGgCiCZ8u.www-host2-com-4-slave-a" 845489
127.0.0.1, 23.194.187.214 127.0.0.1 - - [19/Mar/2019:16:31:14 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /akamai/sureroute-test-object.html HTTP/1.1" 200 3965 "-" "-" "-" 2153
209.177.156.77, 23.45.183.94, 104.84.150.15 209.177.156.77 - - [19/Mar/2019:16:31:14 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /women/new/456/?_country=us HTTP/1.1" 200 57482 "-" "NS1" "-" 574348
78.157.219.136, 95.101.143.69, 72.246.231.182, 23.32.20.14 78.157.219.136 - - [19/Mar/2019:16:31:13 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /product/123?_country=GB HTTP/1.1" 200 45664 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36" "-" 1287650
190.123.222.62, 23.36.1.5, 23.32.20.14 190.123.222.62 - - [19/Mar/2019:16:31:14 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /men/new/123 HTTP/1.1" 200 56542 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36" "hYWk7LVNaR-a79w53SEfZLfV.www-host2-com-2-slave-b" 781284
107.175.106.54, 72.246.43.208, 23.54.19.63, 104.84.150.15 107.175.106.54 - - [19/Mar/2019:16:31:16 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /product/789?_country=US HTTP/1.1" 200 45227 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36" "-" 468018

Thanks very much for your help. :pray:

technicalguru commented 5 years ago

Hi Yuen, at first glance I can see that the first part of each log line (the forward IPs) is not a single NOTSPACE element but multiple IP addresses separated by a comma and a space. This prevents the line from being parsed and hence from being counted.

Try to change the format definition as follows:

%{QS:forward} %{IP:clientip} %{NOTSPACE:rlogname} %{NOTSPACE:user} \[%{HTTPDATE:timestamp}\] %{HOSTNAME:virtualHost} %{NOTSPACE:sslprotocol} %{NOTSPACE:sslcipher} "%{REQUEST_LINE}" %{INT:status} %{INT:bytes_sent} (%{QS:referrer}|-) (%{QS:agent}|-) (%{QS:jsessonid}|-) %{INT:timespent}

and change the log format of your webserver to quote the forward IPs, so the line would look like:

"80.187.84.95, 104.121.77.28, 88.221.214.94" 80.187.84.95 - - [19/Mar/2019:16:31:10 +0800] www.host2.com TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /apple-touch-icon-precomposed.png HTTP/1.1" 404 179404 "-" "MobileSafari/604.1 CFNetwork/976 Darwin/18.2.0" "fH2UNfx3UEmg8FjtGgCiCZ8u.www-host2-com-4-slave-a" 1049507
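
The mismatch described above can be reproduced with a small Python sketch. It uses simplified stand-ins for the grok patterns (`IPV4`, `NOTSPACE`, `QS` below are approximations, not the exporter's real definitions), covers only the leading fields up to the opening bracket of the timestamp, and assumes the pattern is matched from the start of the line.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions for illustration).
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"
NOTSPACE = r"\S+"
QS = r'"[^"]*"'

# Fields that follow the forward list: client IP, rlogname, user, then "[".
TAIL = rf" (?P<clientip>{IPV4}) (?P<rlogname>{NOTSPACE}) (?P<user>{NOTSPACE}) \["

original = re.compile(rf"(?P<forward>{NOTSPACE})" + TAIL)  # NOTSPACE:forward
quoted = re.compile(rf"(?P<forward>{QS})" + TAIL)          # QS:forward

unquoted_line = (
    "80.187.84.95, 104.121.77.28, 88.221.214.94 80.187.84.95 - - "
    "[19/Mar/2019:16:31:10 +0800] www.host2.com"
)
quoted_line = (
    '"80.187.84.95, 104.121.77.28, 88.221.214.94" 80.187.84.95 - - '
    "[19/Mar/2019:16:31:10 +0800] www.host2.com"
)

# re.match anchors at the start of the line.
original_result = original.match(unquoted_line)
quoted_result = quoted.match(quoted_line)

print("NOTSPACE on unquoted line:", original_result)
print("QS on quoted line, clientip:", quoted_result.group("clientip"))
```

With the unquoted line, `\S+` stops at the first space, so the remaining fields no longer line up and the match fails. Once the forward list is quoted, it is consumed as a single element and the client IP is captured as intended.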

ykyuen commented 5 years ago

Sorry, I didn't realize that my log format setting did not match the actual log lines.

I will give it a try and update you later. Thanks very much.

ykyuen commented 5 years ago

It is working now, thanks!

I have one more question. In the configuration, we have

labels={instance_ip="${HOSTIP}",instance_hostname="${HOSTNAME}"}

Are instance_ip and instance_hostname required? I couldn't find these 2 key-value pairs in the metrics collected in Prometheus.