Closed: giammbo closed this issue 7 years ago
I guess you have to post the query you are using in Grafana / Chronograf!?
The problem appears when inserting data with a lot of characters, like a user agent string.
Hi,
I've got the same behaviour with PostgreSQL metrics (I retrieve SQL queries too), and since the last update it's really painful.
Surprisingly I don't see that much I/O consumption; InfluxDB is only eating CPU and memory.
@giammy2290 @menardorama this issue needs more info.
Right now I can't post a small test case because InfluxDB is in our production environment, but:
1) Kernel: Linux 3.13.0-61-generic, OS: Ubuntu 14.04.3 LTS, InfluxDB: 0.10.1-1
2) For the inserts I use fluentd with this config:
<source>
type tail
path /log/admin/access.log
pos_file /log/td-agent/admin.access.log.pos
format /^(?<daemon>[^ ]*)\s*(?<server>[^ ]*)\s*(?<host>[^ ]*)\s*(?<htaccess1>[^ ]*)\s*(?<htaccess2>[^ ]*)\s*\[(?<date>.*?)\]\s*\"(?<httprequest>[^ ]*)\s*(?<url>[^ ]*)\s*(?<httpcode>[^ ]*)\"\s*\"(?<postrequest>[^ ]*)\"\s*(?<httpstatus>[^ ]*)\s*(?<pagesize>[^ ]*)\s*\"(?<referrar>[^ ]*)\"\s*\"(?<useragent>.*?)\"\s*(?<renderizepage>.*)$/
tag nginx.access.log
</source>
<match nginx.access.log.**>
type influxdb
host pippo.int
port 8086
dbname access_log
user access
password access
use_ssl false
time_precision s
tag_keys ["daemon", "server", "host", "htaccess1", "htaccess2", "date", "httprequest", "httpcode", "postrequest", "httpstatus", "pagesize", "referrar", "useragent", "renderizepage"]
sequence_tag _seq
flush_interval 10
retry_limit 1
</match>
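A likely culprit is the tag_keys list in the config above: in InfluxDB, every distinct combination of tag values creates a new series, and the series index is held in memory. A minimal Python sketch (with invented values, not the plugin's actual behaviour) of why per-request tags like date or useragent blow up the series count:

```python
# Rough sketch: every distinct combination of tag values becomes a separate
# InfluxDB series, and each series is indexed in memory. Tagging per-request
# values such as "date" or "useragent" means almost every log line creates
# a brand-new series.

tag_keys = ["daemon", "server", "host", "date", "useragent"]

# Hypothetical parsed log records (values invented for illustration).
records = [
    {"daemon": "nginx:", "server": "10.0.4.141", "host": "2.5.205.201",
     "date": "04/Mar/2016:10:29:48 +0000", "useragent": "Amazon CloudFront"},
    {"daemon": "nginx:", "server": "10.0.4.141", "host": "2.5.205.201",
     "date": "04/Mar/2016:10:29:49 +0000", "useragent": "Amazon CloudFront"},
]

# A "series key" is the measurement plus the sorted tag key/value pairs.
series = {tuple((k, r[k]) for k in sorted(tag_keys)) for r in records}
print(len(series))  # 2: the timestamp-like "date" tag alone splits every line
```

Two otherwise-identical log lines already produce two series here; at 5-6 lines per second that index grows without bound.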
This is one example log line:
nginx: 10.0.4.141 2.5.205.201 - - [04/Mar/2016:10:29:48 +0000] "GET /fr/validation/messages HTTP/1.1" "-" 200 580 "https://www.musement.com/fr/amsterdam/musee-van-gogh-tickets-avec-acces-prioritaire-651/" "Amazon CloudFront" 0.091 0.091 .
If you start sending the logs (I receive 5-6 log lines per second), my CPU and memory usage start to increase. It's not a fluentd problem, that's for sure.
(I'm writing from my phone, so double-check the regexp above, sorry.)
@giammy2290
How many of the fields below are stored as tags in your measurement?
tag_keys ["daemon", "server", "host", "htaccess1", "htaccess2", "date", "httprequest", "httpcode", "postrequest", "httpstatus", "pagesize", "referrar", "useragent", "renderizepage"]
My guess is that all of these are stored as fields in the measurement, hence the spike you see when you access data from Grafana or Chronograf.
P.S. Only tags are indexed in InfluxDB, not fields.
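For reference, the tag/field split is visible in InfluxDB's line protocol: tags sit before the first space (indexed), fields after it (not indexed). A hand-written sketch with illustrative values, not output from the actual plugin:

```python
# Sketch of InfluxDB line protocol: tags (indexed) vs fields (not indexed).
# Measurement name and values are illustrative only.
measurement = "access_log"
tags = {"server": "10.0.4.141", "httpcode": "200"}              # indexed
fields = {"useragent": '"Amazon CloudFront"', "pagesize": 580}  # not indexed

tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
line = f"{measurement},{tag_part} {field_part} 1457087388"
print(line)
# access_log,httpcode=200,server=10.0.4.141 pagesize=580,useragent="Amazon CloudFront" 1457087388
```

Queries that filter on tags use the in-memory index; filtering on fields forces a scan, which matches the CPU spike described here.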
I use "htaccess1" or "htaccess2".
@giammy2290
I have modified the regexp like this:
format /^(?<daemon>[^ ]*)\s*(?<server>[^ ]*)\s*(?<host>[^ ]*).*?"(?<httprequest>[^ ]*)\s*.*?\s(?<protocol>[^ ]*)\"\s*\".*?\"\s*(?<statuscode>[^ ]*).*$/
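A quick way to sanity-check this regexp against the sample log line posted earlier (a Python sketch by way of illustration; note that Python spells named groups `(?P<name>...)` while fluentd/Ruby uses `(?<name>...)`):

```python
import re

# The modified regexp, translated to Python's named-group syntax.
pattern = re.compile(
    r'^(?P<daemon>[^ ]*)\s*(?P<server>[^ ]*)\s*(?P<host>[^ ]*).*?'
    r'"(?P<httprequest>[^ ]*)\s*.*?\s(?P<protocol>[^ ]*)"\s*'
    r'".*?"\s*(?P<statuscode>[^ ]*).*$'
)

# The sample nginx log line from earlier in the thread.
line = ('nginx: 10.0.4.141 2.5.205.201 - - [04/Mar/2016:10:29:48 +0000] '
        '"GET /fr/validation/messages HTTP/1.1" "-" 200 580 '
        '"https://www.musement.com/fr/amsterdam/musee-van-gogh-tickets-avec-acces-prioritaire-651/" '
        '"Amazon CloudFront" 0.091 0.091 .')

m = pattern.match(line)
print(m.groupdict())
# {'daemon': 'nginx:', 'server': '10.0.4.141', 'host': '2.5.205.201',
#  'httprequest': 'GET', 'protocol': 'HTTP/1.1', 'statuscode': '200'}
```

With only these six low-cardinality groups the series count stays bounded, unlike the earlier config that tagged date and useragent.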
Now I have daemon as a tag and the others as fields:
I have set up Grafana with this:
and the results are these:
@giammy2290 From the pics you have attached, I see daemon as a field and not as a tag. Can you please put "daemon, server, host, htaccess1, htaccess2" as tags?
I can't, because my InfluxDB is in a production environment and I can't change the regexp. But I made a typo earlier; my tag keys are: server, host, httprequest, protocol, statuscode.
Hi all,
I'm using InfluxDB to store access logs. I'm trying to send a 133 MB file with 498658 lines (a classic nginx access log); the data is inserted via fluentd. When I try to use Grafana or Chronograf, InfluxDB takes a lot of memory and CPU. Why?