Closed candlerb closed 6 years ago
STRUCTURED-DATA is documented, but it could be clarified with an example. I set up another test, using an example from RFC5424:
# echo '173 <165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] An application event log entry...' | nc 127.0.0.1 6514
Result:
> select * from syslog where "appname" = 'evntslog' limit 1
name: syslog
time appname exampleSDID@32473_eventID exampleSDID@32473_eventSource exampleSDID@32473_iut facility facility_code host hostname message msgid procid severity severity_code timestamp version
---- ------- ------------------------- ----------------------------- --------------------- -------- ------------- ---- -------- ------- ----- ------ -------- ------------- --------- -------
2018-09-27T10:37:13.151962895Z evntslog 1011 Application 3 local4 20 tick mymachine.example.com An application event log entry... ID47 notice 5 1065910455003000000 1
telegraf has created new field keys:
> show field keys from syslog
name: syslog
fieldKey fieldType
-------- ---------
exampleSDID@32473_eventID string
exampleSDID@32473_eventSource string
exampleSDID@32473_iut string
facility_code integer
message string
msgid string
procid string
severity_code integer
timestamp integer
version integer
There is one other thing worth mentioning in the documentation. If you follow the example to get rsyslog to forward logs to telegraf on localhost:
# forward over tcp with octet framing according to RFC 5425
*.* @@(o)127.0.0.1:6514;RSYSLOG_SyslogProtocol23Format
... you should beware that in a default Ubuntu 16.04 installation, influxdb writes query logs to stderr, which are picked up by systemd and written to syslog. You can see them in /var/log/syslog
as well as journalctl -xe _SYSTEMD_UNIT=influxdb.service
Sep 27 11:32:30 tick influxd[386]: [httpd] 127.0.0.1 - - [27/Sep/2018:11:32:30 +0000] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "telegraf" 0494c172-c249-11e8-b2aa-000000000000 13481
Those logs go to rsyslogd, which forwards them to telegraf, which writes them to influxdb... you get the picture.
Oddly, this doesn't cause the explosion of logs I was expecting, but I still haven't worked out why - maybe there is buffering/batching. Still: it might be wise to point out the possible risk.
Thanks @candlerb, these are some good observations that will really help getting started with this plugin.
maybe there is buffering/batching.
Telegraf writes in batches of metric_batch_size
which is 1000 by default.
As mentioned on #4762, the host
tag is a tag we set on all metrics by default, and represents the host running Telegraf. I'll comment further on the other issue on what we should do with hostname.
We will improve the structured data and timestamp documentation as suggested.
I added documentation about timestamp vs timestamp field and structured data to the syslog readme.
I was very happy to see the inclusion of a syslog plugin. However the documentation is missing some critical information, in particular:
hostname
field contain the source IP address of the syslog message, or the hostname contained within the message?The timestamp issue is important because (a) many devices generate syslog messages containing only 1-second resolution timestamps, and (b) influxdb can only store one data point in a timeseries for a particular timestamp.
So I set out to find out by experimentation what happens. I used the sample config which receives TCP on port 6514, and the sample rsyslogd configuration which forwards to it. Then I did:
tcpdump on port 6514 showed the following messages generated by rsyslogd:
And the results in the database:
From this I infer:
timestamp
field is the time contained within the messageTherefore, unless two messages are processed in the same nanosecond (unlikely), there should be no loss of messages. However any buffered or backlogged messages will be stored and ordered by time of reception.
This all seems fine to me, but it would have been very helpful if this had been documented.
The tags and fields are more-or-less in agreement with the documentation:
However, the documentation says there is a tag "hostname", but I also see "host". So to find the difference between these, I did another test:
Result:
So it appears that the
hostname
is the hostname contained within the message. But what abouthost
? Is it the source address of the syslog message?The answer is no. I tried the same nc test from a different host (giving the IP of telegraf instead of 127.0.0.1), and got the same result.
So it seems the host is just the host where telegraf itself is running. Again, would be very helpful if this was documented.