influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Extend socket_listener input plugin to write data to and to preprocess read data from sockets #4984

Closed mjf closed 1 year ago

mjf commented 5 years ago

Feature Request

Extend the socket_listener input plugin so that it can write data to a socket on connect. Control sequences or commands could then be sent to the socket to trigger a response, which would be collected, preprocessed and parsed. This is primarily intended for interactive or non-standard interfaces.

Proposal:

Allow for the following (or similar) configuration options to the socket_listener input plugin:

[[inputs.socket_listener]]

  service_address = "tcp://host.domain.tld:7505"  # OpenVPN management as an example

  # ...

  write_data_text = "status\nquit\n"  # write data as text
  write_data_base64 = "c3RhdHVzCnF1aXQK"  # or as base64 (for binary data)

  read_data_strip = "\r+"  # strip all Carriage Return control characters
  read_data_beginning = "^Updated,.+$"
  read_data_beginning_drop = true  # data matching read_data_beginning won't be included (default: true)
  read_data_end = '^post-decompress\s+bytes,\d+.*$'  # TOML literal string, so \s and \d need no escaping
  read_data_end_drop = false  # data matching read_data_end will be included (default: true)

  # ...

  data_format = "grok"
  name_override = "openvpn"
  grok_custom_pattern_files = ["/path/to/openvpn_status.grok"]
  grok_patterns = ["%{OPENVPN_STATUS}"]

Current behavior:

As far as I know, the only way to achieve this today is to use exec and call an external tool or pipeline.

Desired behavior:

To be able to send commands to interactive socket interfaces and parse the result with, for example, Grok rulesets, without resorting to exec plugin workarounds.

Use case:

There are plenty of imaginable use cases...

For instance, I would like to connect to the OpenVPN management CLI, as I do with telnet or socat:

$ telnet localhost 7505
status
OpenVPN STATISTICS
Updated,Tue Nov 13 09:48:06 2018
TUN/TAP read bytes,648
TUN/TAP write bytes,0
TCP/UDP read bytes,295989
TCP/UDP write bytes,328425
Auth read bytes,96720
pre-compress bytes,0
post-compress bytes,0
pre-decompress bytes,0
post-decompress bytes,0
END
quit
$

and gather the metrics with a Grok ruleset. Using the proposal example above, the resulting data to be further parsed would be the following:

TUN/TAP read bytes,648
TUN/TAP write bytes,0
TCP/UDP read bytes,295989
TCP/UDP write bytes,328425
Auth read bytes,96720
pre-compress bytes,0
post-compress bytes,0
pre-decompress bytes,0
post-decompress bytes,0

Note: The examples given in this feature request are intentionally written in a slightly odd way to demonstrate the desired features and behavior.
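
The effect of the proposed read_data_* options can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not Telegraf code: the function name `preprocess` and its parameters are hypothetical and simply mirror the option names from the proposal.

```python
import re


def preprocess(raw, strip=None, beginning=None, beginning_drop=True,
               end=None, end_drop=True):
    """Trim a raw response the way the proposed read_data_* options describe.

    All parameter names are hypothetical and mirror the proposal.
    """
    # read_data_strip: remove everything matching the strip pattern first.
    text = re.sub(strip, "", raw) if strip else raw
    kept = []
    started = beginning is None  # no beginning pattern: keep from line one
    for line in text.split("\n"):
        if not started:
            if re.search(beginning, line):
                started = True
                if not beginning_drop:
                    kept.append(line)
            continue
        if end is not None and re.search(end, line):
            if not end_drop:
                kept.append(line)
            break  # nothing after the end marker is kept
        kept.append(line)
    return "\n".join(kept)
```

Called on the telnet output with strip=r"\r+", beginning=r"^Updated,.+$", end=r"^post-decompress\s+bytes,\d+.*$" and end_drop=False, this returns the nine "name,value" lines listed above. The end pattern is written with `.*` here so that a single-digit value such as `0` still matches.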

danielnelson commented 5 years ago

I think this is too much of a change for socket_listener, and it wouldn't mesh well with the declarative style of the config file. You really do need to write a program to describe what actions to take in order to have the right amount of control.

I think instead we should add a plugin for OpenVPN to gather these statistics. I've wanted to add something to parse the status file, and it could also run the status command and possibly other commands on the management interface. If you think this is a good idea, could you open an issue for adding an OpenVPN plugin?

BTW here is what the status file shows:

OpenVPN CLIENT LIST
Updated,Tue Nov 13 11:43:40 2018
Common Name,Real Address,Bytes Received,Bytes Sent,Connected Since
loaner.lan,xxx.xxx.xxx.xxx:xxxx,165980,182333,Tue Nov 13 11:32:22 2018
ROUTING TABLE
Virtual Address,Common Name,Real Address,Last Ref
52:54:00:5b:93:55,loaner.lan,xxx.xxx.xxx.xxx:xxxx,Tue Nov 13 11:43:38 2018
GLOBAL STATS
Max bcast/mcast queue length,5
END

mjf commented 5 years ago

TL;DR

@danielnelson I think some people would like to have a general ability to write to sockets that expose a CLI. There is so much software out there that you could gather measurements from through its interactive CLI! Not every piece of software is modern enough to provide InfluxDB line format to be consumed by Telegraf or even sent directly to InfluxDB... Therefore, the ability to perform at least the minimalistic (or rather "limited") interaction shown in this feature request would IMHO be simply splendid.

The problem with custom input plugins is that not everyone is a Go programmer, so most users are completely dependent on community effort. There is plenty of (often custom) software that only a few people use but would still like to gather metrics from, and there is almost no chance of getting the community to write a plugin for it. OpenVPN was given only as an example (it was the first thing that came to mind when I was writing this).

For me, calling an external script every 10 seconds or so seems quite wasteful, and in some cases slow. The general-purpose Grok plugin is fast enough for most situations (as far as I know for now) and it suits many cases very well. Connecting to a socket, writing a request, sanitizing the response from the CLI and then parsing it with Grok or another filter is quite a good idea (HAProxy, for example, does a similar thing for service checks). Why not Telegraf?

Why not provide a general feature to interact with "things" like sockets (TCP, UDP, Unix domain), named pipes, or ordinary files (imagine Telegraf writes something somewhere, something appears somewhere else, and Telegraf collects and processes it - why not), or even inputs.exec's STDIN, STDOUT and STDERR, etc.?

Off-topic:

Is it because getting people to write specialized plugins lets you say "Hey, we already have input and output plugins for this and that included...", even where a simple general-purpose interface would be good enough in many cases, at least as an initial solution when the need arises? I personally strongly agree that a custom plugin will always be better and more performant (and perhaps more secure) than some Grok parsing, that's clear! On the other hand, interfaces often change, and updating a Grok ruleset seems easier than updating Go structures and code.

But when you want to monitor something and have to wait half a year to be able to do so, that's really bad. And that's exactly why I think general-purpose capabilities should be included (of course, there's still the inputs.exec possibility, right). There would always be the risk that you never learn what people are using the capabilities for, because they will be satisfied and have no reason to share what they are (ab)using them for. But it could IMHO make Telegraf a better and more usable product...

Or is it because you simply do not want people asking for help with such rather general-purpose capabilities? Because, yes, once they exist, people would (ab)use them for many, many weird things! But that's what it's all about, right? Give the users as much power as you can and do not worry about what they use it for. You can still simply decline to answer such questions without losing face, that's all. (By the way, I for instance already abuse the inputs.file plugin with a Grok ruleset to gather XFS statistics from /proc and/or /sys, and it works like a charm without any significant overhead. And there are so many interesting numbers all around the /proc and /sys filesystems for which no native Go input plugin has been written yet! And a lot of software with an interactive interface too...)

danielnelson commented 5 years ago

I wouldn't be opposed to a new input plugin that writes to a socket and reads the response; socket_listener just accepts data, so it wouldn't work there. But I don't want to get into scripting the protocol interaction, as it could be very complicated and would be poorly described in TOML. I also don't want to add a preprocessing stage that runs before the parser: it complicates the parser model because there are now two parse steps, and I actually think grok can handle the example without these options.

I do prefer to add a custom plugin for things like this though because it will be much easier to use and provide a consistent and curated metric schema.

These types of monitoring tasks are exactly the use case for the exec plugin; the only real downside is the cost of the fork/exec for each command. We are planning to add a form that will control a long-running subprocess, so that won't necessarily be a limiting factor either. I also want to add more plugins that can interact with user-customizable code: exec processor, exec parser, exec serializer. These should all support long-running programs too.
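
As a rough illustration of the exec route, the script run by the exec input only has to print metrics in a format Telegraf can parse. The Python sketch below turns "name,value" lines like the ones in the OpenVPN example into a single InfluxDB line-protocol record, assuming the exec input is configured with data_format = "influx". The function name, the field-key normalization and the default measurement name are all assumptions made for this example.

```python
import re


def to_line_protocol(lines, measurement="openvpn"):
    """Turn 'name,value' lines into one InfluxDB line-protocol record.

    Hypothetical helper: lines without an integer value (headers, END,
    timestamps) are skipped, and names are normalized to snake_case keys.
    """
    fields = []
    for line in lines:
        name, sep, value = line.rpartition(",")
        value = value.strip()
        if not sep or not value.isdecimal():
            continue  # e.g. "OpenVPN STATISTICS", "Updated,Tue Nov 13 ...", "END"
        key = re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_").lower()
        fields.append(f"{key}={int(value)}i")  # trailing "i" marks an integer field
    return f"{measurement} {','.join(fields)}" if fields else ""
```

A script would print the returned string on stdout once per invocation. How the status text is fetched (socket, status file, pipeline) is left out of the sketch.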

mjf commented 5 years ago

@danielnelson I mostly agree with you. But the problem with the inputs.exec plugin is its cost, as you say...

Suppose I would like to collect a measurement every second, or a few of them, using the inputs.exec plugin with, e.g., some Perl script, only because I need to perform a simple interaction with some sort of CLI and have no other way to obtain the measurement. That would surely produce a lot of CPU load, most of which I could avoid if only native (very simple, as suggested, of course) and consistent support for interactive interfaces existed.

So, shall I request a new plugin, say inputs.socket_send_receiver (or whatever name you like)? I think the preprocessing can be omitted except for one important thing: since parsers are often line-oriented, you simply have to sanitize and normalize newlines, both on input/send/write (some textual protocols expect CRLFs on input, some do not, etc.) and on output/receive/read (normalized to platform-specific newlines). The features could also be limited to newline-oriented I/O, so that we finally end up with something as simple as:

[[inputs.socket_send_receiver]]

  service_address = "tcp://host.domain.tld:7505"  # OpenVPN management as an example

  # ...

  newline = "\n"  # default: platform-specific
  send = [ "status", "quit" ]

  # ...

  data_format = "grok"
  name_override = "openvpn"
  grok_custom_pattern_files = ["/path/to/openvpn_status.grok"]
  grok_patterns = ["%{OPENVPN_STATUS}"]

That's starting to look like something that could be quite easy to achieve (and perhaps directly in the inputs.socket_listener plugin), right?

danielnelson commented 5 years ago

I think this would make sense: a plugin that works somewhat like telnet or a restricted netcat. It would just send a string and read from the socket until EOF; a new connection would be made every interval. I think TOML strings are flexible enough that we could just use a single string for sending, as TOML supports the normal escapes as well as multiline strings.

[[inputs.socket_response]]
  service_address = "tcp://host.domain.tld:7505"
  # no line-ending backslashes: in TOML they would trim the newlines as well
  send_data = """
status
quit
"""
  data_format = "grok"
  name_override = "openvpn"
  grok_custom_pattern_files = ["/path/to/openvpn_status.grok"]
  grok_patterns = ["%{OPENVPN_STATUS}"]

The parser, selected by data_format, would deal with newlines however it likes, but most of them today would split by line. I'm not 100% sure if the grok parser handles being given multiple lines, but it should be made to handle it as we want to be able to have it deal with multiline logs using a join rule.
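
The request/response cycle described above (send one string, read until EOF, open a new connection every interval) can be sketched in Python as follows. This is a minimal sketch, not the plugin: the names `exchange` and `gather` are made up for the example, and it assumes the remote side closes the connection once it has answered, as the OpenVPN management interface does after `quit`.

```python
import socket


def exchange(sock, send_data, bufsize=4096):
    """Send send_data once, then read until the peer closes the connection."""
    sock.sendall(send_data)
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # an empty read means EOF
            break
        chunks.append(chunk)
    return b"".join(chunks)


def gather(host, port, send_data, timeout=5.0):
    """One collection interval: connect, exchange, disconnect."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, send_data)
```

For the example config this would be called as gather("host.domain.tld", 7505, b"status\nquit\n") once per interval, and the returned bytes would be handed to whichever parser data_format selects.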

mjf commented 5 years ago

@danielnelson OK, it looks good-enough to me too. But I would still really like to see some support for rare EOL combinations...

[[inputs.socket_response]]
  # ...
  send_data = ["status", "quit"]
  newline = "\r"   # optional; default: platform-specific
  # ...

Otherwise I see no way to support, e.g., Mac newlines with a Unix Telegraf client. This way you would provide maximal flexibility, and all possible combinations could be covered, I hope. But if you do not like it, I am personally fine with your proposal too.
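
To make the difference concrete, here is a small Python sketch of how a list of commands plus a configurable newline could be turned into the bytes that go on the wire. The function name `build_payload` and the ASCII encoding are assumptions for illustration only.

```python
def build_payload(commands, newline="\n"):
    """Terminate every command with the configured newline and encode it.

    Hypothetical helper mirroring the proposed send_data and newline options.
    """
    return "".join(command + newline for command in commands).encode("ascii")
```

With newline = "\r" the two commands become b"status\rquit\r", with "\r\n" they become b"status\r\nquit\r\n", and the same config works regardless of the platform Telegraf runs on.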

> The parser, selected by data_format, would deal with newlines however it likes, but most of them today would split by line. I'm not 100% sure if the grok parser handles being given multiple lines, but it should be made to handle it as we want to be able to have it deal with multiline logs using a join rule.

I am not sure what exactly the "join rule" is, but I imagine it as some mechanism to join lines based on some sort of matching (preferably using regular expressions)?

My idea was that it could be solved by setting newline = "" (the empty string, so that the plugin reads until the End of Transmission, a timeout or a specific match, preferably in that order) and by some sort of "pre-processing" substitution(s) based on regular expressions...

[[inputs.some_plugin]]

  newline = ""                  # optional; default: platform-specific
  # terminator = "(\r\n){2}"    # optional; regex, default: EoF
  # timeout = "1000ms"          # optional; default: OS-specific

  # Substitution rules

  [inputs.some_plugin.grok.substitution.1]
    match = "(?P<foo>some_regex)\r\n"   # mandatory
    replace = "{{foo}}\n"               # optional, default: ""
    only = [1, 3, 5, 7]                 # optional: default: [] for all matches

  [inputs.some_plugin.grok.substitution.2]
    match = "(?P<bar>another_regex)\r\n"
    replace = "{{bar}}\n"

Or just setting the newline = "" and using the "join rules"... ?
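
A substitution rule of the kind sketched in the config above could behave roughly like the following Python function. It is purely illustrative: `substitute` is a hypothetical name, the `{{name}}` placeholder syntax and the 1-based `only` indices are taken from the example config, and nothing here corresponds to an existing Telegraf option.

```python
import re


def substitute(text, match, replace="", only=()):
    """Apply one substitution rule to text.

    {{name}} in replace is filled from the named group of each match;
    only lists the 1-based match numbers to rewrite (empty means all).
    """
    seen = 0

    def rewrite(m):
        nonlocal seen
        seen += 1
        if only and seen not in only:
            return m.group(0)  # leave this occurrence untouched
        return re.sub(r"\{\{(\w+)\}\}", lambda t: m.group(t.group(1)), replace)

    return re.sub(match, rewrite, text)
```

For example, substitute("a\r\nb\r\nc\r\n", r"(?P<foo>\w)\r\n", "{{foo}}\n") rewrites every CRLF to LF, while passing only=[1, 3] leaves the second line ending as it was.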

danielnelson commented 5 years ago

Do you think using universal newlines would be enough? Essentially a newline is [\r\n]+.
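
In Python terms, universal newlines in this sense would amount to the split below (a sketch with a made-up function name). One consequence is worth noting: because any run of CR and LF counts as a single separator, empty lines disappear from the result.

```python
import re


def split_lines(text):
    """Split on any run of CR/LF characters and drop empty leftovers."""
    return [line for line in re.split(r"[\r\n]+", text) if line]
```

So "a\r\nb\rc\n\nd\n" yields ["a", "b", "c", "d"] whichever convention the sender used.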

I'm not sure exactly how to do the join rules, but the other main use case is to parse multiline log messages such as logged stack traces. Logstash has a multiline codec for this: https://www.elastic.co/guide/en/logstash/2.4/plugins-codecs-multiline.html
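
One possible reading of a join rule, loosely modeled on the "pattern" plus "what = previous" idea of that codec, is sketched below in Python. The function name and the default pattern (lines starting with whitespace) are assumptions chosen for the example.

```python
import re


def join_lines(lines, pattern=r"^\s"):
    """Append every line matching pattern to the line before it."""
    joined = []
    for line in lines:
        if joined and re.search(pattern, line):
            joined[-1] += "\n" + line  # continuation of the previous event
        else:
            joined.append(line)  # start of a new event
    return joined
```

A stack trace whose frames are indented would then reach the parser as one multi-line event instead of one event per frame.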

mjf commented 5 years ago

@danielnelson I would just like to see it configurable (with some sane default). For parsing the results, the general [\r\n]+ may be good enough, but it may not be for sending newlines to the remote party! The latter should be configurable (which I have been suggesting from the beginning).

The way Logstash does the joining looks quite good, but I have not had time to dive into it a little more. (I will soon, I hope, sorry.)

I hope it's clear, but the terminator in the last example I provided is the general transmission terminator (i.e. EoF, EoT, the fact that the socket was closed by the remote party, etc.). It should be used to determine whether we have collected all the data that needs to be parsed or only have incomplete data (which may or may not harm the parser, I cannot know). For instance, something is sending HTTP headers to us, we wait for an empty line, and once we have it we close our connection. I think some limited "heuristic" should exist...
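
The terminator idea can be sketched independently of any socket code. The Python function below consumes an iterable of received byte chunks and reports whether the data is complete; the name `collect` and the (data, complete) return shape are assumptions for this example, and a timeout would simply show up as the chunk source ending early.

```python
import re


def collect(chunks, terminator=None):
    """Accumulate chunks until terminator matches or the input ends.

    Returns (data, complete). Without a terminator, end of input (EOF)
    is itself the terminator, so the data counts as complete.
    """
    pattern = re.compile(terminator) if terminator else None
    data = b""
    for chunk in chunks:
        data += chunk
        if pattern:
            found = pattern.search(data)
            if found:
                return data[:found.end()], True  # stop reading here
    return data, pattern is None
```

With terminator = rb"(\r\n){2}" the HTTP-header case stops at the first empty line. If the peer closes or times out before that, the caller gets complete = False and can decide whether to parse or discard the partial data.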

SudoNova commented 2 years ago

Hi. Without this issue being addressed, and since the syslog input plugin lacks the ability to set the ownership and mode of the Unix socket it binds to, it becomes impossible to use the socket capability without messing with permissions. If you address it in the syslog input plugin, you will end up with duplicate code paths. Addressing this issue can help with lots of log-shipping scenarios.

mjf commented 2 years ago

Hello team.

Over the years I've found several protocols that don't have dedicated input plugins in Telegraf and that are, by nature, simple (mostly) textual protocols or interfaces. All of them could have been read with this sort of generalized input plugin! They range from specialized software providing some sort of textual interface (such as the OpenVPN mentioned in this feature request) to proprietary protocols (for example for gathering metrics from simple network-enabled thermometers or humidity sensors, et cetera). None of them is widely used enough to be worth even considering writing a dedicated input plugin for! Although the [inputs.exec] plugin can certainly be used to gather the data from them, it always has been and still is a very "ugly" and sub-optimal workaround (consuming unnecessary extra resources in addition). I therefore stand by my opinion that a generalized socket input plugin is worth the effort of writing and that it should work in a generalized send/expect/parse way (some inspiration can be taken from HAProxy's TCP textual and binary checks and the idea extended).

@danielnelson Daniel, I suppose there has been no progress since the time I created this feature request, right?

srebhan commented 1 year ago

To everybody here: I don't think we should do this because, no matter how you cut it, it will grow into a general-purpose protocol implementation language. I can already hear people asking for "but I need one more step to trigger data-sending" or "I need the value of the answer in my next request" etc. So if you need some protocol to gather data, implement a plugin for that protocol! We can reorganize the code so that it is easy to share with the socket_listener plugin etc...