sematext / logagent-js

Extensible log shipper with input/output plugins, buffering, parsing, data masking, and small memory/CPU footprint
https://sematext.com/logagent
Apache License 2.0
389 stars 79 forks

Standard HTTP output? #149

Closed alexbowers closed 5 years ago

alexbowers commented 5 years ago

Hi,

I'm trying to find out how to send the log lines to an HTTP server (not Elasticsearch or an rtail server).

Is this possible?

I've tried using logagent --http-proxy 'http://localhost:9200' -g 'logs/*.log' but no request gets made to the proxy server.

otisg commented 5 years ago

I don't see anything like that listed in output plugins. Maybe @megastef will know. JFYI, there's also a mailing list/forum/group for questions - see the bottom of https://sematext.com/logagent/

megastef commented 5 years ago

@alexbowers It would be interesting to know your exact use case. Which HTTP server would receive the logs?

The command above, logagent --http-proxy 'http://localhost:9200' -g 'logs/*.log', does not work because the Elasticsearch plugin is not activated (so no HTTP requests are made).

Try

nc -tl 9200 &
logagent -g 'logs/*.log' -i mylogs -e http://127.0.0.1:9200

Netcat will print:

{"index":{"_index":"mylogs","_type":"logs"}}
{"@timestamp":"2018-12-17T21:43:49.632Z","message":"ABC","severity":"info","host":"imac.local","ip":"192.168.2.113","logSource":"unknown"}

Note that the HTTP interface to Elasticsearch uses the Elasticsearch bulk indexing format (one line with the indexing command, one line with the data).
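For anyone receiving this output in a custom application, the pairing of command line and data line can be sketched as below. This is only an illustration in Python, not part of Logagent: the function name parse_bulk is hypothetical, and it assumes every action is an "index" action followed by exactly one document line, as in the Netcat output above.

```python
import json


def parse_bulk(body: str):
    """Pair each action line with the document line that follows it.

    Assumption: only "index"-style actions, each followed by one data line.
    """
    lines = [ln for ln in body.split("\n") if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected action/document line pairs")
    pairs = []
    for action_line, doc_line in zip(lines[0::2], lines[1::2]):
        pairs.append((json.loads(action_line), json.loads(doc_line)))
    return pairs


sample = (
    '{"index":{"_index":"mylogs","_type":"logs"}}\n'
    '{"@timestamp":"2018-12-17T21:43:49.632Z","message":"ABC","severity":"info"}\n'
)
for action, doc in parse_bulk(sample):
    print(action["index"]["_index"], doc["message"])  # prints: mylogs ABC
```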

It should be easy to implement a plugin with plain HTTP output (one JSON doc per HTTP POST request). But I would recommend using a small buffer and posting an array of logs.

If you are interested in contributing the HTTP output plugin, you could simply remove the Slack-specific code from the Slack output plugin:
Documentation: https://sematext.com/docs/logagent/output-plugin-slack/
Code: https://github.com/sematext/logagent-js/blob/master/lib/plugins/output/slack-webhook.js
Module name / alias definition: https://github.com/sematext/logagent-js/blob/master/bin/logagent.js#L63

Another interesting way might be to use the MQTT Broker output plugin; you can subscribe to the MQTT Broker via WebSocket.

I hope this helps. We would be glad to add an HTTP output plugin. On the other hand, we should specify a format (one log per POST request or multiple logs per request, line-delimited JSON format or JSON array, etc.).
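To make the two candidate body formats concrete, here is a small Python sketch that serializes the same batch both ways. The helper names and the sample events are made up for illustration; nothing here reflects a format Logagent had settled on at this point in the thread.

```python
import json


def encode_json_lines(events):
    """Line-delimited JSON: one object per line, joined by newlines."""
    return "\n".join(json.dumps(e) for e in events)


def encode_json_array(events):
    """A single JSON array holding all objects of the batch."""
    return json.dumps(events)


# Hypothetical log events, for illustration only
events = [
    {"message": "ABC", "severity": "info"},
    {"message": "DEF", "severity": "error"},
]

print(encode_json_lines(events))
print(encode_json_array(events))
```

With line-delimited JSON the receiver can process each line as it arrives, while a JSON array has to be read completely before it can be parsed.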

alexbowers commented 5 years ago

@megastef Thanks for the reply.

My interest is in trying to get the logs sent into a custom written application for us to do some looking at.

The logs are not necessarily typical Apache logs or anything like that; they may be custom logs that we want to push into a monitoring system that's built in-house.

We could store it in Elasticsearch and then try to query and pull the data out of there, but we'd like to get it directly into our system via an HTTP POST request to the application itself if possible.

I think having the batching makes sense, since it would be not ideal to saturate the application with post requests more than necessary.

I'll have a think on this a little more and have a look at how to do a raw HTTP plugin.

megastef commented 5 years ago

There is also an InfluxDB output plugin using HTTP: https://sematext.com/docs/logagent/output-plugin-influxdb/ and the Influx protocol sounds like a good choice for monitoring data.

The InfluxDB plugin also implements batching/buffering and uses the InfluxDB line protocol format. https://github.com/sematext/logagent-js/blob/master/lib/plugins/output/influxdb.js

Interesting: we have so many HTTP plugins that we should consider generalizing them and supporting N formats in the POST request :)

chiefy commented 5 years ago

Evaluating logagent, we intend to install the agent as a DaemonSet in Kubernetes workers and then tail the container logs. It'd be nice if we could just forward the logs to custom HTTP/TCP/UDP for further processing (like Logstash or Fluentd etc.) Was surprised to look at the output plugins and see nothing about forwarding raw messages?

otisg commented 5 years ago

What sort of further processing are you after? Perhaps that could be done with Logagent itself (and keep the ingestion pipeline simpler)?

megastef commented 5 years ago

@chiefy check the list of plugins: https://sematext.com/docs/logagent/plugins/ I don't understand why you need Logstash. Logagent can parse and enrich container logs ... and forward to ZeroMQ, Gelf, MQTT, Apache Kafka, Elasticsearch, InfluxDB, Files, Sematext Cloud

Setup on Kubernetes is described here: https://sematext.com/docs/logagent/installation-docker/

Please read https://sematext.com/blog/docker-container-monitoring-with-sematext/#toc-container-logs-0

megastef commented 5 years ago

Implemented output-http plugin with ld-json format (JSON lines) in https://github.com/sematext/logagent-js/commit/11ad607800cad165321fedae5a69cb3419fba5d8

Set maxBufferSize to 1 and logagent sends one JSON object per log event. Otherwise logagent sends N JSON objects separated by '\n'.
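The buffering behaviour described here can be pictured roughly as follows. This Python sketch is not the plugin's code (the plugin is JavaScript, see the commit above); the class EventBuffer, its tick method and the send callback are hypothetical names, and the timing logic is an assumption about how a flush interval and a maximum buffer size typically interact.

```python
import json
import time


class EventBuffer:
    """Collects events and flushes them as newline-separated JSON objects."""

    def __init__(self, send, max_buffer_size=1, flush_interval=1.0):
        self.send = send                      # callable that receives the body
        self.max_buffer_size = max_buffer_size
        self.flush_interval = flush_interval  # seconds
        self.events = []
        self.last_flush = time.monotonic()

    def add(self, event):
        self.events.append(event)
        # A full buffer forces a flush regardless of the interval
        if len(self.events) >= self.max_buffer_size:
            self.flush()

    def tick(self):
        """Call periodically; flushes pending events once the interval elapsed."""
        if self.events and time.monotonic() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        if self.events:
            self.send("\n".join(json.dumps(e) for e in self.events))
            self.events = []
        self.last_flush = time.monotonic()


sent = []
buf = EventBuffer(sent.append, max_buffer_size=2, flush_interval=60)
for msg in ("a", "b", "c"):
    buf.add({"message": msg})
print(len(sent))  # one body so far, holding the first two events
```

With max_buffer_size=1 every call to add sends a body containing exactly one JSON object, which matches the one-object-per-event case.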

See example config file: https://github.com/sematext/logagent-js/blob/master/config/examples/output-http.yaml

input: 
  stdin: true 

output: 
  http-forwarding:
    module: output-http
    url: http://127.0.0.1:8086
    # flush interval in seconds
    flushInterval: 1
    # max buffer size to force flush
    maxBufferSize: 1
    debug: true
    tags: 
      role: backend
      version: "1.0.0"
      region: eu
    # optional filter settings matching data field with regular expressions
    filter: 
      field: logSource
      match: .*

run

cat some.log | logagent --config output-http.yml

Contributions for alternative formats in the HTTP body are welcome ...
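On the receiving side, a custom application only needs to split the POST body on '\n' and parse each line. The following Python sketch uses the standard library's http.server and is meant as a starting point, not a reference implementation: the names parse_json_lines, LogReceiver and run_server are hypothetical, the port matches the url in the example config, and it assumes the request carries a Content-Length header.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_json_lines(body: bytes):
    """Parse a newline-separated JSON body into a list of events."""
    events = []
    for line in body.decode("utf-8").split("\n"):
        if line.strip():
            events.append(json.loads(line))
    return events


class LogReceiver(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumption: the client sends Content-Length (no chunked encoding)
        length = int(self.headers.get("Content-Length", 0))
        for event in parse_json_lines(self.rfile.read(length)):
            print(event)  # replace with the application's own processing
        self.send_response(204)
        self.end_headers()


def run_server(host="127.0.0.1", port=8086):
    """Blocks forever; call this from your own entry point."""
    HTTPServer((host, port), LogReceiver).serve_forever()
```

Calling run_server() before starting logagent with the config above should print one dictionary per log event.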

otisg commented 5 years ago

Doesn't "Buffer" make you think of "bytes"? @megastef maybe maxBulkSize would be better than maxBufferSize?