domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

Decoding json with Elastic Agent and beats #440

Open dmgeurts opened 11 months ago

dmgeurts commented 11 months ago

I'm trying to set up parsedmarc using an existing ELK stack, hence not shipping to localhost:9200.

Instead, I'm storing the JSON files and letting a Custom Beats agent policy collect them. So far so good. However, I'm stuck on the Filebeat config: the Filebeat documentation suggests using the decode_json_fields processor, but I don't yet know which fields the Kibana dashboard uses. https://www.elastic.co/guide/en/beats/filebeat/8.10/filtering-and-enhancing-data.html#decode-json-example
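For reference, a minimal sketch of the kind of Filebeat input I mean (paths and settings are my assumptions, not a known-good parsedmarc config). Note that decode_json_fields operates on one event at a time, and the log input emits one event per line, so a pretty-printed file or a multi-line JSON array will not decode cleanly:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/parsedmarc/*.json   # assumed output location
    processors:
      - decode_json_fields:
          fields: ["message"]   # decode the raw line
          target: ""            # merge decoded keys into the event root
          overwrite_keys: true
          add_error_key: true   # surface decode failures as error.message
```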

The error messages I see from Elastic Agent are:

[elastic_agent.filebeat][error] Error decoding JSON: unexpected EOF
[elastic_agent.filebeat][error] Error decoding JSON: json: cannot unmarshal string into Go value of type map[string]interface {}
[elastic_agent.filebeat][error] Error decoding JSON: invalid character '}' looking for beginning of value

This link has provided some inspiration: https://discuss.elastic.co/t/dec-4th-2022-en-ingesting-json-logs-with-elastic-agent-and-or-filebeat/319536

I was hoping not to need fancy grok filtering etc., but simply to import all the JSON data. I'll go read the Kibana dashboard JSON file now to see which fields are required, but if someone has managed to get this working with Filebeat, I'd love to see your config.

dmgeurts commented 11 months ago

I'm slowly making progress, but it would help to know what an entry in Elasticsearch is meant to look like.

dmgeurts commented 11 months ago

I've made some progress with a simple aggregate record. But with aggregate reports that contain multiple sources I end up stumped: JSON errors are thrown and I'm not sure whether they're due to duplicate fields or something else.

https://discuss.elastic.co/t/parse-single-array-json-elastic-agent/345558/3

dmgeurts commented 11 months ago

And then one finds out that the JSON in the log file is different from what's sent to Elasticsearch? Kibana looks for spf_aligned, and when I search for that text I only get a hit in elastic.py.
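If the dashboard expects per-record fields like spf_aligned, then the shipped documents presumably look flatter than the file output: one document per record rather than one per report. A rough illustration of that shape (field names here are my guesses from the file output, not parsedmarc's actual Elasticsearch mapping):

```python
def flatten_aggregate(report: dict) -> list[dict]:
    """Illustrative only: turn one aggregate report into one flat
    document per record, the shape a per-record dashboard field
    like spf_aligned implies."""
    meta = report.get("report_metadata", {})
    docs = []
    for record in report.get("records", []):
        alignment = record.get("alignment", {})
        docs.append({
            "org_name": meta.get("org_name"),
            "report_id": meta.get("report_id"),
            "source_ip": record.get("source", {}).get("ip_address"),
            "spf_aligned": alignment.get("spf"),
            "dkim_aligned": alignment.get("dkim"),
        })
    return docs
```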

I don't want to sound ungrateful, but I wish it were easier to use an existing ELK stack with parsedmarc when using Elastic Agent. I thought I read somewhere that the log file JSON was identical to what was sent to Elasticsearch; maybe I read that wrong?