elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Filebeat Fortinet Fortigate Module #13245

Closed: philippkahr closed this issue 4 years ago

philippkahr commented 5 years ago

Filebeat Module for Fortinet FortiGate network appliances

Hi guys,

I am currently working on a module to map Fortinet, particularly FortiGate, log output into Elasticsearch. I already have a FortiGate setup with Logstash; however, I have always wanted to write a module and create the various mappings.

So far:

1. I copied the cisco module from the X-Pack section.
2. Renamed everything to fit Fortinet and FortiGate.
3. I am currently inserting all the fields into fields.yml before I can work on the pipeline processor.

That's where the first questions come to mind.

A Fortinet log line looks something like this:

date=2019-08-14 time=21:05:46 devname="FGT-1" devid="FG101E4Q17004281" logid="0001000014" type="traffic" subtype="local" level="notice" vd="root" eventtime=1565809548757679313 tz="+0200" srcip=198.168.100.2 srcport=4482 srcintf="vlan2001" srcintfrole="wan" dstip=212.12.112.206 dstport=9205 dstintf="root" dstintfrole="undefined" sessionid=3199275 proto=6 action="deny" policyid=0 policytype="local-in-policy" service="tcp/9205" dstcountry="Austria" srccountry="United States" trandisp="noop" duration=0 sentbyte=0 rcvdbyte=0 sentpkt=0 appcat="unscanned" crscore=5 craction=262144 crlevel="low" mastersrcmac="e0:5f:b9:ff:b5:01" srcmac="e0:5f:b9:ff:b5:01" srcserver=0

Handling GeoIP

It already incorporates a dstCountry field. Does it still make sense to run the GeoIP lookup in the pipeline? I am having difficulties with the field mapping of dstCountry. Should I just drop it and perform GeoIP enrichment to get real long/lat values? What is the best way to deal with that?

Handling KV processor

Well, since FortiGate sends proper RFC 5424 syslog, everything will be dissected properly. I want to run the KV processor on all the fields. But how do I do that if my grok pattern looks like this?

  - grok:
      field: message
      patterns:
        - "%{SYSLOG5424PRI:syslog_index}%{GREEDYDATA:message}"

Since I have to give the KV processor a field, a field separator, and a value separator, I am not quite sure how to set it up.

Mapping to the ECS schema

FortiGate has some weird naming conventions. I am also mapping the fields to the ECS schema. Should I just rename the fields when I receive them from the KV processor, or would aliases be a better fit?

Adding SIEM functionality

I have no idea where to start, could someone point me in the right direction?

philippkahr commented 5 years ago

@andrewkroh I hope it is OK that I ping you directly, as I have seen that you did a great deal of work on the cisco pipeline.

andrewkroh commented 5 years ago

It already incorporates a dstCountry field. Does it still make sense to run the GeoIP lookup in the pipeline?

Yeah, I would still run the geoip processor on the data even though it has country info already. The geoip processor will populate more fields. I assume that the dstCountry is also coming from some kind of database lookup (probably also MaxMind), so I don't think it's an issue to overwrite the value.
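In the ingest pipeline that could look something like this (a minimal sketch; it assumes the destination IP has already been renamed to its ECS field):

  - geoip:
      field: destination.ip
      target_field: destination.geo
      ignore_missing: true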

I am having difficulties with the field mapping of dstCountry. Should I just drop it and perform GeoIP enrichment to get real long/lat values? What is the best way to deal with that?

I'm not sure what difficulties you're having, but I think it should be mapped to destination.geo.country_name, which is part of ECS. And because it is part of ECS, you won't need to add anything to the module's fields.yml file.
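A rename in the pipeline could look like this (a sketch; fortinet.fortigate.dstcountry assumes the kv target_field shown below):

  - rename:
      field: fortinet.fortigate.dstcountry
      target_field: destination.geo.country_name
      ignore_missing: true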

I want to run the KV processor on all the fields. But how do I do that if my grok pattern looks like this?

I think it would be something like this:

  - grok:
      field: message
      patterns:
        - "%{SYSLOG5424PRI:syslog_index}%{GREEDYDATA:message}"
  - kv:
      field: message
      target_field: fortinet.fortigate
      field_split: ' '
      value_split: '='
      strip_brackets: true

Should I just rename the fields when I receive them from the KV processor, or would aliases be a better fit?

Renaming them is best.

I have no idea where to start, could someone point me in the right direction?

There's not much to do. If the data that your new module produces follows ECS, it will show up in the SIEM app. In 7.3 the SIEM app's widgets have an "Inspect" option that shows you the exact query being used. With that you can see exactly what the data needs to look like to appear in a given widget.
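As a rough illustration (the field choices here are illustrative, not the final mapping), the sample log line from above could end up as an ECS document shaped roughly like this:

  "@timestamp": "2019-08-14T21:05:46.000+02:00"
  event:
    module: fortinet
    action: deny
  source:
    ip: 198.168.100.2
    port: 4482
  destination:
    ip: 212.12.112.206
    port: 9205
    geo:
      country_name: Austria
  network:
    transport: tcp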

elasticmachine commented 5 years ago

Pinging @elastic/siem

philippkahr commented 5 years ago

I will continue to work on this module as soon as the KV processor works the same as the Logstash one, or at least honors the quotes (") around values. The FortiGate syslog output differs depending on which kind of rule triggers it, so the KV processor is my only choice for dealing with key-value pairs that appear and disappear.
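For reference, field_split in the ingest kv processor is a regex, so one possible workaround (a sketch, not necessarily what the final module uses) is to split only on spaces that are followed by the next key=, which keeps quoted values containing spaces such as srccountry="United States" intact, and to strip the surrounding quotes with trim_value:

  - kv:
      field: message
      target_field: fortinet.fortigate
      field_split: ' (?=[a-zA-Z0-9_\-]+=)'
      value_split: '='
      trim_value: '"'
      ignore_missing: true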

enotspe commented 4 years ago

We have developed a set of logstash pipelines for ingesting (and enriching) Fortinet logs. Right now we support Fortigate and Fortisandbox, but we are planning on supporting more products soon.

https://github.com/enotspe/fortinet-2-elasticsearch

BFLB commented 4 years ago

Hi @philippkahr, I need to collect FortiGate logs as well and have some experience from a past implementation on Nagios Log Server (which is Elasticsearch-based). What is the current state of your work? Is there anything you can share already? I would highly appreciate it and would like to collaborate. I think we should really push this toward an official Filebeat module. Thank you in advance. Bernhard

P1llus commented 4 years ago

After discussing with @andrewkroh I will be picking this up, focusing on making a fortinet module, starting with an NGFW fileset, and hoping to expand it with further filesets later down the line (those would be separate issues/PRs).

A PR will be linked to this once progress has been made.

Bernhard-Fluehmann commented 4 years ago

Hi @P1llus, this is great news. Thank you very much. In the meantime I have implemented basic FortiGate (NGFW) support using a standard file input, the CEF processor, and a custom ingest pipeline, partially ECS-conformant. If you are interested in the code, please let me know.

P1llus commented 4 years ago

Hello @BFLB. That's really great, thanks! While I will be focusing on the common syslog format for fortigate, we do have a CEF input for filebeat (syslog and logfile). If you have already made progress mapping CEF for fortigate, it might make a good addition there. Do you happen to have the implementation available publicly somewhere? Or would you like to keep that part closed (which is totally fine, by the way)?

Bernhard-Fluehmann commented 4 years ago

Hi @P1llus, I have not had time so far to provide you with the information. In the meantime I have seen that you have already implemented it. It is looking great. Thank you very much for your work. I am looking forward to trying it out soon.

P1llus commented 4 years ago

@Bernhard-Fluehmann Good to hear :) Yeah, the PR is a WIP, but I'm planning on adding all my local changes to it soon, so it should hopefully be completed in a couple of days if you want to test it from the master branch.

SimSama commented 4 years ago

I regularly write parsers for SIEM environments (non-Elastic) and would be interested in helping you guys with this. FortiGate firewalls have a few variations of the syslog header, and the (presumably grok) parser would need to catch any of them. The body of the log messages is standard key-value pairs.

FortiOS v5.4.x has one variation of the syslog header; v5.6.x to v6.x have two variations, assuming you use the default, non-CEF format. FortiWeb, FortiAnalyzer, FortiManager, etc. all have slightly different log header formats to identify the device.

I'll probably write my own parser in Logstash and will share it when done. I don't like the idea that every device type/format has to have a unique syslog target port, as I see in some examples above. Although it adds some complexity to the parsing, I think all devices should be able to send to the standard syslog port of UDP 514, and the grok parser should sift through known log format structures to determine the device type and parse the body of the log appropriately, as sketched below.
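In an ingest pipeline, that kind of header sifting could look roughly like this (a sketch; grok tries the patterns in order and keeps the first match, and the _tmp.* field names are just placeholders):

  - grok:
      field: message
      patterns:
        - "%{SYSLOG5424PRI:syslog_index}%{GREEDYDATA:message}"
        - "%{SYSLOGTIMESTAMP:_tmp.time} %{HOSTNAME:_tmp.host} %{GREEDYDATA:message}"
        - "%{GREEDYDATA:message}"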

Slightly related to this: while the "SIEM" aspect of Elastic is still early in its lifecycle, it really needs a central parser management capability. We should be able to define log parsers in Kibana and push them to managed Logstash instances (or Filebeat instances, as modules) on demand. Imagine the pain of having even 10 Logstash instances remotely collecting and forwarding logs, and having to track which one has which module, and at which version.

Just some thoughts, I hope this is the right place to talk about them.

P1llus commented 4 years ago

Hello @z0n3z3r0. Thanks for the feedback!

The current implementation has now been reviewed and merged into our master branch, and it does indeed use the default non-CEF header/format.

If you are interested in looking at the current mapping we have, you can find it here:

ECS mapping: https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-module-fortinet.html#_fortinet_ecs_fields

Fortigate fields: https://www.elastic.co/guide/en/beats/filebeat/master/exported-fields-fortinet.html

Filebeat modules are structured so that one module can have multiple filesets. Each fileset can have its own parser, and the current release has the "firewall" fileset, which is for FortiGate and will only accept the format presented by FortiGate sources.

My long-term goal here is to add the other sources as filesets, so that the module can support the other products in the portfolio as well, like the ones you mentioned above, but there is no timeline for that yet (or certainty that it will happen).

If you find any issues in the parsing or mapping, feel free to create a new issue or a PR and we will take a look :) The current parsers are here, starting with pipeline.yml, which branches out to one of three other pipelines depending on the event type: https://github.com/elastic/beats/tree/master/x-pack/filebeat/module/fortinet/firewall/ingest
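For orientation, the on-disk layout of the module is roughly this (an approximate sketch, not an authoritative listing):

  x-pack/filebeat/module/fortinet/
    firewall/              # the "firewall" fileset (FortiGate)
      config/              # input configuration templates
      ingest/
        pipeline.yml       # entry pipeline, routes by event type
      manifest.yml         # fileset definition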

When it comes to parser management, I think it would need a separate topic on our discussion forums or in a separate issue, the roadmap is not something we can openly disclose, but feel free to look around the public github issues.

Liqui12 commented 4 years ago

Hi team, I am also facing parsing issues with the Filebeat fortinet module. My log lines have a timestamp and an IP prepended, like this:

2020-08-10T11:00:00+03:00 172.2.3.3 date=2020-08-10 time=10:59:58

Instead of reading the logs from the port, I am initially saving all the logs to a file and then running the fortinet module to read that file. I suspect the parsing error is due to the timestamp and IP inserted in front of the logs.

Thanks

P1llus commented 4 years ago

@Liqui12 I would recommend creating a new issue, since this one is closed. We do parse from files as well without any issues, as all our test data is read from files, so that by itself should not be the problem.

Would you be able to create a new issue with a test file that has 2-3 lines of data that you cannot parse? Feel free to anonymise any part of it that you want.
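If the prepended timestamp and IP do turn out to be the cause, a pre-processing grok step along these lines could strip the prefix before the module's own parsing runs (a hypothetical sketch, with _tmp.* as placeholder field names, not a supported module option):

  - grok:
      field: message
      patterns:
        - "%{TIMESTAMP_ISO8601:_tmp.forwarded_time} %{IP:_tmp.forwarder_ip} %{GREEDYDATA:message}"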