pfelk / pfelk

pfSense/OPNsense + Elastic Stack
https://pfelk.github.io/pfelk/

Suricata Syslog-NG fails to send data to log. #309

Closed swiftbird07 closed 3 years ago

swiftbird07 commented 3 years ago

Describe the bug

Syslog-NG cannot connect to the ELK server:

May 31 18:49:03 pPfSense syslog-ng[84145]: Syslog connection failed; fd='25', server='AF_INET(192.168.178.83:5544)', error='Operation timed out (60)', time_reopen='60'

To Reproduce

Steps to reproduce the behavior:

  1. Follow the config tutorial here step by step: https://github.com/pfelk/pfelk/wiki/How-To:-Suricata-on-pfSense
  2. See the error in the log and no logs in ELK

Elasticsearch, Logstash, Kibana:


 - Logstash logs:
```
[2021-05-31T19:02:52,620][INFO ][logstash.inputs.tcp      ][pfelk][962cec7469706e9430f859fa2972f67505fa42bb8184e56a3d98140a5f757266] Starting tcp input listener {:address=>"0.0.0.0:5544", :ssl_enable=>false}
[2021-05-31T19:02:52,623][INFO ][logstash.inputs.tcp      ][pfelk][pfelk-suricata] Starting tcp input listener {:address=>"0.0.0.0:5040", :ssl_enable=>false}
[2021-05-31T19:02:52,688][INFO ][logstash.inputs.udp      ][pfelk][pfelk-2] Starting UDP listener {:address=>"0.0.0.0:5141"}
[2021-05-31T19:02:52,703][INFO ][logstash.inputs.udp      ][pfelk][pfelk-1] Starting UDP listener {:address=>"0.0.0.0:5140"}
[2021-05-31T19:02:52,724][INFO ][org.logstash.beats.Server][pfelk][Beats] Starting server on port: 5044
[2021-05-31T19:02:52,736][INFO ][logstash.inputs.udp      ][pfelk][pfelk-haproxy] Starting UDP listener {:address=>"0.0.0.0:5190"}
[2021-05-31T19:02:52,798][INFO ][logstash.inputs.udp      ][pfelk][pfelk-2] UDP listener started {:address=>"0.0.0.0:5141", :receive_buffer_bytes=>"106496", :queue_size=>"2000"}
[2021-05-31T19:02:52,799][INFO ][logstash.inputs.udp      ][pfelk][pfelk-1] UDP listener started {:address=>"0.0.0.0:5140", :receive_buffer_bytes=>"106496", :queue_size=>"2000"}
[2021-05-31T19:02:52,798][INFO ][logstash.inputs.udp      ][pfelk][pfelk-haproxy] UDP listener started {:address=>"0.0.0.0:5190", :receive_buffer_bytes=>"106496", :queue_size=>"2000"}
[2021-05-31T19:02:52,829][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:pfelk], :non_running_pipelines=>[]}
```

Additional context

May 31 19:00:03 pPfSense syslog-ng[18498]: Syslog connection failed; fd='19', server='AF_INET(192.168.178.83:5544)', error='Operation timed out (60)', time_reopen='60'

Destination:

{ tcp("siem.myserver.com" port(5544) failover( servers("192.168.178.83") ) );

Edit:

I saw that some packets are actually coming through on the ELK server-side:

18:07:21.324842 IP pPfSense.fritz.box.29682 > localhost.5544: Flags [S], seq 49256776, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 4053693443 ecr 0], length 0
18:08:32.348214 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469549295 ecr 0], length 0
18:08:33.345994 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469550289 ecr 0], length 0
18:08:35.540675 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469552491 ecr 0], length 0
18:08:39.746386 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469556690 ecr 0], length 0
18:08:47.948405 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469564899 ecr 0], length 0
18:09:04.156079 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469581106 ecr 0], length 0
18:09:36.354825 IP pPfSense.fritz.box.18315 > localhost.5544: Flags [S], seq 3646265543, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3469613305 ecr 0], length 0
18:10:47.362318 IP pPfSense.fritz.box.28816 > localhost.5544: Flags [S], seq 1025063735, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 286242617 ecr 0], length 0
18:10:48.363086 IP pPfSense.fritz.box.28816 > localhost.5544: Flags [S], seq 1025063735, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 286243622 ecr 0], length 0
18:10:50.566597 IP pPfSense.fritz.box.28816 > localhost.5544: Flags [S], seq 1025063735, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 286245825 ecr 0], length 0
18:10:54.758961 IP pPfSense.fritz.box.28816 > localhost.5544: Flags [S], seq 1025063735, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 286250017 ecr 0], length 0
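These are repeated SYNs with no SYN-ACK coming back, so the connection is never accepted on the ELK side. A quick way to confirm whether anything is actually listening on that port (assuming the ELK server is a Linux host):

```
# on the ELK server: is any process listening on TCP 5544?
sudo ss -tlnp | grep 5544
```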

Hopefully someone can help.

a3ilson commented 3 years ago

@revere521 - are you still running pfsense w/syslog?

swiftbird07 commented 3 years ago

> @revere521 - are you still running pfsense w/syslog?

Yeah, I had to switch back to pfSense because of reliability problems. At the moment I run Suricata with plain syslog, but only because sending events via syslog-ng doesn't work. But even with syslog, the events arriving in Kibana are only first-stage populated (they land in the right pfelk-suricata-* index, but no fields are parsed); I think this is because pfelk doesn't like cut-off events.

So basically I don't have any usable Suricata events at the moment. Syslog-NG fails, and the syslog/firewall logs are cut off/garbled so that no alert is recognized.

a3ilson commented 3 years ago

@maof97 - the truncated messages were a problem with pfSense (not really a bug; they were adhering to the RFC); however, syslog-ng via TCP should alleviate the issue. I do not run pfSense natively, but there are past (closed) issues, and the wiki was devised after their solutions.

Hopefully @revere521 can confirm as I believe they are utilizing it.

revere521 commented 3 years ago

I have been using syslog-ng with pfsense 2.5.1 and suricata for a few months with no issues at all. I did my setup based on the info in the Wiki. @maof97 - let me know if you want to compare to my setup.

swiftbird07 commented 3 years ago

> I have been using syslog-ng with pfsense 2.5.1 and suricata for a few months with no issues at all. I did my setup based on the info in the Wiki. @maof97 - let me know if you want to compare to my setup.

Yeah that would be great!

revere521 commented 3 years ago

This is what I have in the Syslog-NG service settings in pfSense. Most of this is default, and the cert for my VPN is pre-populated in there; I don't think it matters.

[screenshot: SyslogNG1]

You create the inputs and outputs alongside the default ones:

[screenshots: SyslogNG2, SyslogNG3, SyslogNG4, SyslogNG5]

And my Suricata output settings:

[screenshot: Suricata_Log_output]

swiftbird07 commented 3 years ago

Yeah, I have these settings too. The only differences are that I don't have a certificate installed and that I don't log TLS/encrypted data to EVE. Do you also have a file in /etc/pfelk/config.d/01-inputs-custom.conf? I saw in another thread that you have to do that to accept the TCP connection on the ELK side:

## 01-inputs-custom.conf
input {
  tcp {
    id => "pfelk-3"
    type => "firewall-3"
    port => 5549
  }
}

Is this not necessary?

Since I am getting traffic to the ELK side but no response was logged in tcpdump, together with the fact that syslog-ng itself reports "Operation timed out (60)" (and not that no route was found, etc.), I think something is up on the Logstash side (or syslog-ng is bugged).
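One way to test the TCP handshake end to end (a sketch; assumes nc is available in the pfSense shell, using the IP/port from above):

```
# from the firewall: can we complete a TCP connection to the Logstash input?
nc -vz 192.168.178.83 5544
```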

revere521 commented 3 years ago

It pre-populated some CA/certificate info there, but I don't actually use it.
I have the default 01-inputs.conf from the repo, which listens on port 5040 for the Suricata EVE logs:

input {
  ### Firewall 1 ###
  udp {
    id => "pfelk-1"
    type => "firewall-1"
    port => 5140
    #ssl => true
    #ssl_certificate_authorities => ["/etc/logstash/ssl/YOURCAHERE.crt"]
    #ssl_certificate => "/etc/logstash/ssl/SERVER.crt"
    #ssl_key => "/etc/logstash/ssl/SERVER.key"
    #ssl_verify_mode => "force_peer"
  }
  ### Firewall 2 ###
  udp {
    id => "pfelk-2"
    type => "firewall-2"
    port => 5141
    #ssl => true
    #ssl_certificate_authorities => ["/etc/logstash/ssl/YOURCAHERE.crt"]
    #ssl_certificate => "/etc/logstash/ssl/SERVER.crt"
    #ssl_key => "/etc/logstash/ssl/SERVER.key"
    #ssl_verify_mode => "force_peer"
  }
  ### Suricata ###
  tcp {
    id => "pfelk-suricata"
    type => "suricata"
    port => 5040
    #ssl => true
    #ssl_certificate_authorities => ["/etc/logstash/ssl/YOURCAHERE.crt"]
    #ssl_certificate => "/etc/logstash/ssl/SERVER.crt"
    #ssl_key => "/etc/logstash/ssl/SERVER.key"
    #ssl_verify_mode => "force_peer"
  }
}

It depends on how divergent your setup is from what's in the repo - mine is current and looks for "pfelk-suricata" as the ID and "suricata" as the type on 5040.

I also use program-override ("suricata") to make sure syslog-ng sets that as the type for the log files. That's mentioned in the wiki, but not shown in the example; see the sketch below.
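For reference, the matching source definition in syslog-ng looks something like this (a sketch; the eve.json path is a placeholder, since Suricata names its log directory after the interface):

```
source s_suricata {
  # tail Suricata's EVE JSON output and force the program name to "suricata"
  file("/var/log/suricata/suricata_<iface-id>/eve.json"
    program-override("suricata")
    flags(no-parse)
  );
};
```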

Do your ports match? You set it up in both places as 5549?

swiftbird07 commented 3 years ago

> It depends on how divergent your setup is from what's in the repo - mine is current and looks for "pfelk-suricata" as the ID and "suricata" as the type on 5040.

My pfSense is on the newest version. Is there a way to upgrade pfelk and Elastic itself without breaking the installation? I am very cautious on the Elastic part, as I really don't want to lose any data.

> I also use program-override ("suricata") to make sure syslog-ng sets that as the type for the log files. That's mentioned in the wiki, but not shown in the example.

I have this override too.

> Do your ports match? You set it up in both places as 5549?

Yes, they are both the same, and ufw is off.

I will try your method of using just the one config.

revere521 commented 3 years ago

I'm not sure you could completely overhaul without losing data, and to be honest I'm not sure how far back you would need to go to hit the changes that sorted this out... but I'm sure a lot has changed, especially over the last year.

You could keep your indices for historic data, but hitting the repo now would undoubtedly bring breaking changes. I can say it works very well and is very stable... so the trade-off might be that getting this to work requires a new starting point.

swiftbird07 commented 3 years ago

> I'm not sure you could completely overhaul without losing data, and to be honest I'm not sure how far back you would need to go to hit the changes that sorted this out... but I'm sure a lot has changed, especially over the last year.
>
> You could keep your indices for historic data, but hitting the repo now would undoubtedly bring breaking changes. I can say it works very well and is very stable... so the trade-off might be that getting this to work requires a new starting point.

I looked it up: I have ES/Kibana version 7.12. That's not so old, right? I don't think that's the problem. I checked the installer script from pfelk, and according to it I have version 20.03, which is pretty recent.

revere521 commented 3 years ago

OK, I wasn't sure if it was much older. The version I appear to have is 21.02, and the version of the ELK stack itself shouldn't really matter; you can upgrade those with apt-get independent of the config without losing data (see the sketch below). The only way you would lose data is with significant field changes in the patterns (necessitating new indices).
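For the stack packages themselves, the upgrade path would look roughly like this (a sketch, assuming the stock packages from Elastic's apt repository):

```
# refresh package lists, then upgrade only the ELK packages in place
sudo apt-get update
sudo apt-get install --only-upgrade elasticsearch logstash kibana
sudo systemctl restart elasticsearch logstash kibana
```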

In your Logstash logs above I can see it's starting listeners for UDP on 5040 and 5041, but I don't see TCP on 5549... and the packets are going to 5544... It may be that the input config needs sorting out.

swiftbird07 commented 3 years ago

> In your Logstash logs above I can see it's starting listeners for UDP on 5040 and 5041, but I don't see TCP on 5549.

The first entry shows it:

[2021-05-31T19:02:52,620][INFO ][logstash.inputs.tcp ][pfelk][962cec7469706e9430f859fa2972f67505fa42bb8184e56a3d98140a5f757266] Starting tcp input listener {:address=>"0.0.0.0:5544", :ssl_enable=>false}

I think it's so different because I used a separate conf file at that time (but now I am using your method).

> and the packets are going to 5544... It may be that the input config needs sorting out

I changed the port to yours now, so I can be sure we are comparing the same setup. Also, I have OPNsense connected as well, and its Suricata works very well without integration problems of any kind.

I attached my conf.d folder. Maybe you can diff it or something, but I am pretty sure my settings are OK.

conf.d.2.zip

Are you also on Pfsense 2.5.1-RELEASE (amd64) ?

revere521 commented 3 years ago

At a high level, it looks like everything is set up correctly; I didn't do a diff, though. It's possible @a3ilson might notice something there as well.

I am on pfSense 2.5.1-RELEASE (amd64), so we should be relatively similar, and I'm using the 6.0.0_10 package of Suricata and 1.15_7 of syslog-ng.

I also have my syslog set to RFC 5424 with RFC 3339 timestamps in the general logging settings for pfSense - maybe that is the issue?

[screenshot: pfSense general logging settings]
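For comparison, with that option enabled the alert would be framed in RFC 5424 style rather than the legacy format shown earlier in the thread, roughly like this (an illustrative line, not a capture):

```
<13>1 2021-06-09T03:41:36.439202+02:00 pfSense suricata - - - {"timestamp":"2021-06-09T03:41:36.439202+0200", ...}
```

The grok pattern in 03-filter.conf accepts both framings, since it allows an optional version digit and either SYSLOGTIMESTAMP or TIMESTAMP_ISO8601.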

a3ilson commented 3 years ago

Take a look at the defaults in 01-inputs.conf, lines 32-42.

You indicated that you are using TCP. Since you are not receiving logs, I suspect that you may need to configure your certificates and/or SSL. Those fields are commented out by default and will require access to your cert/key; uncommented, they look like the sketch below.
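Uncommented, the TCP input's SSL settings from the repo's default file read roughly as follows (YOURCAHERE/SERVER are the repo's placeholders for your own CA, cert, and key):

```
tcp {
  id => "pfelk-suricata"
  type => "suricata"
  port => 5040
  ssl => true
  ssl_certificate_authorities => ["/etc/logstash/ssl/YOURCAHERE.crt"]
  ssl_certificate => "/etc/logstash/ssl/SERVER.crt"
  ssl_key => "/etc/logstash/ssl/SERVER.key"
  ssl_verify_mode => "force_peer"
}
```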

swiftbird07 commented 3 years ago

> Take a look at the defaults in 01-inputs.conf, lines 32-42.
>
> You indicated that you are using TCP. Since you are not receiving logs, I suspect that you may need to configure your certificates and/or SSL. Those fields are commented out by default and will require access to your cert/key.

I tried switching to UDP on both sides, and now syslog-ng doesn't give an error anymore, and tcpdump definitely sees large packets coming in. The only problem is that the packets are still not processed into Kibana.

a3ilson commented 3 years ago

You may need to restart (purge indices) logstash and/or kibana.


swiftbird07 commented 3 years ago

> purge indices

Is systemctl restart elasticsearch kibana logstash enough to do that?

swiftbird07 commented 3 years ago

So I dug in a bit and took a closer look at the initial grok pattern in 03-filter.conf, comparing it against a syslog event I got from a captured pcap.

This is the example log I got of a suricata alert:

<13>Jun  9 03:41:36 pPfSense suricata: {"timestamp":"2021-06-09T03:41:36.439202+0200","flow_id":1904558625305506,"in_iface":"vtnet0","event_type":"alert","src_ip":"192.168.178.64","src_port":55494,"dest_ip":"192.168.178.87","dest_port":53,"proto":"UDP","alert":{"action":"allowed","gid":1,"signature_id":39867,"rev":4,"signature":"INDICATOR-COMPROMISE Suspicious .tk dns query","category":"Misc activity","severity":3},"app_proto":"dns","payload":"9ckBAAABAAAAAAAACWNvaW1pcmFjZQJ0awAAQQAB","stream":0}

And this is the Grok pattern in the 03-filter.conf:

grok {
  match => {"message" => "%{POSINT:[log][syslog][priority]}?(%{INT:[log][syslog][version]}\s*)?(%{SYSLOGTIMESTAMP:[event][created]}|%{TIMESTAMP_ISO8601:[event][created]})\s(%{SYSLOGHOST:[host][name]}\s+)?%{PROG:[process][name]}\s*?(\[)?%{POSINT:[process][pid]}(\]:)?\s*(\-\s*\-)?\s*%{GREEDYDATA:filter_message}|%{POSINT:[log][syslog][priority]}?(%{INT:[log][syslog][version]}\s*)?(%{SYSLOGTIMESTAMP:[event][created]}|%{TIMESTAMP_ISO8601:[event][created]})\s(%{SYSLOGHOST:[host][name]}\s+)?%{PROG:[process][name]}\:\s%{GREEDYDATA:filter_message}"}
}

I tried it in the Kibana Grok Debugger, and it gives me an error: "Provided Grok patterns do not match data in the input" (I used just the part of the grok pattern inside the double quotes).

I think it has to do with this, right? If Logstash can't match the event, it obviously won't show up parsed in Kibana.

Maybe a fix is needed? I am not good with grok, so maybe someone can take a look at it.
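Worth noting: the sample line carries the syslog priority in angle brackets (<13>), and nothing in the quoted pattern matches a literal < or >, which by itself would make the debugger report no match. A minimal pattern that does match the sample line above (illustrative field names, just for the Grok Debugger):

```
\<%{POSINT:priority}\>%{SYSLOGTIMESTAMP:ts} %{SYSLOGHOST:host} %{PROG:prog}: %{GREEDYDATA:msg}
```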

revere521 commented 3 years ago

It looks like your alerts are coming over as "ppfsense" instead of just "pfsense" - do you have an override in syslog-ng that adds an extra "p"?

Since processing starts with the matching in 02-types.conf, the observer name has to exactly match "pfsense", and then the observer type has to equal "suricata" to pass the parsing on to the next steps in 03-filter.conf and so on; a sketch of that gate is below.

That may actually be the failure point
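Conceptually, that gate is a conditional along these lines (an illustrative sketch, not the repo's exact 02-types.conf):

```
filter {
  # only events whose host and program names match get routed to the Suricata pipeline
  if [host][name] =~ /^pfsense/ and [process][name] == "suricata" {
    mutate {
      add_field => { "[observer][type]" => "suricata" }
    }
  }
}
```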

swiftbird07 commented 3 years ago

Ok changed it to "PfSense" now (didn't even see that). Still got nothing in Kibana sadly:

<13>Jun 11 00:47:14 PfSense suricata: {"timestamp":"2021-06-11T00:47:13.755419+0200","flow_id":1564620455249627,"in_iface":"vtnet0","event_type":"alert","src_ip":"192.168.178.64","src_port":64058,"dest_ip":"192.168.178.87","dest_port":53,"proto":"UDP","alert":{"action":"allowed","gid":1,"signature_id":39867,"rev":4,"signature":"INDICATOR-COMPROMISE Suspicious .tk dns query","category":"Misc activity","severity":3},"app_proto":"dns","payload":"ns8BAAABAAAAAAAAA3d3dw1jb21wZm9vdHBuaXJvAnRrAABBAAE=","stream":0}

revere521 commented 3 years ago

is the discover for pfelk-suricata-* completely empty?

swiftbird07 commented 3 years ago

> is the discover for pfelk-suricata-* completely empty?

Yes there is nothing.

revere521 commented 3 years ago

Does the Discover view for pfelk-firewall-* have any events with "suricata" in them (the missing packets) that are generating _grokparsefailures?

If the log messages are being received and getting to Logstash, Kibana should be putting them somewhere, even if only as parse failures.

If that's happening, something isn't being grok'd in your setup, like you suspected. If that's not happening, it may be some other issue with the logs being recognized early in the Logstash config, before any pattern matching.

Is this a business or something where you can't risk data loss?

If not, I would either purge the indices like @a3ilson suggested (stop Logstash, then in Kibana go to Management > Stack Management > Index Management and delete all the indices, then start Logstash again and let them all recreate; a command-line sketch is below),

or start over by running the installer to update all the files and configs - and you may still need to delete the indices.
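The purge can also be done from the command line (destructive; a sketch assuming Elasticsearch on localhost:9200 with no auth and wildcard deletes permitted):

```
# stop ingestion, drop every pfelk index, then let Logstash recreate them
sudo systemctl stop logstash
curl -X DELETE "http://localhost:9200/pfelk-*"
sudo systemctl start logstash
```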

I am running the most recent files from the repo, unmodified, with the pfSense settings I posted (my pfelk box and pfSense box are on the same LAN subnet at home, using TCP with no certs), and it all appears to work correctly.

swiftbird07 commented 3 years ago

I did that, but it did not work... In the end I just completely reinstalled ELK and pfelk, and now it works. Guess it was some random config I had overlooked or something. Thanks @revere521 and @a3ilson anyway for the great support. I hope you continue with your awesome project :)