philhagen / sof-elk

Configuration files for the SOF-ELK VM
GNU General Public License v3.0
1.49k stars 277 forks source link

Timezone probem #217

Closed edvrfn closed 3 years ago

edvrfn commented 3 years ago

The logs are being sent to filebeat 5044 syslog, Results in Kibana are 4 hours behind. For example Log sent by firewall 12:45AM Histogram for last 4 hours is from 3:45AM to 7:45AM Log received @timestamp as seen in Kibana is 4:15AM

So the histogram remains empty for 3hrs and 30 minutes. Changing timezone in kibana just hanges the time values, time delta and histogram remains. Any clue how can I fix this.

philhagen commented 3 years ago

are your records sent in UTC or a local time zone? All source records should be in UTC.

philhagen commented 3 years ago

also, have you changed the SOF-ELK VM's timezone configuration?

edvrfn commented 3 years ago

@philhagen Thanks for reaching back.. I checked the settings on my UDM and the only localization setting i see for timezone is under user interface and i think this has to do with statics it shows on dashboard only..besides VM timezone is in UTC And still for last 5hours window histogram only fills first 30 minutes..

[root@sof-elk ~]# timedatectl
      Local time: Thu 2021-05-06 18:36:21 UTC
  Universal time: Thu 2021-05-06 18:36:21 UTC
        RTC time: Thu 2021-05-06 23:18:44
       Time zone: UTC (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

++++++++++++++++++ And the JSON from a LOG message ++++++++++++++++++

{
  "_index": "logstash-2021.05.06",
  "_type": "_doc",
  "_id": "Z-f2QnkBaVadVBIc2bT4",
  "_version": 1,
  "_score": null,
  "fields": {
    "destination_geo.country_name.keyword": [
      "United States"
    ],
    "type": [
      "syslog"
    ],
    "source_ip": [
      "192.168.2.80"
    ],
    "path": [
      "syslog from 192.168.3.1"
    ],
    "protocol": [
      "TCP"
    ],
    "destination_ip": [
      "140.82.114.25"
    ],
    "destination_geo.asn": [
      36459
    ],
    "type.keyword": [
      "syslog"
    ],
    "destination_geo.country_code3": [
      "US"
    ],
    "input_interface.keyword": [
      "br0"
    ],
    "host": [
      "192.168.3.1"
    ],
    "destination_geo.country_code2": [
      "US"
    ],
    "output_interface": [
      "eth4"
    ],
    "host.keyword": [
      "192.168.3.1"
    ],
    "input_interface": [
      "br0"
    ],
    "priority": [
      0
    ],
    "ports.keyword": [
      "12166",
      "443"
    ],
    "destination_geo.continent_code.keyword": [
      "NA"
    ],
    "ips": [
      "192.168.2.80",
      "140.82.114.25"
    ],
    "source_mac.keyword": [
      "e8:b1:fc:83:eb:68"
    ],
    "tags": [
      "process_live",
      "syslog",
      "_grokparsefailure_sysloginput",
      "got_syslog_timestamp",
      "got_syslog_hostname",
      "got_syslog_program",
      "got_iptables",
      "parse_done",
      "_geoip_lookup_failure"
    ],
    "eth_type": [
      "08:00"
    ],
    "destination_geo.location": [
      {
        "coordinates": [
          -97.822,
          37.751
        ],
        "type": "Point"
      }
    ],
    "destination_mac.keyword": [
      "76:ac:b9:1e:f2:fc"
    ],
    "destination_geo.continent_code": [
      "NA"
    ],
    "destination_geo.latitude": [
      37.751
    ],
    "protocol.keyword": [
      "TCP"
    ],
    "destination_geo.timezone.keyword": [
      "America/Chicago"
    ],
    "source_geo.asnstr": [
      "ASN: Not Available"
    ],
    "destination_port": [
      443
    ],
    "destination_geo.longitude": [
      -97.822
    ],
    "tags.keyword": [
      "process_live",
      "syslog",
      "_grokparsefailure_sysloginput",
      "got_syslog_timestamp",
      "got_syslog_hostname",
      "got_syslog_program",
      "got_iptables",
      "parse_done",
      "_geoip_lookup_failure"
    ],
    "ports": [
      "12166",
      "443"
    ],
    "eth_type.keyword": [
      "08:00"
    ],
    "uptime.keyword": [
      "667249.233229"
    ],
    "destination_geo.country_code2.keyword": [
      "US"
    ],
    "source_port": [
      12166
    ],
    "@version": [
      "1"
    ],
    "destination_geo.asnstr": [
      "ASN36459: GITHUB"
    ],
    "severity": [
      0
    ],
    "destination_geo.as_org.keyword": [
      "GITHUB"
    ],
    "destination_geo.timezone": [
      "America/Chicago"
    ],
    "answer_geo.asnstr": [
      "ASN: Not Available"
    ],
    "destination_mac": [
      "76:ac:b9:1e:f2:fc"
    ],
    "destination_geo.country_code3.keyword": [
      "US"
    ],
    "message": [
      "<12>May  6 16:17:12 UDM,50c9676c6d24,udm-1.9.3.3438 kernel: [667249.233229] IN=br0 OUT=eth4 MAC=76:ac:b9:1e:f2:fc:e8:b1:fc:83:eb:68:08:00 SRC=192.168.2.80 DST=140.82.114.25 LEN=69 TOS=0x00 PREC=0x00 TTL=127 ID=40450 DF PROTO=TCP SPT=12166 DPT=443 WINDOW=512 RES=0x00 ACK PSH URGP=0 MARK=0x5 \n"
    ],
    "ttl": [
      127
    ],
    "source_mac": [
      "e8:b1:fc:83:eb:68"
    ],
    "uptime": [
      "667249.233229"
    ],
    "destination_geo.country_name": [
      "United States"
    ],
    "@timestamp": [
      "2021-05-06T18:35:44.400Z"
    ],
    "destination_geo.as_org": [
      "GITHUB"
    ],
    "output_interface.keyword": [
      "eth4"
    ],
    "facility": [
      0
    ]
  },
  "sort": [
    1620326144400
  ]
}
philhagen commented 3 years ago

Thanks for the detail. It looks like the message isn't being parsed at all, so the timestamp is not being pulled to replace the "current" time. The "message" field reflects 16:17:12 but "@timestamp" is 18:35:44. The _grokparsefailure_sysloginput tag makes it look like the internal syslog handler itself is not properly parsing the message... I'm seeing if I can trigger the syslog-archive date/program/pid/etc parsing without breaking anything else.

This appears to be because the UDM is not shipping proper syslog for some reason..

philhagen commented 3 years ago

alright this should fix it - I wasn't sure which branch you're on, but I this should hit all the current ones. run sudo sof-elk_update.sh and let it "take" (wait a few min for logstash to restart). Let me know if it seems to fix the problem.

I'm still puzzled why the logs are being shipped so long after their timestamp, but this should at least get things where they belong...

philhagen commented 3 years ago

it also looks like the logs are shipped with the controller time zone, so while it looks like yours are set ok, I was not able to get mine into UTC... (I only have it in local time so the Protect app's watermarks are local clock-correct....)

edvrfn commented 3 years ago

@philhagen Thank again for looking into this. I ran the script and now the histogram is exactly 7 hours blank. If there is any more info i can provide you, please let me know.

philhagen commented 3 years ago

Given the odd behavior of late (and possibly out of cycle??) messages from the UDM, can you expand your time window to something like "last 15 months" to see if the records are there and just not caught up? Are you certain the syslog packets are received (e.g. test with tcpdump and ensure firewall port is still open)?

The sample log entry you sent, as well as live data from my UDMP both parsed fine in my tests, so I'm unable to reproduce what you're describing.

philhagen commented 3 years ago

also, it wasn't clear from your comment if you were sending new log data to SOF-ELK. Only new data will be handled - anything previous or existing will not be re-parsed.

edvrfn commented 3 years ago

@philhagen Yes i am certain syslog messages are received but now i am thinking it is not sending logs as UTC but in Pacific Timezone, syslog_timestamp seems to be correct PDT.

root@sof-elk filebeat]# tcpdump -A -i ens33 port 5514 -c 5 -vv
tcpdump: listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
22:27:10.468372 IP (tos 0x0, ttl 64, id 4829, offset 0, flags [DF], proto UDP (17), length 344)
    gateway.54089 > sof-elk.5514: [udp sum ok] UDP, length 316
E..X..@.@............I...D5:<14>May 12 15:50:01 TDM,50c9676c6d24,udm-1.9.3.3438 stahtd: stahtd[27514]: [STA-TRACKER].stahtd_dump_event(): {"message_type":"STA_ASSOC_TRACKER","mac":"6e:3f:82:7b:c1:f4","vap":"ra0","event_type":"failure","assoc_status":"96","auth_failures":"3","event_id":"1","auth_ts":"276104.721905"} - n2p, n2auth, flags: 0x60

22:27:10.630466 IP (tos 0x0, ttl 64, id 4835, offset 0, flags [DF], proto UDP (17), length 316)
    gateway.54089 > sof-elk.5514: [udp sum ok] UDP, length 288
E..<..@.@............I...().<14>May 12 15:50:01 TDM,50c9676c6d24,udm-1.9.3.3438 stahtd: stahtd[27514]: [STA-TRACKER].stahtd_dump_event(): {"message_type":"STA_ASSOC_TRACKER","mac":"6e:3f:82:7b:c1:f4","vap":"ra2","event_type":"failure","assoc_status":"32","event_id":"1","auth_ts":"276104.721920"} - n2p, flags: 0x20

22:27:23.011547 IP (tos 0x0, ttl 64, id 5139, offset 0, flags [DF], proto UDP (17), length 316)
    gateway.54089 > sof-elk.5514: [udp sum ok] UDP, length 288
E..<..@.@............I...(..<14>May 12 15:50:15 TDM,50c9676c6d24,udm-1.9.3.3438 stahtd: stahtd[27514]: [STA-TRACKER].stahtd_dump_event(): {"message_type":"STA_ASSOC_TRACKER","mac":"ec:35:86:2e:59:a4","vap":"ra0","event_type":"failure","assoc_status":"32","event_id":"1","auth_ts":"276118.833916"} - n2p, flags: 0x20

22:27:23.082601 IP (tos 0x0, ttl 64, id 5146, offset 0, flags [DF], proto UDP (17), length 316)
    gateway.54089 > sof-elk.5514: [udp sum ok] UDP, length 288
E..<..@.@............I...(..<14>May 12 15:50:15 TDM,50c9676c6d24,udm-1.9.3.3438 stahtd: stahtd[27514]: [STA-TRACKER].stahtd_dump_event(): {"message_type":"STA_ASSOC_TRACKER","mac":"ec:35:86:2e:59:a4","vap":"ra2","event_type":"failure","assoc_status":"32","event_id":"1","auth_ts":"276118.833933"} - n2p, flags: 0x20

22:28:00.738699 IP (tos 0x0, ttl 64, id 5523, offset 0, flags [DF], proto UDP (17), length 325)
    gateway.54089 > sof-elk.5514: [udp sum ok] UDP, length 297
E..E..@.@............I...1r8<14>May 12 15:50:57 TDM,50c9676c6d24,udm-1.9.3.3438 stahtd: stahtd[27514]: [STA-TRACKER].stahtd_dump_event(): {"message_type":"STA_ASSOC_TRACKER","mac":"80:c7:55:e6:cf:49","vap":"ra0","event_type":"failure","assoc_status":"2048","event_id":"1","auth_ts":"276169.254837"} - n2whtctrl, flags: 0x800

and in below timestamp seems to be in UTC, while syslog_timestamp seems to be correct PT, and strangely received_at is behind PT

<html>
<body>
<!--StartFragment-->

@timestamp | 2021-05-12 08:52:28.000Z
-- | --
  | @version | 1
  | _id | eOyyYnkBZwRG47Aa91Y7
  | _index | logstash-2021.05.12
  | _score | -
  | _type | _doc
  | answer_geo.asnstr | ASN: Not Available
  | destination_geo.asnstr | ASN: Not Available
  | facility | 1
  | host | 192.168.3.1
  | message | mcad: mcad[27957]: wireless_agg_stats.log_sta_anomalies(): bssid=74:ac:b9:1e:f3:01 radio=rai0 vap=rai0 sta=6e:3f:82:7b:c1:f4 satisfaction_now=60 anomalies=tcp_latency
  | path | syslog from 192.168.3.1
  | priority | 0
  | received_at | 2021-05-12 15:29:26.354 -07:00
  | severity | 6
  | source_geo.asnstr | ASN: Not Available
  | syslog_hostname | tdm
  | syslog_program | tdm,50c9676c6d24,udm-1.9.3.3438
  | syslog_timestamp | May 12 15:52:28
  | tags | process_live, syslog, got_syslog_timestamp, got_syslog_hostname, got_syslog_program, got_syslog_timestamp, _gpfail, fail_sl03, got_syslog_hostname, _mutate_error, got_syslog_program, got_syslog_pid, _defaultparse
  | type | syslog

<!--EndFragment-->
</body>
</html>

I don't have a 15 year history. I ended up resetting my UDM and deploying a new sof-elk VM

philhagen commented 3 years ago

hmm.... All SOF-ELK processing assumes UTC unless TZ is explicitly stated in ISO8601 formatted timestamps. where is the fragmented HTML output from? Nothing in SOF-ELK should have any knowledge of TZ offsets except "+00:00".

I am at a loss about the weird order though. If anything, I saw messages arrive with a timestamp several-many minutes in the PAST, via some sort of queued unspooling. All I can think of here is that some sort of combination of the odd timezone offsets and the unidentified potential unspooling behavior could lead to this.

edvrfn commented 3 years ago

@philhagen here is what i do to get SOF-ELK started after importing the VM, just to give more info on the background, although this should not be related.

Add '-Djava.net.preferIPv4Stack=true' in jvm.options file to make it listen on IPv4, in both Elasticsearch and Logstash directories. For Elasticsearch I had to add another extra line in elasticsearch.yml file 'cluster.initial_master_nodes: node-1' as it won't start with just the line above. And make it listen on all interfaces using 0.0.0.0 Then run 'sudo sof-elk_update.sh' and forward logs to filebeats on port 5514

HTML is just how it was copied in the discover section, same place from where I had copied JSON earlier.

Thanks again for taking time to have a look on this.