philhagen / sof-elk

Configuration files for the SOF-ELK VM
GNU General Public License v3.0

syslog sent live is not processed correctly #144

Closed: Melantrix closed this issue 5 years ago

Melantrix commented 5 years ago

I am setting up SOF-ELK as a home server to process a few syslog and NetFlow sources. NetFlow is working as intended now, but syslog is not being processed correctly.

I have a Pi-hole, and it is also running PassiveDNS. Every record that is sent in containing PassiveDNS information is processed and shows up in the dashboard.

However, when I send in syslog from my Synology, it is not processed correctly, and I think I have found where the error lies: in the file 1100-preprocess-syslog.conf there is a whole section that is supposed to process syslog entries with a PRI value at the start of a line. However, this section is only applied if the entry has a tag with the value "process_archive", so any live log will not be processed correctly. For live logs, only this will be done:

if "process_live" in [tags] {
      mutate {
        rename => {
          "program" => "syslog_program"
          "logsource" => "syslog_hostname"
          "pid" => "syslog_pid"
          "timestamp" => "syslog_timestamp"
        }
        add_tag => [ "got_syslog_timestamp", "got_syslog_hostname", "got_syslog_program" ]
        ### DEBUG
        ###add_field => { "orig_message" => "%{message}" }
      }
    }

    if "syslog" in [tags] {
      mutate {
        add_field => { "path" => "syslog from %{host}" }
      }
    } else if "relp" in [tags] {
      mutate {
        add_field => { "path" => "relp from %{host}" }
      }
    } else if "filebeat" in [tags] {
      mutate {
        add_field => { "path" => "filebeat: %{[host][name]}:%{source}" }
      }
    } else if !("file" in [tags]) {
      mutate {
        add_field => { "path" => "unknown syslog source" }
      }
    }
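
A minimal sketch of what extending this to live entries could look like (an illustration only, not the project's actual fix; the message_remainder field and got_syslog_pri tag are names assumed here for the example):

    # Hedged sketch: if a live message still carries a leading <PRI>, strip it and
    # decode facility/severity from it, similar to what the archive path does.
    if "process_live" in [tags] and [message] =~ /^<\d+>/ {
      grok {
        match => { "message" => "^<%{POSINT:syslog_pri}>%{GREEDYDATA:message_remainder}" }
        add_tag => [ "got_syslog_pri" ]
      }
      mutate {
        replace => { "message" => "%{message_remainder}" }
        remove_field => [ "message_remainder" ]
      }
      # the syslog_pri filter decodes the numeric value from its default "syslog_pri" field
      syslog_pri { }
    }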

The entry in the screenshot below is how a log entry shows up in the dashboard.

(screenshot: 2019-02-03_22-59-53)

I haven't looked into it further, but my guess is that it should be fairly easy to correct, as it's only an if statement that's keeping it from being processed correctly. I'm not very familiar with the inner workings of ELK, but if I have time I can look into this and see if I can straighten the if statements out a bit. Could you maybe take a quick look and see if what I'm seeing is correct and if this is the place to correct it? Then I know that I'm looking in the right place and am not wasting time trying to fix an issue that's not there :)

philhagen commented 5 years ago

that screenshot is super helpful - thank you!

It looks like you're using the IETF format for the logs - does the BSD format work?

I think the failed parse is a combination of a few things:

You're correct that the PRI is (currently) only pulled from the archive file ingest pipeline - I'd never seen that added in a live message, as the facility/severity are sent as a part of the syslog protocol itself. I can add that logic to the parser (might be a bit before I can get that all tested and merged to master), but I think we'd still see problems with the two other integers.

I can take a look, but any insight you can provide on the first integer would be greatly appreciated.
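
For reference (general syslog background, not specific to this report): a BSD-format (RFC 3164) message and an IETF-format (RFC 5424) message look roughly like the RFCs' own examples (BOM omitted):

    <34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8
    <34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - 'su root' failed for lonvick on /dev/pts/8

In both, the leading <PRI> encodes facility and severity as facility = PRI div 8 and severity = PRI mod 8, so <34> decodes to facility 4 (auth) and severity 2 (critical).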

Melantrix commented 5 years ago

Hi Phil,

I have checked and tested a bit:

First: BSD format is possible on the Synology side; I can choose to send it in BSD format, but somehow it does not go into Elasticsearch. If I check via the Discover tab, I only see the IETF format show up. Is there a way to check somewhere in the log files whether it gets processed? (I searched in /var/log but couldn't find a log file that looked related to this.) I did test BSD by sending it via both UDP and TCP, and it doesn't make a difference.

That brings me to the integer. I get the 136 integer when I send the IETF format via TCP, but when I send it via UDP it does not show up with the integer, so I'm guessing it's related to that? Going out on a limb here, but could it be a TCP flag integer or something? If I decode it correctly it would be the Urgent and SYN flags, so it doesn't sound logical, but you never know...

Attached is another screenshot that shows the different messages via TCP and UDP. (screenshot: 2019-02-05_22-15-36)

If you could give me some pointers on where I can look for the raw input, I can check what's going on with the BSD format and whether I can relate anything to the TCP integer. If there are any other questions or things that are unclear, I'm happy to help (and learn in the process :))
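
One hedged way to capture the raw input for inspection (an illustration only; the /tmp/raw-syslog-debug.json path is hypothetical, not something SOF-ELK ships) would be a temporary file output for live syslog events:

    output {
      if "process_live" in [tags] {
        # write every live syslog event, with all parsed fields, to a local file
        file {
          path  => "/tmp/raw-syslog-debug.json"
          codec => json_lines
        }
      }
    }

Tailing that file while sending test messages from the NAS would show exactly what Logstash receives and how it gets tagged.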

philhagen commented 5 years ago

Thanks for your patience - I think this is more an issue of bad data from the Synology, plus incomplete parsing on the part of the Logstash syslog plugin.

I tested sending BSD format, and the log was received and indexed appropriately, so I'm not sure what may have been the issue there. It may be a time zone issue between NAS, browser, and SOF-ELK. Anything NOT set to UTC will cause issues. My suggestion is to use BSD format until I can get full IETF format handling into the pipeline.

(screenshot: 2019-02-26_08-10-28)

Melantrix commented 5 years ago

Just for your information, I have created a support ticket with Synology, and this is their reply:

Thank you for waiting and terribly sorry for the delay.

According to our developers, this has been identified as a known issue. We are working hard to address this in a future release. Please kindly stay tuned for the update, and in the meantime, we apologize for any inconvenience caused.

Thank you for your report and continued support.

Melantrix commented 5 years ago

@philhagen I have received another message from Synology, which clarifies the <136> integer at the beginning of the IETF message. This is their response:

After further investigation by our developers, they found that the IETF format is defined in RFC 5424.

There are also RFC 5425, RFC 5426, and RFC 6587 designations of the IETF format, which cover the TLS, UDP, and TCP transports respectively.

According to RFC 6587 Section 3.4.1, this number refers to the log length. Its function is to split up multiple logs within TCP connections. Accordingly, Log Center's syslog-ng framework adheres to the RFC standard.

One possibility is that the log aggregator you are using may not be adhering to the RFC standard. Could you kindly double-check that to be sure?

Thank you!

Best regards,

If I understand correctly, it is part of a 'sub' RFC, so they have the RFC police on their side :P I'm not sure if this is the correct place to add this, as this issue is closed, but I will also add it to the #149 thread for completeness.
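
For completeness, a minimal sketch of stripping an RFC 6587 octet-counting prefix ("MSG-LEN SP SYSLOG-MSG") before the rest of the pipeline runs might look like the following; the field names are assumptions chosen purely for illustration, and this is not SOF-ELK's shipped configuration:

    filter {
      # RFC 6587 octet counting: a decimal length, a space, then the RFC 5424 message
      if "process_live" in [tags] and [message] =~ /^\d+ </ {
        grok {
          match => { "message" => "^%{POSINT:rfc6587_msg_len} %{GREEDYDATA:framed_message}" }
        }
        mutate {
          replace => { "message" => "%{framed_message}" }
          remove_field => [ "framed_message", "rfc6587_msg_len" ]
        }
      }
    }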