eht16 / python-logstash-async

Python logging handler for sending log events asynchronously to Logstash.
MIT License
186 stars 51 forks source link

Why do you add '\n' to each log line? #36

Closed melonux closed 5 years ago

melonux commented 5 years ago

https://github.com/eht16/python-logstash-async/blob/6bae7fbcb5be06f6b72db9b526fc6597061a0d29/logstash_async/handler.py#L134

It makes the outputs of StreamHandler and yours are not identical. Would you please remove it?

eht16 commented 5 years ago

The trailing new line is necessary for the UDP/TCP input in Logstash. Without the trailing slash, events will not be received properly in Logstash (in my tests only the first event succesfully made it into Logstash, all following events were lost).

What exactly do you mean by not identical in terms of the StreamHandler and this one? StreamHandler works completely different.

I just updated the documentation to mention the config should use the codec json_lines which was json before but Logstash automatically changes the codec to json_lines with the supported inputs: https://github.com/eht16/python-logstash-async/commit/ff2877a49fc1cf87c1b404a1a0bfe1690d7d20d0

melonux commented 5 years ago

I tried to send log without tailing slash.

echo -n "1" | nc -u 192.168.96.22 3333
echo -n "2" | nc -u 192.168.96.22 3333
echo -n "3" | nc -u 192.168.96.22 3333
echo -n "4" | nc -u 192.168.96.22 3333

It works and all go to logstash.

192.168.127.52 1
192.168.127.52 2
192.168.127.52 3
192.168.127.52 4

I also tried remove b'\n' from your source code. It works fine. Therefore it seems it is safe to remove b'\n'. Does it related to specific version of logstash?

eht16 commented 5 years ago

Hmm, I tested with Logstash 6.4.2 and 6.7.1 with the following config:

    tcp {
        host => "127.0.0.1"
        port => 5959
        mode => server
        codec => json_lines {}
    }

You are using UDP where it actually works because Logstash does not switch the codec to line resp. json_lines.

So yes, the trailing newline should probably only added for TCP and Beats transports but not UDP. Will do soon.

melonux commented 5 years ago

FYI. I'm using logstash-6.7.1 and following config:

input {
  tcp {
    port => 3333
    codec => json
  }
  udp {
    port => 3333
    codec => json
  }
}
output {
    file {
     path => "/data/logstash/test-%{+YYYY-MM-dd}.txt"
     codec => line { format => "%{host} %{message}" }
         flush_interval => 0
   }
}
echo -n "1" | nc  192.168.96.22 3333
echo -n "2" | nc  192.168.96.22 3333
echo -n "3" | nc  192.168.96.22 3333
echo -n "4" | nc  192.168.96.22 3333

It also works for TCP. No line is omitted.

eht16 commented 5 years ago

Your netcat tests are a bit misleading as you send always only one single message through a single connection. Python-logstash-async usually sends multiple messages using a single connection.

Hence I created a more realistic and comprehensive example: https://gist.github.com/eht16/ab255866316296cff999742bd13d0f39

There you can see for UDP the trailing new line doesn't matter but with TCP it is required otherwise only the first message is received by Logstash and the rest gets lost.

But I still don't see any \n in the Logstash output which this whole issue is about?

One more note: codec => json {} in the TCP configuration block doesn't make much sense because Logstash will change it to json_lines {} anyway. If you check your Logstash log, you should find a message like: [2019-04-07T11:03:18,725][INFO ][logstash.inputs.tcp ] Automatically switching from json to json_lines codec {:plugin=>"tcp"}

melonux commented 5 years ago

Thank you very much for your detailed explanation.