logstash-plugins / logstash-codec-protobuf

Codec plugin for parsing Protobuf messages
Apache License 2.0
26 stars 16 forks source link

Cannot decode string of bytes sent from pdns #15

Open jake2184 opened 6 years ago

jake2184 commented 6 years ago

I'm using pdns_recursor to send protobuf messages to logstash.

However the output of some fields is pretty incomprehensible.

The proto file is found here, and was converted using protocol-buffers.

Of particular annoyance are the 'from' and 'to' fields of the protobuf message. In the proto file they are sent as 'bytes', and looking at the pdns source code it looks like what is sent is a C++ string.

When decoded and printed using logstash stdout or passed into elastic search, instead of being given a nice string of the IP address (even in hex format and without '.' delimiter), I get:

"from" => "\n\x16\x02b" - which corresponds to 10.22.2.98

Changing the pb.rb file to attempt to decode as a string rather than bytes gives:

"from" => "\n\u0016\u0002b"

What I am trying to get is some form of consistent representation that I can work with to manipulate, with the end-goal of having "10.22.2.98" as the output.

IngaFeick commented 6 years ago

Hi @jake2184 , is this still relevant? I'm currently working on a new version and you could test-drive it in a couple of days, maybe it solves the issue.

jake2184 commented 6 years ago

I came to a rather hacky work around, involving unpacking the data:

event.set("from", event.get("from").unpack("C4").join("."));
event.set("to", event.get("to").unpack("C4").join("."));
event.set("messageId", event.get("messageId").unpack("H*").join(""));

And for the rData response, which may be IPv6 or IPv4:

response = event.get("response")
if ( response and response["rrs"] and response["rrs"][0] and response["rrs"][0]["rData"] )
   rdata = response["rrs"][0]["rData"]
   hex_value = rdata.unpack("H*").join("")
   ip_value =  rdata.unpack("C4").join(".")
   length_rdata =    rdata.unpack("L*").length
   if ( length_rdata >= 4 )
     event.set("[response][decoded_rdata]", hex_value )
   else
     event.set("[response][decoded_rdata]", ip_value)
   end
end

This is not great, but mostly works for the time being.

I look forward to a working solution and am happy to test, but cannot guarantee having time to do so (depending on work projects).

MangeshBhamre commented 6 years ago

Hi jake2184 ,

I am trying to do a similar thing with pdns_recursor and logstash. In which file did you add above code?

Regards, Mangesh

IngaFeick commented 6 years ago

@MangeshBhamre are you using protobuf 2 or 3?

MangeshBhamre commented 6 years ago

@IngaFeick protobuf 3. Latest.

In addition to above issue highlighted by @jake2184 . I get below errors. While i used logstash 5.2.x earlier this was working fine. Now I am on logstash 6.2.x

[2018-02-18T09:52:18,226][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<NoMethodError: undefined method `<<' for false:FalseClass>. [2018-02-18T09:52:19,205][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:19,209][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 13, expected 2 but got 7

<ProtocolBuffers::Field::MessageField:0x1b358717 @otype=:optional, @name=:response, @opts={}, @tag=13, @proxy_class=PBDNSMessage::DNSResponse>>.

[2018-02-18T09:52:19,414][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 9, expected 0 but got 6

<ProtocolBuffers::Field::Uint32Field:0x25d8638b @otype=:optional, @name=:timeSec, @opts={}, @tag=9>>.

[2018-02-18T09:52:20,208][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:20,477][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 9, expected 0 but got 1

<ProtocolBuffers::Field::Uint32Field:0x25d8638b @otype=:optional, @name=:timeSec, @opts={}, @tag=9>>.

[2018-02-18T09:52:20,901][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 8, expected 0 but got 7

<ProtocolBuffers::Field::Uint64Field:0x68544106 @otype=:optional, @name=:inBytes, @opts={}, @tag=8>>.

[2018-02-18T09:52:21,117][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 13, expected 2 but got 0

<ProtocolBuffers::Field::MessageField:0x1b358717 @otype=:optional, @name=:response, @opts={}, @tag=13, @proxy_class=PBDNSMessage::DNSResponse>>.

[2018-02-18T09:52:21,209][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:21,633][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:22,211][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:22,266][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:22,850][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:23,212][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:23,279][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 8, expected 0 but got 5

<ProtocolBuffers::Field::Uint64Field:0x68544106 @otype=:optional, @name=:inBytes, @opts={}, @tag=8>>.

[2018-02-18T09:52:23,280][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #. [2018-02-18T09:52:23,686][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 8, expected 0 but got 6

<ProtocolBuffers::Field::Uint64Field:0x68544106 @otype=:optional, @name=:inBytes, @opts={}, @tag=8>>.

[2018-02-18T09:52:23,902][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 9, expected 0 but got 3

IngaFeick commented 6 years ago

@MangeshBhamre the current version of the protobuf codec does not support protobuf 3 yet. No worries though, I'm about to write a PR for version 3 support. Can I interest you in testing the new version so that I can pick up your feedback before I send the PR?

MangeshBhamre commented 6 years ago

@IngaFeick Sure. I can test out.

How can I fix above errors?

IngaFeick commented 6 years ago

@MangeshBhamre with the 1.0.4 version that is the official latest: you cannot fix the errors because as I said, it doesn't support protobuf 3 yet. Please install this gem https://rubygems.org/gems/logstash-codec-protobuf/versions/1.0.5 and in your logstash input section, use the codec like so:

    {
      class_name => "Module.AnotherModule.Classname"
      include_path => ['file1_pb.rb','file2_pb.rb',file3_pb.rb']
      protobuf_version_3 => true
    }

The import change is the last line that activates the protobuf 3 library.

MangeshBhamre commented 6 years ago

@IngaFeick I installed protobuf coded using "logstash-plugin install logstash-codec-protobuf"

There is no url or version I used. How can I install the one you specified? Sorry, I am bit new here with logstash/protobuf.

IngaFeick commented 6 years ago

Hi @MangeshBhamre, no problem, let me quickly walk you through this. The command that you used will pull the latest official version. What you need to do instead is to download the gem file for the new version that I'm currently testing, from here: https://rubygems.org/downloads/logstash-codec-protobuf-1.0.5.gem Put it somewhere locally and then run

logstash-plugin install /path/to/downloaded/logstash-codec-protobuf-1.0.5.gem

You just need to change the last part of the command to the file that you pulled. Let me know if it worked, will you?

MangeshBhamre commented 6 years ago

Hi @IngaFeick

I successfully installed your gem. Also updated logstash config

codec => protobuf { class_name => "PBDNSMessage" include_path => ['/home/bitnami/dnsmessage.pb.rb'] protobuf_version_3 => true }

However logstash not starting... [2018-02-18T10:18:31,448][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"NoMethodError", :message=>"undefined method msgclass' for nil:NilClass", :backtrace=>["/opt/bitnami/logstash/vendor/local_gems/eb8a11f0/logstash-codec-protobuf-1.0.5/lib/logstash/codecs/protobuf.rb:80:inregister'", "/opt/bitnami/logstash/logstash-core/lib/logstash/codecs/base.rb:20:in initialize'", "/opt/bitnami/logstash/logstash-core/lib/logstash/plugins/plugin_factory.rb:90:inplugin'"

IngaFeick commented 6 years ago

@MangeshBhamre could be a problem with the class name. Can you show me the protobuf ruby file, dnsmessage.pb.rb please?

MangeshBhamre commented 6 years ago

Here is it on pastebin https://pastebin.com/1vbCKYVk

IngaFeick commented 6 years ago

@MangeshBhamre you're using protobuf 2, not 3. If you want to switch to proto3, which I recommend for performance reasons, then please use the official protobuf compiler to generate proto3 ruby files. Afterwards please send me a pastebin link to the new ruby class. You should get something like dnsmessage_pb.rb instead of dnsmessage.pb.rb. You will probably also have to change the class name in the config section like so:

codec => protobuf {
class_name => "PBDNSMessage.DNSResponse.DNSRR"
include_path => ['/home/bitnami/dnsmessage.pb.rb']
protobuf_version_3 => true
}

but I can confirm that only after you generated actual proto3 ruby files.

MangeshBhamre commented 6 years ago

@IngaFeick For me to use proto3, does that mean pdns also need to send me protobuf in ver3 format? (I guess so) If you look at .proto file from (https://doc.powerdns.com/recursor/lua-config/protobuf.html), it has syntax = "proto2"; I tried compiling this file with protoc (3.5.1) and its complaint about "syntax=proto2". This indicates to me that client and server both needs to be on same version. Please correct me here!

If I wish to use proto2 on logstash and resolve the error I am getting, how to go about it? As I said earlier, with logstash 5.2.x this worked fine, however for some reason we need to use logstash 6.2.x.

Thanks for all your help here.

IngaFeick commented 6 years ago

@MangeshBhamre

does that mean pdns also need to send me protobuf in ver3 format?

Not necessarily. You can read protobuf2 data with a protobuf3 definition. It's not an elegant thing to do though.

with logstash 5.2.x this worked fine, however for some reason we need to use logstash 6.2.x.

I haven't tested with logstash 6 yet. I will try to do that asap but it might take a couple of days, high workload currently. What you could test in the meantime is what jake did. As for your question

In which file did you add above code?

You'd put that in the filter section of your logstash config. There's a plugin called "ruby" that allows you to execute custom code like the one that he mentioned. The documentation is here: https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html Wanna give that a try?

MangeshBhamre commented 6 years ago

@IngaFeick I downloaded gems for all versions and installed it one-by-one. 1.0.0 did not give any errors to me. 1.0.2 and above throws all errors. Either 1.0.0 is not emitting the errors and consuming it internally or it's working correctly.

Is it ok to use 1.0.0 or are there any bugs and you recommend using 1.0.2 and above?

I will try ruby filter plugin you suggested. Looking forward to seeing 1.0.5+ to work for me without any errors and speed up my logstash

IngaFeick commented 6 years ago

@MangeshBhamre interesting observation. That's a helpful hint actually. I will look into it asap, thanks!

IngaFeick commented 6 years ago

@MangeshBhamre it should be safe to use 1.0.0, yes. The versions afterwards were related to speed improvement and code quality, afair, and not bugfixes.

MangeshBhamre commented 6 years ago

@IngaFeick I ran 1.0.0 overnight and found its too slow and missed almost 90% protobuf messages. I am using later version to not miss the events. It will be great if a newer version is available that address my list of errors.

IngaFeick commented 6 years ago

Hi @MangeshBhamre! Sorry for the late reply. We have upgraded codec to 1.1.0 with a fresher version of the google protobug lib under the hood. Maybe that tackles your problems? It's not officially released yet but you can download the version here for testing and then you install it by running

logstash-plugin install  /your_downloads/logstash-codec-protobuf-1.1.0.gem

Please note that the config has changed, you need to set protobuf_version => 3 now instead of protobuf_version_3 => true. Please let me know if this works for you, and if it doesn't and you still see data loss please describe how you measure that you are loosing messages. Thank you very much! :)

MangeshBhamre commented 6 years ago

Hi @IngaFeick

I tried it and not able to start logstash. Below are the errors seen. My config input { tcp { host => "0.0.0.0" port => 4040 codec => protobuf { class_name => "PBDNSMessage.DNSResponse.DNSRR" include_path => ['/home/bitnami/dnsmessage.pb.rb'] protobuf_version => 3 } } }

[2018-03-16T13:59:24,951][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.1.2"} [2018-03-16T13:59:25,488][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600} [2018-03-16T13:59:31,386][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"NoMethodError", :message=>"undefined method msgclass' for nil:NilClass", :backtrace=>["/opt/bitnami/logstash/vendor/local_gems/41a7e7a9/logstash-codec-protobuf-1.1.0/lib/logstash/codecs/protobuf.rb:99:inregister'", "/opt/bitnami/logstash/logstash-core/lib/logstash/codecs/base.rb:20:in initialize'", "/opt/bitnami/logstash/logstash-core/lib/logstash/plugins/plugin_factory.rb:90:inplugin'", "/opt/bitnami/logstash/logstash-core/lib/logstash/pipe

IngaFeick commented 6 years ago

@MangeshBhamre are you sure you are using a protobuf 3 file? The compiler typically generates _pb.rb files for version 3, and yours ends in .pb.rb. Unless you renamed the file manually you might want to check which pb version it compiled to. Make sure that it is really protobuf 3 and then please upload the definition to gist or somewhere so that I can check the class name. Thank you!

MangeshBhamre commented 6 years ago

Hi @IngaFeick Looks like I need to stick to protobuf 2. If you look at https://raw.githubusercontent.com/PowerDNS/pdns/master/pdns/dnsmessage.proto it says proto2.

If I change it manually to proto3 and try to compile with protoc3, I get error message as below for all 'optional' fields.

dnsmessage.proto:78:14: Explicit 'optional' labels are disallowed in the Proto3 syntax. To define 'optional' fields in Proto3, simply remove the 'optional' label, as fields are 'optional' by default.

Additionally, as you said above, "It's not an elegant thing to do though."

Without moving to protobuf 3, how can I solve the problem ?

IngaFeick commented 6 years ago

Hi @MangeshBhamre, I don't recommend sticking with pb2 because the new lib is really much faster. Let's approach this differently: I can convert your protobuf definition into a version 3 ruby file for you tomorrow or on sunday and then you just use that. Should do the trick. I will send you that file this weekend. If you cannot wait and want this fixed immediately: just remove the quantifiers from each row (required / optional) and change syntax = "proto2"; to syntax = "proto3";. Then throw the google compiler with --ruby-out against it and with a high probability it should work. If not then let's talk on the weekend. Cheers!

MangeshBhamre commented 6 years ago

Hi @IngaFeick

I did what you recommended. Removed optional/Required and set 'proto3'. Additionally enum in proto3 requires first enum value to be 0. Added dummy value as well. Got successfully compiled now.

I tried new _pb.rb on logstash, however didn't work. Go below error. No protobuf messages were processed. Needed to revert back to old code.

[2018-03-17T02:15:37,832][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.

Alternatively, I use your latest 1.1.0 gem with older logstash proto2 config and .pb.rb file. I see messages coming in. Still missing some messages and looks bit slow. Don't know if this is useful info. Anyways.

Thanks for all your inputs and work!

IngaFeick commented 6 years ago

@MangeshBhamre can you please provide the new _pb.rb file that you generated?

MangeshBhamre commented 6 years ago

@IngaFeick Here you go

_pb.rb - https://pastebin.com/7BTYBkLs

And edited .proto file as per your instructions - https://pastebin.com/JXWN6QSi

MangeshBhamre commented 6 years ago

@IngaFeick Hope above info is good enough for you. Let me know if you need anything more.

One important observation with ELK using this plugin over a long time (2-4 days)

Clearly, 1.0.5/1.1.0 is dropping documents due to errors OR 1.0.3 is faster. You may need to go back to 1.0.3 to see what it is doing right. Hope this helps.

IngaFeick commented 6 years ago

@MangeshBhamre

jake2184 commented 6 years ago

Still having many error messages from this.

It looks like the underlying library is throwing a lot of errors, e.g.: ProtocolBuffers::DecodeError: invalid message ProtocolBuffers::DecodeError: incorrect wire type undefined method '<<' for false:FlaseClass>

I am using the proto file generated from https://raw.githubusercontent.com/PowerDNS/pdns/master/pdns/dnsmessage.proto without edits.

I have updated the codec to 1.1.0 with no change.

IngaFeick commented 6 years ago

@jake2184 would you mind sharing your logstash input configuration? Thank you

jake2184 commented 6 years ago

Here: dns_logstash.txt

IngaFeick commented 6 years ago

@jake2184, you are using protobuf 2. The improvements that I announced are only available when you switch the codec to protobuf 3. Some simple steps are required:

jake2184 commented 6 years ago

As Mangesh described above, the sending software uses v2. I shouldn't need to upgrade to V3. I want to deviate from the v2 template as little as possible.

Are you stating that the plugin does not support v2 anymore?

I'm concerned with making it work, not about performance. Will upgrading to v3 fix any of the issues?

IngaFeick commented 6 years ago

Okay, background information: In the first versions of the plugin we have been using a protobuf library that was not written by google. Reason: the Google protobuf lib does not support protobuf 2, the other library did. Now more people are migrating to Protobuf 3 so it made sense to add that option. So we ship the codec with 2 completely different libraries and it uses the "old" one for protobuf version 2, and the official google one for protobuf 3. I cannot answer if the official google one will solve your problem, that is why I am asking you to test, but given the fact that it is written by google and very actively maintained, I think the probability of a bug fix for that particular problem is very high. Please be also aware that protobuf 2 and 3 are (mostly, and in your case most certainly) wire compatible. Meaning: The producer requires no changes. You keep sending protobuf 2 data, you just pretend that they are protobuf 3 when you parse them in logstash. I'm doing the exact same thing at work and it's fast and stable. Wouldn't recommend it otherwise :) You just need to compile the pseudo syntax 3 definitions once for ruby, but your producer keeps using the original syntax 2 definitions.

Also: of course the plugin still supports version 2. When you don't specify a version it will default to protobuf 2.

jake2184 commented 6 years ago

Thank you for that - it now makes more sense.

Upgrading to v3, I get the same errors as Mangesh above - invalid tag (zero).

In response to your comments, the producer of data is using the original v2 protobuf file from the PDNS website.

IngaFeick commented 6 years ago

@jake2184 thank you! Sorry to hear that. Can you estimate the percentage of messages for which this happens? Do they all fail or do some messages come through? If the 2nd applies, is it possible for you to log outgoing messages on the producer side and if so, could you identify and send me 1 or 2 of those that fail to be decoded?

jake2184 commented 6 years ago

With v3, no messages get through.

I can do a packet capture, which probably isn't much help, but I can't easily log what the software is sending - it's compiled C.

jake2184 commented 6 years ago

I managed to log a messages that were sent, using a Python decoder provided by PDNS: https://github.com/PowerDNS/pdns/blob/master/contrib/ProtobufLogger.py. The template was generated with protoc, but uses python not ruby/

The incoming serialized bytes:

08021210c398556ac5fb48c2b68a7d09b87f5e9e2001280132040a1622293a040a160263403448838bb5d80550908f125885a60162190a1365782d362e6f78616d2e696e7465726e616c2e100118016a30080012220a1365782d362e6f78616d2e696e7465726e616c2e1001180120b0092a040a16080a28838bb5d80530ed8e127a008a0100

Correctly decodes to:

[2018-05-29 13:20:19.296848] Response of size 52: 10.22.34.41 -> 10.22.2.99 (UDP), id: 21253, uuid: c398556ac5fb48c2b68a7d09b87f5e9e requestorid:  deviceid: 
- Question: 1, 1, ex-6.****.internal.
- Query time: 2018-05-29 13:20:19.296813
- Response Code: 0, RRs: 1
     - 1, 1, ex-6.****.internal., 1200, 10.22.8.10

I don't think the issue is with the sent data, but due to some part of the logstash decoding.

IngaFeick commented 6 years ago

Thank you @jake2184 , I will look into this. Will be next week though, due to vacation (sorry!).

jake2184 commented 6 years ago

No problem, thank you.

IngaFeick commented 6 years ago

@jake2184, couple of updates: first my apologies, I have given you incomplete instructions on how to migrate the protobuf definitions to version 3. You also need to make sure that all enums have an option with value 0. Here's a ruby version of it that definitely works, you can maybe use it for testing again while I look at the other stuff. It might (or might not) address the zero tag issue. Just guessing here.

Secondly I was able to reproduce the problem while decoding your example, using just the google lib, version 3 protobuf definition and a ruby script, so the "problem" is not rooted in logstash but rather in the prototype definition of this. Please note that there are two string-y datatypes used in this definition: bytes and string (for example in field 15). Looking at the datatype specification the only difference between those is that string must contain a valid ascii or utf-8 string, while bytes does not. Now I do not know why the author of that protobuf definition didn't use string for the fields that you expect to contain strings. Maybe they couldn't guarantee that it would really always contain strings and strings only? We can only guess. But, as a matter of fact, the behaviour of the library seems to be doing exactly what it is told to do. It's just that the definition and the intended usage don't fit to each other.

The cleanest way to fix this problem would be to change the protobuf definition for both consumer and producer to set this to string instead of bytes. I understand that this is out of scope for you though. Sadly I don't know how to help you here either, since the problem is provably not caused by logstash and I am not under the impression that this is even a bug in the google lib. My recommendation would be do continue using the modifications that you already implemented.

jake2184 commented 6 years ago

The .pb file I am using already has the Dummy variables, and matches completely the one you provided.

IngaFeick commented 6 years ago

@jake2184 👍 Did you see my explanation above on why this bug is actually desired behaviour?

jake2184 commented 6 years ago

Yes, I think what is not clear perhaps is the original issue topic doesn't matter so much, of being able to turn the bytes into an IP address - the Ruby code I wrote fixed that.

My current problem is the incessant errors being logged, and the fact when I upgrade to proto3 as described I cannot get any messages to decode.

IngaFeick commented 6 years ago

@jake2184 it has just been pointed out in https://github.com/logstash-plugins/logstash-codec-protobuf/issues/26 that we seem to be having a documentation issue when it comes to the kafka plugin specifically. You have to set the deserializer classes in your kafka config (outside the protobuf config) like so:

input {
    kafka {
        key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
        value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    }
}

Could you please test if that solves the issue? If this solves the problem: a thousand times sorry for not noticing this earlier. I'm using a different/custom version of the kafka plugin, so I didn't have to set this and never ran into the issue. 😬

jake2184 commented 6 years ago

As my config uploaded above shows, I am not using a kafka input. I am using tcp input, using the protobuf codec.

IngaFeick commented 6 years ago

@jake2184 my apologies. My mind was elsewhere :)

ZAZmaster commented 5 years ago

@jake2184 hi, did you manage to collect logs from pdns?