logstash-plugins / logstash-codec-protobuf

Codec plugin for parsing Protobuf messages
Apache License 2.0

Decode multiple classes from the protobuf stream #23

Closed: asidorenk03 closed this issue 4 years ago

asidorenk03 commented 6 years ago

Is it possible to decode the different pb definitions from the same datasource?

IngaFeick commented 6 years ago

Hi! As in: you have one kafka topic or whatever datasource and on that event stream you have messages that are encoded in different definitions? No, that's not currently decodable, sorry. I cannot think of an easy way to add this as a feature right now. If it's important but not urgent I could look into it in a week or two, but I recommend not to get your hopes up high.

felipe-conde-benavides commented 6 years ago

Hi, I have the following scenario:

I am receiving several message types on the same TCP input port, all of them in GPB format (they contain metrics information), so I really need to decode each message type using a different "proto definition". Is there any way to do that?

I have been thinking about setting up a separate input port per metric type, but this is not an acceptable solution from a maintenance point of view (too many ports and too many "tcp-senders" to set up).

IngaFeick commented 6 years ago

Good morning. With the current version of the codec, this isn't possible yet but I'd be happy to implement it if you can think of a way to specify the protobuf class per event. How would you identify which tcp events to decode with which class?

IngaFeick commented 5 years ago

Hi @felipe-conde-benavides Do you still need this or can we close the issue?

Sakorah commented 4 years ago

Hi,

I think I have the same or a similar requirement. I am trying to decode Cisco streaming telemetry protobufs, but I fail to do so...

I configured LS properly and it starts fine. In the compiled proto file I have two classes listed, and the proto file also lists 2 messages. The first message is the name of the interface; the second message holds the actual counters of that interface.

In the plugin I can only configure one class name, which translates to one of the messages.

jorgelbg commented 4 years ago

@Sakorah As @IngaFeick mentioned, the main issue with implementing this is how to distinguish, in a generic enough way, which class/protobuf definition should be used to decode which message, considering that the messages will be mixed.

I'm not familiar with the Cisco streaming telemetry implementation, but if I had to guess, I would say that they write the message type/length before each message, using a variable-length field. Prefixing each message with its size is mentioned in the protobuf documentation as a required step if you want to write multiple messages into the same stream.
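To make the framing idea above concrete, here is a minimal sketch in plain Ruby. It is not what Cisco or this codec does: the frame layout (a varint type id, then a varint length, then the payload bytes) and the helper names `encode_varint`, `read_varint`, `write_frame` and `each_frame` are assumptions made up for illustration. The protobuf documentation only describes writing the size before each message; the type id is an extra assumption so that a reader could pick a class per message.

```ruby
require "stringio"

# Encode a non-negative integer as a protobuf-style base-128 varint.
def encode_varint(value)
  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7
    if value.zero?
      bytes << byte
      break
    end
    bytes << (byte | 0x80) # continuation bit: more bytes follow
  end
  bytes.pack("C*")
end

# Read one varint from an IO. Returns nil on a clean end of stream.
def read_varint(io)
  result = 0
  shift = 0
  loop do
    byte = io.getbyte
    if byte.nil?
      return nil if shift.zero?
      raise EOFError, "stream ended inside a varint"
    end
    result |= (byte & 0x7F) << shift
    return result if (byte & 0x80).zero?
    shift += 7
  end
end

# Hypothetical frame layout: varint type id, varint payload length, payload.
def write_frame(io, type_id, payload)
  io.write(encode_varint(type_id))
  io.write(encode_varint(payload.bytesize))
  io.write(payload)
end

# Yield [type_id, payload] for every frame until the stream ends.
def each_frame(io)
  return enum_for(:each_frame, io) unless block_given?

  while (type_id = read_varint(io))
    length = read_varint(io)
    raise EOFError, "stream ended before the length prefix" if length.nil?

    payload = io.read(length) || "".b
    raise EOFError, "truncated payload" if payload.bytesize != length

    yield type_id, payload
  end
end
```

With frames like these, a consumer could look up the protobuf class by `type_id` and hand `payload` to that class for decoding. Whether a given sender actually frames its messages this way has to be checked against its own documentation.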

Sakorah commented 4 years ago

@jorgelbg I do not know the specifics of how multiple messages are encoded into one stream. From what I saw, the structure is always the same and is defined here: https://github.com/ios-xr/model-driven-telemetry/blob/master/protos/65x/cisco_ios_xr_infra_statsd_oper/infra_statistics/interfaces/interface/latest/generic_counters/ifstatsbag_generic.proto

So there is always the ifstatsbag_generic_KEYS message first and then the ifstatsbag_generic message.

jorgelbg commented 4 years ago

That proto definition just defines two types of messages; it does not necessarily mean that they are written like that on the wire.

Taking a closer look at the repository, it looks like the outer payload on the wire is written using https://github.com/ios-xr/model-driven-telemetry/blob/master/protos/65x/telemetry.proto#L25-L29 with nested raw protobuf messages inside https://github.com/ios-xr/model-driven-telemetry/blob/master/protos/65x/telemetry.proto#L147-L185. The actual proto definition used in the nested message depends on the value of the encoding_path field.

This looks like partially good news: the stream does not contain two mixed protobuf-encoded messages. There is only one Telemetry message with other, dynamic protobuf payloads inside, and the proto definition used to encode the nested messages depends on the value of encoding_path. There is still the additional issue of dynamically decoding a raw protobuf payload, which the codec does not support directly. After the outer payload is decoded, you could try using a ruby filter to decode the nested message with the right class (by checking the encoding_path field).
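A rough sketch of that dispatch step, in plain Ruby so it can be tried outside Logstash. Everything here is an assumption for illustration: the `DecoderRegistry` class, the hash keys `"rows"`, `"keys"` and `"content"`, and the idea that the outer message has already been turned into a Hash. The decoders only need to respond to `decode(bytes)`, which is the class method that message classes generated for the google-protobuf gem provide. The real field names should be checked against telemetry.proto.

```ruby
# Hypothetical registry mapping an encoding_path value to the pair of
# decoders (one for the keys message, one for the content message).
class DecoderRegistry
  def initialize
    @decoders = {}
  end

  # keys_decoder and content_decoder must respond to `decode(bytes)`.
  def register(encoding_path, keys_decoder, content_decoder)
    @decoders[encoding_path] = [keys_decoder, content_decoder]
  end

  # outer: a Hash built from the already-decoded outer message.
  # The "rows", "keys" and "content" names are assumptions of this sketch.
  def decode_rows(outer)
    path = outer.fetch("encoding_path")
    keys_decoder, content_decoder = @decoders.fetch(path) do
      raise KeyError, "no decoders registered for encoding_path #{path.inspect}"
    end

    outer.fetch("rows", []).map do |row|
      {
        "keys" => keys_decoder.decode(row.fetch("keys")),
        "content" => content_decoder.decode(row.fetch("content"))
      }
    end
  end
end
```

Inside a Logstash ruby filter, the outer Hash would come from fields read with `event.get`, and the decoded rows would be written back with `event.set`. Failing loudly on an unknown encoding_path is a design choice of this sketch; tagging the event and passing it through unchanged may be preferable in a live pipeline.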