Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org
Other
7.41k stars 1.07k forks source link

RawMessage cannot be extended #963

Closed dfch closed 9 years ago

dfch commented 9 years ago

There is an issue when extending the RawMessage class. Though it is possible to extend the class and add some additional properties to it, these properties will not be serialised properly by the Kafka journal. So when the codec later on tries to convert a RawMessage to a Message these additional properties are not available any more. If this is by design (not to be able to extend the class), then this class should be marked as final

RawMessage.java

kroepke commented 9 years ago

What is the use case for adding things to a raw message? Since the work is done on the IO thread it should be done as quickly as possible. We debated adding arbitrary metadata but couldn't come up with a convincing use case so far. If there is something we have missed let us know, please. On Feb 11, 2015 7:27 PM, "d-fens GmbH" notifications@github.com wrote:

There is an issue when extending the RawMessage class. Though it is possible to extend the class and add some additional properties to it, these properties will not be serialised properly by the Kafka journal. So when the codec later on tries to convert a RawMessage to a Message these additional properties are not available any more. If this is by design (not to be able to extend the class), then this class should be marked as final

RawMessage.java https://github.com/Graylog2/graylog2-server/blob/a5cec6ee646044d55789cd516569fb7a56a478fd/graylog2-plugin-interfaces/src/main/java/org/graylog2/plugin/journal/RawMessage.java#L65

— Reply to this email directly or view it on GitHub https://github.com/Graylog2/graylog2-server/issues/963.

dfch commented 9 years ago

We have the following situation where we would have needed that feature:

  1. We created an input filter plugin to accept messages via AMQP and use the graylog built-in transport.
  2. As the built-in AMQP filter only accepts GELF messages but we would expect a different format (especially we need to read the AMQP header information as well, not only the payload of the AMQP message), we extend the RawMessage class to add the header information as some kind of map to the RawMessage object.
  3. When decode() is called on our codec portion of the filter to return a Message object we wanted to use the header information and add them as additional fields to the Message via message.addField(). However during Kafka serialisation of the message only the RawMessage is serialised and all additional properties (from our extended class) are discarded. So the decode method does not see them any more and cannot add them to the Message object.

We found a another way around this, but actually struggled because the class was nor marked as final and thus wrongly presumed to be able to extend the class.

Is there a good or recommended way to achieve the described functionality? Thanks for suggestions! Ronald

bernd commented 9 years ago

We are serializing the RawMessage via protocol buffers. (see raw_message.proto and the generated JournalMessages class) As you already noticed, extending the RawMessage with additional fields in a subclass does not work at the moment.

Protocol buffers support extensions which allows third-party additions to protobuf messages. We might add this in the future to allow extension of the serialized RawMessage objects but we did not look into that in detail.

We will check if we make RawMessage final to avoid confusion in the future.

If I understand correctly, you built your own AMQP transport (to have access to the AMQP header information) and your own codec. So you could serialize the actual message from AMQP and the additional information into a byte[], put it into the RawMessage and parse it again in your codec to add additional fields to the resulting Message. The RawMessage payload is just a byte[] and if you control the codec you can add anything you want. (serialize to JSON, build your own protobuf message type, etc.) No need to extend the RawMessage for that.

As @kroepke said, remember that the RawMessage creation runs in the IO thread and should be fast.

I hope that helps! :smiley:

dfch commented 9 years ago

Hi Bernd, thanks for the feedback. And the way you described is actually the very same approach that we took after we found out that we could not extend the RawMessage. And I can also work with that very well. We just struggled a little bit as the class was not marked as final. So if you added this, it might be clearer (for others as well). And yes, we want our plugin to run as quick as possible (but the header information is crucial for us, so we need to extract it as well). So thanks again for the clarification, Ronald