Closed: stockf closed this issue 1 year ago.
Currently the interpreter will create an exact copy of the message, but that copy receives a new ID.
That means that, from the interpreter's view, the cloned message has not yet been processed on the streams it is routed to, so it will flow through the pipeline as a newly seen message.
The intention behind clone_message is that the rule then proceeds to modify the cloned message, sending it down a different path through the pipelines.
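For illustration, even a minimal rule like the following (hypothetical, and not something to attach to a live stream) will loop forever, because each clone inherits the original message's streams and therefore matches the same rule again:

```
// WARNING: do not run this against a real stream. The clone re-enters
// the same stream, matches this rule again, and clones itself endlessly.
rule "endless clone loop"
when
  true
then
  clone_message();
end
```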
What is the solution to this issue? What workaround can be done to send a message from one stream to another via pipeline?
As far as I remember, one workaround is to add a field, e.g. "do_not_process", before calling clone_message,
and to add a condition at the beginning of the rule that checks for this field. Like this (untested):
rule "clone only once"
when
$message.do_not_process == false
then
set_field("do_not_process", true);
debug("cloning message");
clone_message();
end
We ended up not using clone_message at all.
That didn't work. After adding clone_message, the throughput on that rule increased exponentially, leading to 100% heap usage, and Graylog crashed.
@graylog2 Team, this seems to be a serious bug.
Maybe you have to use has_field. I really haven't written a rule in months. Try to test it without actually cloning the message and use the simulator to debug. The bug has existed for more than a year; I doubt that the maintainers will fix it now.
I have the below rule connected to Stream1 and writing the message to Stream2:
rule "sampleRule"
when
to_string($message.field) == "somevalue"
then
let msg = clone_message();
route_to_stream(name: "Stream2", message: msg);
end
Is there anything wrong that you see?
The moment you call clone_message() it will put the cloned message through the same pipeline. So you need to alter the message first. Maybe like this:
rule "sampleRule"
when
NOT has_field("cloned_message")
then
set_field("cloned_message", true);
let msg = clone_message();
remove_field(field: "cloned_message", message: msg);
route_to_stream(name: "Stream2", message: msg);
end
Still guessing, though.
I tried your rule, but the same issue occurred.
To clarify the issue and provide potential solutions:
clone_message will always add the cloned message to the original message's streams. This can potentially cause an endless loop if this is not taken into account in the rules.
Here are two possible, verified workarounds:
rule "cloneMessage"
when
true
then
let newmsg = clone_message();
remove_from_stream(name: "Original Stream", message: newmsg);
route_to_stream(name: "New Stream", message: newmsg);
end
rule "cloneMessage"
when
not has_field("cloned")
then
let newmsg = clone_message();
set_field(field: "cloned", value: "true", message: newmsg);
end
The fact that cloned messages are added to the original message's streams is undocumented and unfortunate, since this makes it easy to trigger the issue.
One possibility would be to change clone_message so that it does not copy the original message's streams. However, to keep backwards compatibility and avoid breaking existing rules, the default behavior probably shouldn't change.
In addition to more comprehensive documentation, an extra parameter for the function would make the behavior clearer: either a boolean addToOriginalStreams, defaulting to true, or a string streams containing a comma-separated list of streams to add the cloned message to.
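If such a parameter were added (it does not exist today; the names here are only the proposal), usage might look like this:

```
// Hypothetical: addToOriginalStreams is a *proposed* parameter,
// not an existing clone_message argument.
let msg = clone_message(addToOriginalStreams: false);
route_to_stream(name: "New Stream", message: msg);
```

With the proposed default of true, existing rules would keep their current behavior.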
Avoiding endless loops is a bit trickier, but it should be addressed to prevent Graylog from becoming unresponsive. Endless loops can be triggered either by clone_message (relatively easily) or by create_message (by accident) if the created or cloned message is sent back into the original pipeline without being treated differently from original messages. There are a couple of ways to address this issue:
This could be handled by adding context information to generated messages and evaluating it in the processor. One drawback is that this could break existing rules which implement e.g. one of the workarounds above.
If messages are generated during pipeline execution, they are sent into the processing pipeline after the original message's execution. This can easily be detected, and a counter could be used to stop the processing of generated messages after a certain number of "generation cycles".

One problem here is choosing a sensible initial value for this counter. In theory, rules could be used to e.g. recursively generate messages, in which case the number would have to be relatively high; in most cases, however, a low number would probably suffice. The number could also be made configurable, but a default value for existing installations still needs to be found.

Another topic to discuss is what to do with the generated messages when the loop is broken. Should they be stored, or should a processing error be thrown and all messages (or all generated messages except the original) be rejected? How should the user be informed, via a system notification or a simple log warning? If possible, a system notification should only be shown for the first message causing the issue until it is cleared by the user.
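While a processor-level counter would have to live in Graylog's code, the same generation-counter idea can be approximated today inside a rule by carrying a counter in a message field. This is only a sketch: the field name clone_generation and the limit of 3 are arbitrary choices, not anything Graylog prescribes.

```
// Sketch: bound the number of clone "generations" via a counter field.
// The original message has no clone_generation field, so to_long()
// falls back to 0 and the rule fires; each clone carries an
// incremented counter until the limit stops further cloning.
rule "clone with generation limit"
when
  to_long($message.clone_generation, 0) < 3
then
  let msg = clone_message();
  set_field(field: "clone_generation",
            value: to_long($message.clone_generation, 0) + 1,
            message: msg);
end
```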
I already discussed these possibilities with @thll, but it would be great if we could decide which approach we would like to take.
In Graylog 3.0 using a rule that contains the clone_message() function creates an endless loop like described in #4722.
#5284 is related, but I believe that this is a bug in the code, not in the documentation.
Expected Behavior
The debug output is displayed once -> the rule is evaluated just once
Current Behavior
The debug output gets repeated over and over until Graylog becomes completely unusable
Possible Solution
It seems that clone_message() makes the cloned message immediately enter the same stream again, so it is evaluated in an infinite loop.
Context
We are trying to forward a long message to a different stream, then shorten the message field for display in the original stream. Since we want the shortened field only in the original message and not in the forwarded one, we forward a cloned message.
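Combining the remove_from_stream workaround above with this use case could look roughly like the following sketch (untested; the stream names, the "forwarded" marker field, and the 120-character width are placeholders):

```
// Sketch: forward an unmodified clone, keep a shortened copy locally.
rule "forward full message, keep shortened copy"
when
  not has_field("forwarded")
then
  // Send the full, unmodified clone to the other stream only.
  let msg = clone_message();
  set_field(field: "forwarded", value: true, message: msg);
  remove_from_stream(name: "Original Stream", message: msg);
  route_to_stream(name: "Forward Stream", message: msg);

  // Mark the original so this rule fires only once, then shorten
  // the message field on the original message only.
  set_field("forwarded", true);
  set_field("message", abbreviate(to_string($message.message), 120));
end
```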