siddhi-io / siddhi

Stream Processing and Complex Event Processing Engine
http://siddhi.io
Apache License 2.0
1.53k stars, 528 forks

Bug related to Pattern + Count? #788

Open rburton opened 6 years ago

rburton commented 6 years ago

Description:

I have the following query:

from every e1=SignalStream[signal == "login"] ->
    e2 = SignalStream[e2.customer == e1.customer AND e2.signal == "register"]<2> -> 
    e3 = SignalStream[e3.customer == e1.customer AND e3.signal == "logout"] 
select e1.client, e1.customer
insert into TriggeredStream

I emit the following events into Siddhi.

customer: rburton, signal: login
customer: jim, signal: login
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: logout

I would expect the handler not to be invoked, because there are 3 events with rburton + register and I set the count in the query to 2.

From my understanding, using patterns + count allows me to enforce a) the order of events and b) the multiplicity of events. E.g., register should only happen twice in the sequence above. If it exceeds two, it won't match.

I was going off of Patterns in the documentation

I suspect I'm either misunderstanding how to use Pattern + count, the joining of the sequence of events is wrong in my query, or there might be a bug?

Affected Product Version: Siddhi 4.1.13

rburton commented 6 years ago

Okay, what I uncovered is the following:

If I set the count to <3:3> as in the following:

from every e1=SignalStream[signal == "login"] ->
    e2 = SignalStream[e2.customer == e1.customer AND e2.signal == "register"]<3:3> -> 
    e3 = SignalStream[e3.customer == e1.customer AND e3.signal == "logout"] 
select e1.client, e1.customer
insert into TriggeredStream

Now if I send the following sequence of events, it works properly.

customer: rburton, signal: login
customer: jim, signal: login
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: logout

I was under the impression that if I were to send 4 register events, it would not match. But the query above actually does match.

customer: rburton, signal: login
customer: jim, signal: login
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: logout

If I try <3> it still matches when register is sent 3 or more times.

I guess my question now is: how do I enforce an exact number of matches? I tried <3:3>, <:3> and <3>, none of which work. According to the documentation, <3> should work.

e.g., With <5> matches exactly 5 events. found on this page

tishan89 commented 6 years ago

Hi, let me try to explain this behavior. In counting patterns we can match patterns based on the arrival order and multiplicity of events. In the above query we wait for one login event, then for two register events, and then for a single logout event. But we are not specifically instructing the query not to match in case of a third register event. The following query can achieve that.

from every e1=SignalStream[signal == "login"] ->
    e2 = SignalStream[e2.customer == e1.customer AND e2.signal == "register"]<2> ->
    not SignalStream[customer == e1.customer AND signal == "register"] and e3 = SignalStream[e3.customer == e1.customer AND e3.signal == "logout"]
select e1.client, e1.customer
insert into TriggeredStream

Here we specifically exclude the occurrence of more than 2 register events. Also, for documentation please use [1].

[1] - https://wso2.github.io/siddhi/documentation/siddhi-4.0/#pattern
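The difference between the two queries can be sketched with a toy model. This is only an illustration in Python of the behavior described in this thread, not Siddhi's implementation; the function name `matches` and the `forbid_extra` flag are made up for the example, and it looks at a single customer's events and a single login (so it does not model `every`).

```python
# Toy model only: NOT Siddhi's implementation. It mimics the behaviour
# described in this thread for one customer's events and one login.

def matches(signals, count, forbid_extra=False):
    """Return True if login -> register<count> -> logout matches.

    signals: signal names for one customer, in arrival order.
    forbid_extra: when True, models the added
        `not SignalStream[... signal == "register"] and e3 = ...` clause,
        so another register before logout kills the partial match.
    """
    state = "login"  # the pattern step we are currently waiting for
    seen = 0         # register events collected so far
    for signal in signals:
        if state == "login":
            if signal == "login":
                state = "register"
        elif state == "register":
            if signal == "register":
                seen += 1
                if seen == count:
                    state = "logout"
        elif state == "logout":
            if signal == "logout":
                return True
            if forbid_extra and signal == "register":
                return False  # the absent-event condition was violated
            # any other event is simply ignored while waiting for logout
    return False


three = ["login", "register", "register", "register", "logout"]
two = ["login", "register", "register", "logout"]

print(matches(three, 2))                     # plain count: extra register ignored
print(matches(three, 2, forbid_extra=True))  # with the "not ... and" clause
print(matches(two, 2, forbid_extra=True))    # exactly two registers
```

In this model the plain count only says how many register events to collect before moving on; once they are collected, a further register is just another event that is skipped while waiting for logout. The extra `not ... and` clause is what turns that further register into a reason to drop the match.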

rburton commented 6 years ago

I see what you're saying, but then I'm confused why the count syntax has a <min:max> for occurrences, as found in Counting Pattern.

When I read the documentation, how I read it was "You can limit the number of occurrences by doing <2> to restrict it to twice, <1:2> to provide min/max range for number of occurrences and <:3> to cap it off to max number of occurrences." that's also including the filter.

How I understand what you're saying: there's no min:max, or the max is totally ignored?

rburton commented 6 years ago

I tried to leverage the above example to do the following:

customer: rburton, signal: login
customer: rburton, signal: register
customer: rburton, signal: register
customer: rburton, signal: click
customer: rburton, signal: click
customer: rburton, signal: click

from every e1=SignalStream[signal == "login"] -> 
           e2=SignalStream[e2.customer == e1.customer AND e2.signal == "register"]<2> -> 
       NOT SignalStream[customer == e1.customer AND signal == "register"] 
       AND e3 = SignalStream[e3.customer == e1.customer AND e3.signal == "click"]<3> 
SELECT e1.client, e1.customer
INSERT INTO TriggeredStream;

When Siddhi parses the above statement, it doesn't like line 4 with <3> to indicate this event should happen 3 times.

org.wso2.siddhi.query.compiler.exception.SiddhiParserException: Error between @ Line: 4. Position: 0 and @ Line: 4. Position: 81. Syntax error in SiddhiQL, extraneous input '<' expecting {'[', '->', '#', SELECT, INSERT, DELETE, UPDATE, RETURN, OUTPUT, WITHIN}.

In this case, I'm not seeing a solution.

tishan89 commented 6 years ago

I am sorry I missed this, Richard. The issue with your query is that you are waiting for 3 click events after the NOT instruction. Here we are mixing logical and counting patterns, hence the parser error. I acknowledge your issue and will use this ticket to try out a proper fix, given that your requirement is valid. Hence I will reopen this. Once again, sorry for overlooking this.

rburton commented 6 years ago

This is one solution for handling it. I also noticed you can't mix Sequence and Patterns in the same query.

from every e1=SignalStream[signal == 'login'] ->
                  e2=SignalStream[e2.customer == e1.customer AND e2.signal == 'register']<2> ->
                 NOT SignalStream[customer == e1.customer AND signal == 'register']
              AND e3=SignalStream[e3.customer == e1.customer AND e3.signal == 'click'] -> 
                  e4=SignalStream[e4.customer == e1.customer AND e4.signal == 'click']<2>
SELECT e1.client, e1.customer
INSERT INTO TriggeredStream;

tishan89 commented 6 years ago

That's an elegant solution. I am looking into whether we can add that restriction through max count alone, since users perceive max count that way. I am not sure about the technical feasibility of this at this point, but I will take a look.