When I run a MATCH query that includes a LIMIT, all the leaf nodes are still processed server side, even though they are not returned.
This means the result is correct in the end, but when parent nodes have many children, query time is huge.
The strange thing is that queries run in the OrientDB frontend do not have this issue, while the same query run through Ostico/PhpOrient does.
Expected Behavior
The expected behavior is that a MATCH operation takes the LIMIT into account and stops processing leaf nodes once the LIMIT is reached. In other words: the behavior should be the same as when MATCH queries are run in the OrientDB frontend.
Current Behavior
The current behavior is that during a MATCH operation all leaf nodes are processed, even when a LIMIT is set.
Strangely, this issue only occurs when querying through my scripts using PhpOrient; queries from the OrientDB frontend work as expected.
Steps to Reproduce
Create a new database
Run CREATE VERTEX V set name="parent"
Run 20 times: CREATE VERTEX V set name="child"
Connect parent + child: CREATE EDGE FROM (SELECT FROM V WHERE name="parent") TO (SELECT FROM V WHERE name="child"). Check your graph, parent should now link to all children.
Create a function called printText, language javascript, idempotent. Add a parameter named text. The function body should contain only: print(text);
Go to the OrientDB frontend, and arrange your screen so that you can see both your browser and the OrientDB console.
Run the following query:
select expand(d) from (MATCH {class: V, as:a, where:(name = "parent")}.out(){as:d, where:(printText($currentMatch) != '')} return d LIMIT 1)
Your console should output something like V#9:1{name:child,id:12:in_[#13:0]} v2. You will probably see this line twice.
Now set up a PHP environment with PhpOrient included.
Tinker in your PHP environment or create a script that runs the same query as above.
You will see that your console will output the line V#9:1.... not twice, but 20 times. This is unexpected behavior, and leads to a long query time.
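The PHP side of the steps above could look roughly like the following. This is a minimal sketch, not the exact script used for this report: the host, port, credentials, and the database name `limit_test` are placeholders, and it assumes PhpOrient was installed through Composer. Only the query itself is taken from the steps above.

```php
<?php
// Reproduction sketch. Connection details and the database name are
// placeholders (assumptions), not values from this report.
require 'vendor/autoload.php'; // assumes a Composer-based install of PhpOrient

use PhpOrient\PhpOrient;

$client = new PhpOrient();
$client->configure([
    'hostname' => 'localhost',
    'port'     => 2424,            // default binary protocol port
    'username' => 'root',          // placeholder server credentials
    'password' => 'root_password',
]);
$client->connect();
$client->dbOpen('limit_test', 'admin', 'admin'); // placeholder database

// Nowdoc keeps $currentMatch and the quotes literal (no PHP interpolation).
$sql = <<<'SQL'
select expand(d) from (MATCH {class: V, as:a, where:(name = "parent")}.out(){as:d, where:(printText($currentMatch) != '')} return d LIMIT 1)
SQL;

// Time the call so the slowdown is visible alongside the console output.
$start   = microtime(true);
$records = $client->query($sql);
$elapsed = microtime(true) - $start;

printf("%d record(s) returned in %.3f s\n", count($records), $elapsed);
```

While this runs, watch the OrientDB server console and count how many times printText prints a line; that count is what differs between the frontend and PhpOrient.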
Context (Environment)
The issue negatively affects query time in situations where many child nodes are present.
Possible Solution
I've been trying to dig into the code. During the query, execution slows down here: https://github.com/Ostico/PhpOrient/blob/9f16c2d943b82dd21fcb19c2f541527e932a064a/src/PhpOrient/Protocols/Binary/Abstracts/Operation.php#L310
when called from here: https://github.com/Ostico/PhpOrient/blob/9f16c2d943b82dd21fcb19c2f541527e932a064a/src/PhpOrient/Protocols/Binary/Abstracts/Operation.php#L127
My setup: