Ostico / PhpOrient

PhpOrient - Official Php driver based on the binary protocol of OrientDB.

All leaf nodes processed during MATCH query, even with LIMIT set #106

Open mtveerman opened 5 years ago

When I run a MATCH query that includes a LIMIT, all leaf nodes are still processed server-side, even though only the limited number of results is returned.

This means the result is correct in the end, but when there are parent nodes with many children, query time is huge.

The strange thing is that queries performed in the OrientDB frontend do not have this issue, but the same query performed with Ostico/PhpOrient does have this problem.

Expected Behavior

The expected behavior is that a MATCH operation takes the LIMIT into account and stops processing leaf nodes once the LIMIT is reached. In other words: the behavior should be the same as when MATCH queries are run in the OrientDB frontend.

Current Behavior

The current behavior is that during a MATCH operation all leaf nodes are processed, even when a LIMIT is set.

Possible Solution

I've been trying to dig into the code. The query slows down here: https://github.com/Ostico/PhpOrient/blob/9f16c2d943b82dd21fcb19c2f541527e932a064a/src/PhpOrient/Protocols/Binary/Abstracts/Operation.php#L310

when called from here: https://github.com/Ostico/PhpOrient/blob/9f16c2d943b82dd21fcb19c2f541527e932a064a/src/PhpOrient/Protocols/Binary/Abstracts/Operation.php#L127

It is strange that this issue is only present when querying through my scripts using PhpOrient, while queries from the OrientDB frontend work as expected.

Steps to Reproduce

  1. Create a new database
  2. Run CREATE VERTEX V set name="parent"
  3. Run 20 times: CREATE VERTEX V set name="child"
  4. Connect parent + child: CREATE EDGE FROM (SELECT FROM V WHERE name="parent") TO (SELECT FROM V WHERE name="child"). Check your graph, parent should now link to all children.
  5. Create a function called printText, language javascript, idempotent. Add a parameter text. The function itself should contain only: print(text);
  6. Go to the OrientDB frontend, and set your screen such that you see your browser + the OrientDB console.
  7. Run the following query:

    select expand(d) from (MATCH {class: V, as:a, where:(name = "parent")}.out(){as:d, where:(printText($currentMatch) != '')} return d LIMIT 1)
  8. Your console should output something like V#9:1{name:child,id:12:in_[#13:0]} v2. You will probably see this line twice.
  9. Now setup a PHP environment with PhpOrient included.
  10. Tinker in your PHP environment or create a script that runs the same query as above.
  11. You will see that your console will output the line V#9:1.... not twice, but 20 times. This is unexpected behavior, and leads to a long query time.
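
For steps 9 to 11, a script along the following lines should be enough to reproduce the problem. This is only a minimal sketch: the host, port, database name (`match_limit_test`) and credentials are placeholders I made up, so adjust them to your own setup. It assumes PhpOrient was installed through Composer.

```php
<?php
// Minimal reproduction sketch. Host, port, database name and credentials
// below are placeholders (assumptions), not values from my real setup.
require 'vendor/autoload.php';

use PhpOrient\PhpOrient;

$client = new PhpOrient('localhost', 2424);
$client->dbOpen('match_limit_test', 'admin', 'admin');

// Same query as in step 7. The PHP string is single-quoted so that
// $currentMatch is sent to OrientDB literally and not interpolated by PHP.
$query = 'select expand(d) from (MATCH {class: V, as:a, where:(name = "parent")}'
       . '.out(){as:d, where:(printText($currentMatch) != \'\')} return d LIMIT 1)';

$start   = microtime(true);
$records = $client->query($query);
$elapsed = microtime(true) - $start;

// Only the limited result comes back, but the server console shows how
// many times printText was actually called.
printf("%d record(s) in %.3f s\n", count($records), $elapsed);
```

Watch the OrientDB console while the script runs: the number of printed lines shows how many leaf nodes were evaluated, independent of how many records the script reports.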

Context (Environment)

The issue is negatively affecting query time in situations where a lot of child nodes are present.

My setup: