Vincit / objection.js

An SQL-friendly ORM for Node.js
https://vincit.github.io/objection.js
MIT License
7.26k stars 640 forks source link

JoinEagerAlgorithm slow in mysql #301

Closed xiaohanzhang closed 7 years ago

xiaohanzhang commented 7 years ago

Hi all, thanks for great work. I just wondering if it is possible to use left join table1 instead of left join (select * from table1) on eager join. seems join on table is much more faster than join on query in mysql. (0.001s vs 2.762s on my computer)

Thanks

koskimas commented 7 years ago

I only ran performance tests on postgresql and it was able to optimize the queries so that there was absolutely no difference between the two syntaxes. The select * from is there only because of missing knex methods for the join builder. If mysql is really that crappy, I quess we'll need to start using raw queries in the JoinEagerAlgorithm or open issues in knex.

xiaohanzhang commented 7 years ago

Thanks for the quick reply. I think mysql doesn't optimize subquery very well subquery-optimiztion Could you point me to the code where objectionjs call the "knex join" ?

elhigu commented 7 years ago

@koskimas was it that sqlite had this same perf issue? Which methods are missing from knex join builder?

koskimas commented 7 years ago

@elhigu The problem is the modifyEager function that needs to work for the eagered relations. The user is allowed to call any of the where and select methods knex has in the modifier function. These function calls need to be applied so that they filter the relation. The easiest way was to join a subquery and apply the wheres and selects on the subquery.

We could map the wheres into on statements for the join, but there are no on equivalent for most of the where methods in the knex join builder. Also we need to solve the select thing somehow. It's a bit tricky since in addition to whatever the columns the user selects for the relation, we need to select all the "join columns" so that we can "unflatten" the result into a tree structure.

So it's not just the missing methods, but also a bunch of other stuff.

fl0w commented 7 years ago

MySQL is (was?) notoriously terrible with sub queries. If I remember correctly, it got some improvements in 5.7. @xiaohanzhang what version are you using?

xiaohanzhang commented 7 years ago

@fl0w I just tried on 5.7, it dose much faster. I guess this issue is not very important then. But unfortunately, our server is still running on 5.5, and I don't think we will upgrade it any time soon.

koskimas commented 7 years ago

@xiaohanzhang How much faster? Is the 5.7 speed comparable to the non-subquery join?

xiaohanzhang commented 7 years ago

@koskimas There is no noticeable difference on my query in 5.7. (Simple left join with fk, no additional filter on subquery).

jeff-kilbride commented 7 years ago

It's nice to hear MySQL may have fixed this in 5.7. Unfortunately, I'm running on Amazon RDS and they only support 5.6, right now. Does anyone know how MariaDB performs with subqueries? I'm working with a relatively small dataset, so it hasn't been a problem, but I can see it growing in the future. The subquery in the select statements is the only thing holding me back from using the JoinEagerAlgorithm throughout my code. For now, I've been doing a lot of GROUP_CONCAT stuff and manually parsing it.

I was also surprised that knex doesn't support the ON clause very well. Seems like something that should be added at that level, rather than worked around in objection. I would be happy to support any issues you may open in the knex repo.

If I haven't said it enough, I want you to know that I Iove objection and appreciate all the work you guys have put into it! I've recommended it several times...

fl0w commented 7 years ago

As a note; I would un-label this as bug, as this is more of a MySQL thing, and possibly a documentation warning notice?

jeff-kilbride commented 7 years ago

Looks like knex 0.12.7 now has better support for ON clauses:

http://knexjs.org/#Builder-on

koskimas commented 7 years ago

Fixing this would be a huge amount of work and since MySql 5.7 can now handle the subqueries, I don't thing fixing this is worth the work.

JoinEagerAlgorithm now only uses the join (select * from foo) when there are wheres or selects specified for the relation using modifyEager or similar. When the join would be join (select * from foo) it is replaced with just join foo.

People using MySql pre 5.7 should do complex eager loading using the default where in based algorithm.