kopylovvlad closed this issue 6 years ago
Oh, gosh, this is a good catch.
When you don't specify order, I think it is up to Neo4j how things are returned. And if you are making MATCHes (like with_associations does for you), then that could affect the order, as it's looking for a subgraph pattern.
But when you do specify the order, I actually see the problem. The limit, for some reason, is getting applied at the beginning of the query. That means that those first 20 are getting retrieved without respect for the order, and then the ORDER BY is applied later, after it's been through a couple of WITH clauses.
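To make the difference concrete, here is a minimal plain-Ruby sketch of the two clause orders. No Neo4j is involved; the Comment struct, the sample rows, and the two helper methods are made up for illustration and only mimic what LIMIT and ORDER BY do to a list of rows.

```ruby
# Plain-Ruby stand-in for the two clause orders; nothing here talks to a database.
Comment = Struct.new(:id, :created_at)

# Rows in the arbitrary order a MATCH might hand them back (made-up data).
ROWS = [
  Comment.new(3, 30),
  Comment.new(1, 10),
  Comment.new(4, 40),
  Comment.new(2, 20)
].freeze

# LIMIT first, ORDER BY later: only the rows that survived the limit get sorted.
def limit_then_order(rows, limit)
  rows.first(limit).sort_by(&:created_at)
end

# ORDER BY first, LIMIT after: the limit keeps the true first N rows.
def order_then_limit(rows, limit)
  rows.sort_by(&:created_at).first(limit)
end

p limit_then_order(ROWS, 2).map(&:id) # => [1, 3]
p order_then_limit(ROWS, 2).map(&:id) # => [1, 2]
```

With the limit applied first, the comment with id 2 can never appear in the result, no matter how the remaining rows are sorted afterwards.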
I don't have time to look at it right now, but if you'd like to take a stab I think the code is around here:
@klobuczek Might be able to help / give advice on this as he did a lot of refactoring on that file.
Hello. I did some research with Cypher and found that the position of "ORDER BY" matters a lot for the query result.
For example, I wrote 7 queries that put "ORDER BY" in different positions. Here are the queries:
# v1
MATCH (n:Comment)
RETURN n
ORDER BY n.created_at ASC
# v2
MATCH (n:Comment)
WITH n
OPTIONAL MATCH (n)-[post_rel:post]->(post)
WHERE (post:Post)
WITH
n,
[collect(post_rel), collect(post)] AS post_collection
OPTIONAL MATCH (n)-[author_rel:author]->(author)
WHERE (author:User)
WITH
n,
[collect(author_rel), collect(author)] AS author_collection,
post_collection AS post_collection
ORDER BY n.created_at ASC
RETURN n
# v3
MATCH (n:Comment)
WITH n
ORDER BY n.created_at ASC
OPTIONAL MATCH (n)-[post_rel:post]->(post)
WHERE (post:Post)
WITH
n,
[collect(post_rel), collect(post)] AS post_collection
OPTIONAL MATCH (n)-[author_rel:author]->(author)
WHERE (author:User)
WITH
n,
[collect(author_rel), collect(author)] AS author_collection,
post_collection AS post_collection
RETURN
n
# v4
MATCH (n:Comment)
WITH n
ORDER BY n.created_at ASC
RETURN n
# v5
MATCH (n:Comment)
WITH n
ORDER BY n.created_at ASC
OPTIONAL MATCH (n)-[post_rel:post]->(post)
WHERE (post:Post)
WITH
n,
[collect(post_rel), collect(post)] AS post_collection
OPTIONAL MATCH (n)-[author_rel:author]->(author)
WHERE (author:User)
WITH
n,
[collect(author_rel), collect(author)] AS author_collection,
post_collection AS post_collection
RETURN
n, [post_collection,author_collection]
# v6
MATCH (n:Comment)
WITH n
OPTIONAL MATCH (n)-[post_rel:post]->(post)
WHERE (post:Post)
WITH
n,
[collect(post_rel), collect(post)] AS post_collection
OPTIONAL MATCH (n)-[author_rel:author]->(author)
WHERE (author:User)
WITH
n,
[collect(author_rel), collect(author)] AS author_collection,
post_collection AS post_collection
ORDER BY n.created_at ASC
RETURN
n
# v7
MATCH (n:Comment)
WITH n
OPTIONAL MATCH (n)-[post_rel:post]->(post)
WHERE (post:Post)
WITH
n,
[collect(post_rel), collect(post)] AS post_collection
OPTIONAL MATCH (n)-[author_rel:author]->(author)
WHERE (author:User)
WITH
n,
[collect(author_rel), collect(author)] AS author_collection,
post_collection AS post_collection
ORDER BY n.created_at ASC
RETURN
n, [post_collection,author_collection]
Here is a link to the report on Google Docs: https://docs.google.com/spreadsheets/d/1UUgvmexi6hTxRjkp6gFElggidecZtSGPFb_xrOLQ9Zc/edit?usp=sharing
In short: queries 3 and 5 return the wrong order.
Also, I made a mistake with the versions of the 'neo4j' gem. I have two projects that use neo4j. In the first project I use '9.0.7' and see the ordering bug. In the second project I use '9.2.1' and don't see the bug.
Hello @kopylovvlad, could you confirm if you still see an ordering problem on the latest version (master)? If so could you please create a PR with a failing spec and I will fix the issue. Thanks for bringing this up.
@kopylovvlad That seems potentially right. If the ORDER BY is followed by an (OPTIONAL) MATCH clause, I could see Neo4j not preserving the order.
@klobuczek I would guess that this problem is happening in master. If he's using 9.2.1, as noted in the initial comment, it is not different in ways that I would expect to affect this. I think it's mainly Rubocop stuff:
@cheerfulstoic I am a bit confused; it appears to me that @kopylovvlad is saying in his last comment that 9.2.1 does NOT have the ordering bug. That's why I wanted a confirmation. On my end, after my last PR to QueryProxyEagerLoading, the order clause is always placed after all clauses generated by with_associations, regardless of the position of order in the query chain.
Hrmm, I'm not sure ;) . He closed the issue, though, so maybe everything is OK.
@klobuczek Actually, I just ran into this bug ... it still affects 9.2.4 and also 9.3.0. It's actually very simple to reproduce: use .order and .limit combined with .with_associations. As @cheerfulstoic pointed out, ORDER BY is for some reason applied last, even after LIMIT and the OPTIONAL MATCHes:
But when you do specify the order, I actually see the problem. The limit, for some reason is getting applied at the beginning of the query. That means that those first 20 are getting retrieved without respect for the order and then the ORDER BY is applied later, after it's been through a couple of WITH clauses
A simple example:
irb(main):001:0> >> Events::Event.order(feed_timestamp: :desc).limit(1)
CYPHER
MATCH (result_eventsevent:`Event`)
RETURN result_eventsevent
ORDER BY result_eventsevent.feed_timestamp DESC
LIMIT {limit_1} | {:limit_1=>1}
=> #<QueryProxy [#<Events::LaunchesEvent uuid: "32427119-9e5c-4fe0-9d31-75f812f0fab2", feed_timestamp: Thu, 31 Dec 2099 00:00:00 +0100]>
irb(main):002:0> >> Events::Event.order(feed_timestamp: :desc).limit(1).with_associations(:primary_company)
CYPHER
MATCH (result_eventsevent:`Event`)
WITH result_eventsevent
LIMIT {limit_1}
OPTIONAL MATCH (result_eventsevent)<-[`primary_company_rel`]-(`primary_company`)
WHERE (`primary_company`:`Company`)
WITH
result_eventsevent,
[collect(`primary_company_rel`), collect(`primary_company`)] AS `primary_company_collection`
ORDER BY result_eventsevent.feed_timestamp DESC
RETURN
result_eventsevent,
[`primary_company_collection`] | {:limit_1=>1}
=> #<QueryProxy [#<Events::AcquiresEvent uuid: "a0849f9a-9559-44fa-97b4-1f6ad78d1d54", feed_timestamp: Mon, 09 Oct 2017 07:03:38 +0200]>
Consequently, the results of the two queries differ.
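As a rough sketch of one clause order under which the two queries above should agree: keep ORDER BY and LIMIT together in the first WITH, before any OPTIONAL MATCH, and repeat ORDER BY after the eager-loading clauses, since as far as I know Cypher does not promise to keep row order across later clauses. The helper ordered_eager_query below is hypothetical and only assembles a string; it is not the gem's code and not necessarily how the gem should fix this.

```ruby
# Hypothetical helper, not part of the neo4j gem: it only builds a Cypher
# string to illustrate a clause order where LIMIT follows ORDER BY.
def ordered_eager_query(match:, var:, order:, limit_param:, eager_clauses:, returns:)
  [
    match,
    "WITH #{var}",
    "ORDER BY #{order}",       # sort first, so LIMIT keeps the top rows
    "LIMIT {#{limit_param}}",  # old-style {param} syntax, as in the logs above
    *eager_clauses,            # OPTIONAL MATCH / WHERE / WITH ... collect(...) lines
    "ORDER BY #{order}",       # sort again after the eager-loading clauses
    "RETURN #{returns.join(', ')}"
  ].join("\n")
end

QUERY = ordered_eager_query(
  match: 'MATCH (result_eventsevent:`Event`)',
  var: 'result_eventsevent',
  order: 'result_eventsevent.feed_timestamp DESC',
  limit_param: 'limit_1',
  eager_clauses: [
    'OPTIONAL MATCH (result_eventsevent)<-[`primary_company_rel`]-(`primary_company`)',
    'WHERE (`primary_company`:`Company`)',
    'WITH result_eventsevent, [collect(`primary_company_rel`), collect(`primary_company`)] AS `primary_company_collection`'
  ],
  returns: ['result_eventsevent', '[`primary_company_collection`]']
)

puts QUERY
```

The second ORDER BY attaches to the final WITH, so it only re-sorts the rows that already passed the limit.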
When I use the .with_associations method, it changes the order! O_o Why does this happen?
Code example (inline, gist, or repo)
Example 1:
Example 2:
Example 3
Runtime information:
Neo4j database version:
neo4j gem version: neo4j (9.2.1)
neo4j-core gem version: 8.1.3