Closed evildecay closed 7 years ago
Hi @evildecay: Which version of cassandra-lucene-index are you using?
Hi @ealonsodb: cassandra-lucene-index-3.10.0
Hi @evildecay:
This is not a bug; it is the expected behaviour. You are executing two different SinglePartitionReadCommand instances over the same table.
The sort field orders the results of each single-partition read separately, not the combined result set. Here is a basic test that confirms this:
If you execute:
SELECT * FROM test WHERE id = '1' and day in ('2017-05-30', '2017-05-31') and
expr(test_lucene, '{
filter:[
{type: "prefix", field: "name", value: "Tom"}
],
sort:[{field:"ctime", reverse:true}]
}') limit 10;
you'll get:
id | day | ctime | name
----+------------+--------------------------+------
1 | 2017-05-30 | 2017-05-30 10:06:45+0000 | Tom5
1 | 2017-05-30 | 2017-05-30 10:06:44+0000 | Tom4
1 | 2017-05-30 | 2017-05-30 10:06:43+0000 | Tom3
1 | 2017-05-30 | 2017-05-30 10:06:42+0000 | Tom2
1 | 2017-05-30 | 2017-05-30 10:06:41+0000 | Tom1
1 | 2017-05-31 | 2017-05-31 10:06:44+0000 | Tom9
1 | 2017-05-31 | 2017-05-31 10:06:43+0000 | Tom8
1 | 2017-05-31 | 2017-05-31 10:06:42+0000 | Tom7
1 | 2017-05-31 | 2017-05-31 10:06:41+0000 | Tom6
And if you execute it with the sort reversed:
select * from test where id = '1' and day in ('2017-05-30', '2017-05-31') and
expr(test_lucene, '{
filter:[
{type: "prefix", field: "name", value: "Tom"}
],
sort:[{field:"ctime", reverse:false}]
}') limit 10;
you'll get:
id | day | ctime | name
----+------------+--------------------------+------
1 | 2017-05-30 | 2017-05-30 10:06:41+0000 | Tom1
1 | 2017-05-30 | 2017-05-30 10:06:42+0000 | Tom2
1 | 2017-05-30 | 2017-05-30 10:06:43+0000 | Tom3
1 | 2017-05-30 | 2017-05-30 10:06:44+0000 | Tom4
1 | 2017-05-30 | 2017-05-30 10:06:45+0000 | Tom5
1 | 2017-05-31 | 2017-05-31 10:06:41+0000 | Tom6
1 | 2017-05-31 | 2017-05-31 10:06:42+0000 | Tom7
1 | 2017-05-31 | 2017-05-31 10:06:43+0000 | Tom8
1 | 2017-05-31 | 2017-05-31 10:06:44+0000 | Tom9
This test shows that sorting happens only within each partition.
I think you can achieve what you want using 'ORDER BY ctime', but it is not possible to combine this with the LIMIT clause:
SELECT * FROM test WHERE id = '1' AND day IN ('2017-05-30', '2017-05-31')
AND expr(test_lucene, '{
filter:[
{type: "prefix", field: "name", value: "Tom"}
],
sort:[{field:"ctime", reverse:false}]
}') ORDER BY ctime ASC;
id | day | ctime | name
----+------------+--------------------------+------
1 | 2017-05-30 | 2017-05-30 10:06:41+0000 | Tom1
1 | 2017-05-30 | 2017-05-30 10:06:42+0000 | Tom2
1 | 2017-05-30 | 2017-05-30 10:06:43+0000 | Tom3
1 | 2017-05-30 | 2017-05-30 10:06:44+0000 | Tom4
1 | 2017-05-30 | 2017-05-30 10:06:45+0000 | Tom5
1 | 2017-05-31 | 2017-05-31 10:06:41+0000 | Tom6
1 | 2017-05-31 | 2017-05-31 10:06:42+0000 | Tom7
1 | 2017-05-31 | 2017-05-31 10:06:43+0000 | Tom8
1 | 2017-05-31 | 2017-05-31 10:06:44+0000 | Tom9
OR
SELECT * FROM test WHERE id = '1' AND day IN ('2017-05-30', '2017-05-31')
AND expr(test_lucene, '{
filter:[
{type: "prefix", field: "name", value: "Tom"}
],
sort:[{field:"ctime", reverse:false}]
}') ORDER BY ctime DESC;
id | day | ctime | name
----+------------+--------------------------+------
1 | 2017-05-31 | 2017-05-31 10:06:44+0000 | Tom9
1 | 2017-05-31 | 2017-05-31 10:06:43+0000 | Tom8
1 | 2017-05-31 | 2017-05-31 10:06:42+0000 | Tom7
1 | 2017-05-31 | 2017-05-31 10:06:41+0000 | Tom6
1 | 2017-05-30 | 2017-05-30 10:06:45+0000 | Tom5
1 | 2017-05-30 | 2017-05-30 10:06:44+0000 | Tom4
1 | 2017-05-30 | 2017-05-30 10:06:43+0000 | Tom3
1 | 2017-05-30 | 2017-05-30 10:06:42+0000 | Tom2
1 | 2017-05-30 | 2017-05-30 10:06:41+0000 | Tom1
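Since the index sorts each partition separately and ORDER BY cannot be combined with LIMIT, one possible workaround is to merge the per-partition results on the client and apply the limit there. The following is only a minimal sketch of that idea in plain Python, not part of cassandra-lucene-index: the row tuples are hard-coded from the tables above, and `merge_sorted_partitions` is a hypothetical helper name. It assumes each partition's rows already arrive sorted by ctime in the requested direction.

```python
import heapq
from datetime import datetime
from itertools import islice


def ts(text):
    """Parse a timestamp in the form shown in the result tables."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Hypothetical stand-ins for the rows of each partition, already sorted by
# ctime descending, as returned with sort:[{field:"ctime", reverse:true}].
partition_0530 = [
    (ts("2017-05-30 10:06:45"), "Tom5"),
    (ts("2017-05-30 10:06:44"), "Tom4"),
    (ts("2017-05-30 10:06:43"), "Tom3"),
    (ts("2017-05-30 10:06:42"), "Tom2"),
    (ts("2017-05-30 10:06:41"), "Tom1"),
]
partition_0531 = [
    (ts("2017-05-31 10:06:44"), "Tom9"),
    (ts("2017-05-31 10:06:43"), "Tom8"),
    (ts("2017-05-31 10:06:42"), "Tom7"),
    (ts("2017-05-31 10:06:41"), "Tom6"),
]


def merge_sorted_partitions(partitions, limit, reverse=True):
    """Merge per-partition sorted rows into one globally sorted list.

    Each input list must already be sorted by ctime in the direction given
    by `reverse`. Only the first `limit` merged rows are materialised.
    """
    merged = heapq.merge(*partitions, key=lambda row: row[0], reverse=reverse)
    return list(islice(merged, limit))


top3 = merge_sorted_partitions([partition_0530, partition_0531], limit=3)
print([name for _, name in top3])  # ['Tom9', 'Tom8', 'Tom7']
```

With a limit of 10 the merged order matches the ORDER BY ctime DESC table above. Note that in a real client each partition would have to be queried with at least `limit` rows for the merged top rows to be correct.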
Hope this helps
Hi @ealonsodb: Thank you for your analysis and solution. It seems that this behaviour is determined by Cassandra itself.
Cluster information
Test sql data
All results should be sorted as specified in the query, but they are sorted by day.