cloudant-labs / clouseau

Expose Lucene features as an erlang-like node
Apache License 2.0
58 stars 32 forks source link

feat: uplift Lucene to 4.10.4 #74

Closed pgj closed 1 year ago

pgj commented 1 year ago

Update Clouseau to use the latest available Lucene version from the 4.x series. This change brings many bug fixes while it maintains the backward compatibility for the existing indexes. 4.10.4 has 45 API changes and 116 bug fixes since 4.6.1, for the details see the change log.

pgj commented 1 year ago

It passes the make elixir-search tests, but many tests fail when running with make mango-test:

COUCH_USER=adm COUCH_PASS=pass .venv/bin/python3 -m nose2  ... ........s.ss.s.s.............s.ss.s.s.......F.FF.s.ss.s.s........................................................................................................................................FF...................s...................................................FF...............................................................................F.....FF.......
======================================================================
FAIL: test_elem_match (03-operator-test.OperatorTextTests.test_elem_match)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/03-operator-test.py", line 48, in test_elem_match
    self.assertEqual(len(docs), 1)
AssertionError: 0 != 1

======================================================================
FAIL: test_eq_null_does_not_include_missing (03-operator-test.OperatorTextTests.test_eq_null_does_not_include_missing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/03-operator-test.py", line 113, in test_eq_null_does_not_include_missing
    self.assertUserIds(user_ids, docs)
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/03-operator-test.py", line 22, in assertUserIds
    self.assertEqual(user_ids, user_ids_returned)
AssertionError: Lists differ: [9] != []

First list contains 1 additional elements.
First extra element 0:
9

- [9]
?  -

+ []

======================================================================
FAIL: test_exists_false (03-operator-test.OperatorTextTests.test_exists_false)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/03-operator-test.py", line 106, in test_exists_false
    self.assertUserIds(user_ids, docs)
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/03-operator-test.py", line 22, in assertUserIds
    self.assertEqual(user_ids, user_ids_returned)
AssertionError: Lists differ: [2, 3, 5, 6, 7, 8, 10, 11, 12, 14] != []

First list contains 10 additional elements.
First extra element 0:
2

- [2, 3, 5, 6, 7, 8, 10, 11, 12, 14]
+ []

======================================================================
FAIL: test_gt (06-basic-text-test.BasicTextTests.test_gt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/06-basic-text-test.py", line 152, in test_gt
    assert len(docs) == 2
           ^^^^^^^^^^^^^^
AssertionError

======================================================================
FAIL: test_gte (06-basic-text-test.BasicTextTests.test_gte)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/06-basic-text-test.py", line 175, in test_gte
    assert len(docs) == 2
           ^^^^^^^^^^^^^^
AssertionError

======================================================================
FAIL: test_limit_bookmark (08-text-limit-test.LimitTests.test_limit_bookmark)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/08-text-limit-test.py", line 97, in test_limit_bookmark
    self.run_bookmark_check(i)
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/08-text-limit-test.py", line 115, in run_bookmark_check
    assert len(seen_docs) == len(limit_docs.DOCS)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

======================================================================
FAIL: test_limit_field (08-text-limit-test.LimitTests.test_limit_field)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/08-text-limit-test.py", line 23, in test_limit_field
    assert len(docs) == 8
           ^^^^^^^^^^^^^^
AssertionError

======================================================================
FAIL: test_guess_dup_type_sort (09-text-sort-test.SortTests.test_guess_dup_type_sort)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/09-text-sort-test.py", line 79, in test_guess_dup_type_sort
    self.assertEqual(len(docs), 15)
AssertionError: 0 != 15

======================================================================
FAIL: test_number_sort (09-text-sort-test.SortTests.test_number_sort)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/09-text-sort-test.py", line 22, in test_number_sort
    self.assertEqual(len(docs), 15)
AssertionError: 0 != 15

======================================================================
FAIL: test_number_sort_desc (09-text-sort-test.SortTests.test_number_sort_desc)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gaborpali/projects/github/couchdb/src/mango/test/09-text-sort-test.py", line 28, in test_number_sort_desc
    self.assertEqual(len(docs), 15)
AssertionError: 0 != 15

----------------------------------------------------------------------
Ran 362 tests in 26.674s

FAILED (failures=10, skipped=16)

The common reason for the failures is that Clouseau does not return any value. Based on my investigations so far, it works as expected up to 4.8.1, and it starts to fail for 4.9.0 and later versions. I have checked the change log, but none of the API changes directly indicate anything related.

jaydoane commented 1 year ago

Awesome work!

I tested it, and see the same test failures:

Ran 361 tests in 39.905s

FAILED (failures=10, skipped=16)

Also noting that my machine appears to be somewhat slower than yours (my ~40s vs your ~27s), which I wonder if accounts for why the write lock timeout error seems more reliable on your hardware?

FWIW, I also ran these 2 tests 1000 times in a row, and didn't see any errors:

test_disable_index_array_length (10-disable-array-length-field-test.DisableIndexArrayLengthsTest) ... ok
test_enable_index_array_length (10-disable-array-length-field-test.DisableIndexArrayLengthsTest) ... ok
pgj commented 1 year ago

Awesome work!

👍 But, as indicated, I will have to nag @rnewson for hints on why it does not fully work.

I tested it, and see the same test failures:

Thanks for the confirmation!

[M]y machine appears to be somewhat slower than yours [..], which I wonder if accounts for why the write lock timeout error seems more reliable on your hardware?

Yes, that is possible. However, I thought that you have an 2021 M1 Max too.

FWIW, I also ran these 2 tests 1000 times in a row, and didn't see any errors:

You will not, these tests are faulty. And I think I have found the cause for that -- I will soon submit a PR about that for apache/couchdb. Prepare for the facepalm :-D

jaydoane commented 1 year ago

these tests are faulty

Yeah, I've been debugging them a bit today. Seems like the user docs get bulk loaded into the db, but then db info only shows the ddocs, but also no deleted docs. I'm looking forward to the 🤦

pgj commented 1 year ago

Here are the bugs of 10-disable-array-length-field-test with their proposed fixes: https://github.com/apache/couchdb/pull/4610.

pgj commented 1 year ago

One of the queries that return 0 hits starting from 4.9.0:

03-operator-test.test_elem_match, Mango query:

{
  "_id": {"$gt": null},
  "bang": {"$elemMatch": {"foo": {"$gte": 1}, "bam": true}},
}

which translates to:

($fieldnames:_5fid_3a* $fieldnames:_5fid.*) +(+bang._5b_5d.foo_3anumber:[1.0 TO Infinity] +bang._5b_5d.bam_3aboolean:true)

Note that the following queries work at the same time:

($fieldnames:_5fid_3a* $fieldnames:_5fid.*) +bang._5b_5d.foo_3anumber:[1.0 TO Infinity]

and

($fieldnames:_5fid_3a* $fieldnames:_5fid.*) +bang._5b_5d.bam_3aboolean:true
pgj commented 1 year ago

After taking a closer look at the rest of the failing tests cases, I noticed that it is the ranges on numbers that cause the problems. More specifically, ranges with no lower or upper limits, i.e. the use of the Infinity constant. Apparently, Lucene 4.9 and higher do not handle properly if Infinity is written with a capital I.

The previous query works if the Infinity is either written as infinity or infinite, that is:

($fieldnames:_5fid_3a* $fieldnames:_5fid.*) +(+bang._5b_5d.foo_3anumber:[1.0 TO infinity] +bang._5b_5d.bam_3aboolean:true)

If CouchDB is modified according to this observation, make mango-test passes:

diff --git a/src/mango/src/mango_selector_text.erl b/src/mango/src/mango_selector_text.erl
index 1f8609ac2..4b46031eb 100644
--- a/src/mango/src/mango_selector_text.erl
+++ b/src/mango/src/mango_selector_text.erl
@@ -304,11 +304,11 @@ range(gt, Arg) ->
     [<<"{">>, value_str(Arg), <<" TO ", Max/binary, "]">>].

 get_range(min, Arg) when is_number(Arg) ->
-    <<"-Infinity">>;
+    <<"-infinity">>;
 get_range(min, _Arg) ->
     <<"\"\"">>;
 get_range(max, Arg) when is_number(Arg) ->
-    <<"Infinity">>;
+    <<"infinity">>;
 get_range(max, _Arg) ->
     <<"\u0x10FFFF">>.
pgj commented 1 year ago

Unfortunately, this latest fix does not solve the problem, only masks it. That is because using infinity (basically any other text literal) causes Lucene to return all the documents which are then filtered on the CouchDB side by Mango. That is, more documents are returned than required in result and this is not efficient.

I was playing further with the Lucene queries. For the user_docs sample database used by the Mango integration tests, I noticed the following behavior. Documents are there and all of them are returned when no filtering is applied:

*:*

Infinity as upper bound works when the lower bound is set to the minimum age value, e.g.:

age_3anumber:[22.0 TO Infinity]

This can be lowered until 2.0 when Lucene stops returning any hit, e.g.:

age_3anumber:[2.0 TO Infinity] 

When Infinity is not used, even 0.0 (as used in many tests) works as lower bound, e.g.:

age_3anumber:[0.0 TO 100.0]

At the same time, the inclusive ranges do not include the value of the lower bound. That is the following query will not include 73 unless the lower bound is lowered to 72.0:

age_3anumber:[73.0 TO Infinity]
pgj commented 1 year ago

For the reference, here are the commands that I used for the investigation:

# Run in the CouchDB git clone:
MANGO_TESTS_KEEP_DBS=1 make mango-test MANGO_TEST_OPTS="03-operator-test.OperatorTextTests.test_elem_match"
# Start CouchDB manually via dev/run (-a adm:pass -n 1), then:
MANGO_DB=$(curl -sS http://adm:pass@localhost:15984/_all_dbs | jq -r '.[]|select(startswith("mango_test_"))')
# Set the query:
QUERY="age_3anumber:[2.1 TO Infinity]"
# Run the query:
curl -s -X POST -H 'Content-Type: application/json' "http://adm:pass@localhost:15984/$MANGO_DB/_design/cedc01a027213706d7260b5e5b73c70b9233743a/_search/cedc01a027213706d7260b5e5b73c70b9233743a" -d '{"q":"'$QUERY'","include_docs":true}' | jq -r '.'
pgj commented 1 year ago

I think I have the source of the problem. LUCENE-5609 applies an optimization which technically implies a change in the run-time behavior because DoubleRanges should be created with the precision step of 16 instead of 8. I changed it to this value but we may want to lose it completely, because it is optional.

pgj commented 1 year ago

I am closing now in favor of #80. I will leave the changes around and we can revisit the uplift later.