toptal / chewy

High-level Elasticsearch Ruby framework based on the official elasticsearch-ruby client
MIT License

A particular search term returns a result count, but the wrappers are empty #871

Open abartov opened 1 year ago

abartov commented 1 year ago

I am using Chewy 7.2.7 with ES 7.17.8, Ruby 2.7.5 and Rails 5.2.x. I recently upgraded from Chewy 6 and ES 6.x and recreated the indexes from scratch, and search is generally working as expected.

However, one particular single-word search term, with nothing special about it as far as I can tell, does not work as expected: the total results reported are 18, but .each yields [], as does .wrappers. On the other hand, .first does provide a correct result object, and .first(5) provides five correct objects.

I have reviewed the changelog to see if there was something I missed (I did rename the Chewy::Type classes to be just the index etc.), but couldn't find anything that could explain this.

Again, on other search terms, iteration with .each does work, and .wrappers are non-empty.

Is there any explanation that comes to mind for this? Many thanks in advance.

abartov commented 1 year ago

I'm finding more and more search terms where this is the case. (Despite recreating the indexes again.)

What might be the reason a ModelIndex class doesn't populate .wrappers or work with .each some of the time? (For some search terms, the first page (25 results) is displayed correctly, but subsequent pages, though they exist according to the total, have empty .wrappers and .each, while the results are correct and accessible via .first(n).)

abartov commented 1 year ago

I have upgraded my app to Rails 6.1 and Ruby 3.2.1, but the behavior is unchanged.

Would it be possible for @pyromaniac or one of the other maintainers to offer thoughts on what might be causing this behavior, where results are correct, .total_pages is correct, but .wrappers is [] on page 2 and onwards with Kaminari?

abartov commented 1 year ago

Further finding:

When a search has, say, 285 result pages (at 25 results per page), some pages load fine, including 1, 2, and 285 (the last one), while others come back with empty .wrappers as above (pages 281 and 284). This behavior is consistent for this particular search.

(my index is fresh and up to date, verified with rake chewy:reset)

abartov commented 1 year ago

Unfortunately, the persistence of this issue is forcing me to ditch Chewy in favor of plain Elasticsearch. :(

abartov commented 1 year ago

In a last-ditch effort, I traced into Chewy and into elastic-transport, and have found the problem:

"type"=>"illegal_argument_exception", "reason"=>"The length [1134481] of field [fulltext] in doc[7882]/index[manifestations_1678539023000] exceeds the [index.highlight.max_analyzed_offset] limit [1000000]. To avoid this error, set the query parameter [max_analyzed_offset] to a value less than index setting [1000000] and this will tolerate long field values by truncating them."}

This Elastic error was indeed introduced in 7.x, which explains why this started happening after my upgrade. Because it depends on document size, it looks random across search terms, yet it occurs consistently for any search whose results include an oversized document.

It seems to me that this error is somehow ignored or missed by Chewy, which just returns those pages without results. So while I will solve my issue by changing the setting, I do believe there is a Chewy issue here as well.
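
One place such an error can surface without raising is the `_shards.failures` array that Elasticsearch includes in a search response when only some shards fail. Whether that is the path the error took in my case is not something I have confirmed. Below is a minimal sketch of a check an application could run on a raw response hash. The helper name `shard_failures` and the sample data are my own illustration, not part of Chewy.

```ruby
# Hypothetical helper, not part of Chewy: collects shard failures from a raw
# Elasticsearch search response (a Hash with string keys).
def shard_failures(raw_response)
  shards = raw_response.fetch('_shards', {})
  Array(shards['failures']).map do |failure|
    reason = failure['reason'] || {}
    {
      index: failure['index'],
      shard: failure['shard'],
      type: reason['type'],
      reason: reason['reason']
    }
  end
end

# Illustrative data only: one of two shards failed while highlighting.
sample = {
  '_shards' => {
    'total' => 2, 'successful' => 1, 'skipped' => 0, 'failed' => 1,
    'failures' => [
      { 'shard' => 0,
        'index' => 'manifestations_1678539023000',
        'reason' => { 'type' => 'illegal_argument_exception',
                      'reason' => 'The length [1134481] of field [fulltext] ...' } }
    ]
  }
}

failures = shard_failures(sample)
# Log (or raise) instead of silently rendering an empty page.
warn "Search had #{failures.size} shard failure(s): #{failures.first[:type]}" if failures.any?
```

Logging or raising on a non-empty result would at least make a partially failed search visible instead of showing up as an empty page.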

Thoughts?

abartov commented 1 year ago

(for anyone who's run into this and is looking for a quick fix, adding the max_analyzed_offset parameter to the highlight method solves the issue.)

index.highlight(max_analyzed_offset: 999000, fields: {fulltext: {}})
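
Per the error message, the query parameter has to be less than the index setting, so the value can be derived from the limit rather than hard-coded. A small sketch, assuming the limit of 1000000 reported in the error above; the helper `highlight_options` and its `margin` parameter are hypothetical names of mine, not a Chewy API.

```ruby
# Hypothetical helper, not part of Chewy: builds highlight options whose
# max_analyzed_offset stays strictly below the index-level limit.
DEFAULT_INDEX_LIMIT = 1_000_000 # limit reported in the error message above

def highlight_options(fields, index_limit: DEFAULT_INDEX_LIMIT, margin: 1_000)
  raise ArgumentError, 'margin must be positive' unless margin.positive?
  raise ArgumentError, 'margin must be smaller than index_limit' unless margin < index_limit

  {
    max_analyzed_offset: index_limit - margin,
    fields: fields.to_h { |name| [name.to_sym, {}] }
  }
end

options = highlight_options([:fulltext])
# The result can be passed along like the call above,
# e.g. index.highlight(**options)
```

Text beyond the offset is not analyzed for highlighting, so matches that occur only late in a very long field will not get highlight fragments.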