la-haute-societe / craft-elasticsearch

Bring the power of Elasticsearch to your Craft CMS projects

Hard-coded limit throws exception when the index has more than 10k entries #19

Open floatingbits opened 2 years ago

floatingbits commented 2 years ago

I'm migrating a fairly large website to a new Craft system and trying to integrate craft-elasticsearch. After importing the content, which amounts to several tens of thousands of entries, I get the following error:

Exception – lhs\elasticsearch\exceptions\IndexElementException An error occurred while running the "Testpage mit Gallery" search query on Elasticsearch instance: Elasticsearch request failed with code 400. Response body: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [20098]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"dev_craft-entries_1","node":"DuVYspTOSQa0sxANaFnpMw","reason":{"type":"illegal_argument_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [20098]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}}],"caused_by":{"type":"illegal_argument_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [20098]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.","caused_by":{"type":"illegal_argument_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [20098]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}}},"status":400}

I'm sure this is caused by the "->limit" part in: return self::find()->query($queryParams)->highlight($highlightParams)->limit(self::find()->count())->all(); That call sets the limit to the total number of documents in the index, which seems unnecessary to me (though I don't know whether it is technically required in some cases). If it is indeed technically necessary, it should be configurable, not hard-coded to the maximum result length.
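To illustrate the arithmetic behind the error, here is a minimal, language-neutral sketch in Python. It is not the plugin's code: `safe_limit` and its parameters are hypothetical names, and the 10000 default is simply the `index.max_result_window` value quoted in the error message above. The idea is to cap the requested size so that from + size never exceeds the window.

```python
# Hypothetical helper, not part of craft-elasticsearch.
# Caps the requested result size so that offset + size stays within
# Elasticsearch's result window (10000 in the error message above).

def safe_limit(total_hits: int, offset: int = 0, max_result_window: int = 10000) -> int:
    """Return the largest size that can be requested without exceeding the window."""
    if total_hits < 0 or offset < 0:
        raise ValueError("total_hits and offset must be non-negative")
    # Room left in the window once the offset is accounted for.
    remaining_window = max(max_result_window - offset, 0)
    # Never ask for more documents than exist beyond the offset.
    remaining_hits = max(total_hits - offset, 0)
    return min(remaining_hits, remaining_window)


if __name__ == "__main__":
    # 20098 is the count reported in the error message above.
    print(safe_limit(20098))
    print(safe_limit(500))
```

With a count of 20098, as in the error above, the capped size would be 10000 instead of 20098. This only avoids the exception; results beyond the window would still need pagination or a larger `index.max_result_window`.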

nstCactus commented 2 years ago

I haven't found the time to really work on this issue, but you might be able to work around it by implementing some pagination as described here.
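As a rough idea of what such pagination involves, the following Python sketch turns a page number into a from/size pair and rejects pages that would fall outside the result window. It is only an assumption about how one might approach it: `page_window` is a hypothetical name, the 10000 default comes from the error message in the report, and wiring the returned values into the query's offset and limit is left to the caller.

```python
# Hypothetical helper, not part of craft-elasticsearch.
# Converts a 1-based page number into (from, size) and refuses pages
# whose from + size would exceed the Elasticsearch result window.

from typing import Tuple


def page_window(page: int, page_size: int, max_result_window: int = 10000) -> Tuple[int, int]:
    """Return (from, size) for the given page, or raise if it exceeds the window."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    start = (page - 1) * page_size
    if start + page_size > max_result_window:
        raise ValueError(
            f"from + size = {start + page_size} exceeds the result window "
            f"of {max_result_window}"
        )
    return start, page_size


if __name__ == "__main__":
    print(page_window(1, 20))    # first page
    print(page_window(500, 20))  # last page that still fits with 20 per page
```

With 20 results per page, page 500 is the last one that fits in a window of 10000. Anything deeper would need the scroll API mentioned in the error message, or a raised `index.max_result_window`.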

I agree with you, hardcoding the limit isn't the right thing to do!