kuzzleio / kuzzle

Open-source Back-end, self-hostable & ready to use - Real-time, storage, advanced search - Web, Apps, Mobile, IoT -
https://kuzzle.io
Apache License 2.0

deleteByQuery/truncate on large collections doesn't work #1341

Closed stafyniaksacha closed 4 years ago

stafyniaksacha commented 5 years ago

Description

As reported in #991, deleteByQuery/truncate gets an error from Elasticsearch when the query matches more than 10k documents.

This is the same limitation that applies to search queries.

The problem mostly affects the truncate function: because it relies on delete by query, it does not perform the requested action on large collections. See https://github.com/kuzzleio/kuzzle/blob/master/lib/services/elasticsearch.js#L654
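To make the failure mode concrete, here is a minimal in-memory sketch. It is not Kuzzle's code: `search`, `naiveDeleteAll` and `makeCollection` are hypothetical names, and the "search, then delete the hits" shape is an assumption about how a delete by query could run into the search limit. The 10000 value is the default of Elasticsearch's `index.max_result_window` setting.

```javascript
// Illustrative in-memory model; none of these names come from Kuzzle.
// Elasticsearch rejects searches where from + size exceeds
// index.max_result_window, which defaults to 10000.
const MAX_RESULT_WINDOW = 10000;

// Hypothetical stand-in for a search call against one collection.
function search(documents, from, size) {
  if (from + size > MAX_RESULT_WINDOW) {
    throw new Error(
      `Result window is too large: from + size must be <= ${MAX_RESULT_WINDOW}`
    );
  }
  return documents.slice(from, from + size);
}

// Assumed shape of a "search, then delete the hits" delete by query.
// Returns the number of deleted documents.
function naiveDeleteAll(documents) {
  const hits = search(documents, 0, documents.length);
  const ids = new Set(hits.map((doc) => doc.id));
  const remaining = documents.filter((doc) => !ids.has(doc.id));
  const deleted = documents.length - remaining.length;
  documents.length = 0;
  documents.push(...remaining);
  return deleted;
}

// Builds a collection of `count` documents with unique ids.
function makeCollection(count) {
  return Array.from({ length: count }, (_, i) => ({ id: `doc-${i}` }));
}
```

In this model, `naiveDeleteAll(makeCollection(500))` succeeds, while the same call on a collection of 10001 documents throws before anything is deleted, which is the behavior described for truncate above.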

Expected Behavior

deleteByQuery -> it may be acceptable to keep the same limitation as search

truncate -> should delete all documents in the collection, regardless of how many there are

Possible Solution

Aschen commented 4 years ago

Fixed in v2

scottinet commented 4 years ago

Hi @stafyniaksacha

These issues have been partially resolved in Kuzzle v2: collection:truncate now has no limit on how many documents a collection holds, and its performance does not depend on that number either. deleteByQuery also behaves a bit better now.

These issues will be completely resolved once this PR goes into production: https://github.com/kuzzleio/kuzzle/pull/1534

This PR puts the deleteByQuery limitation under the control of the server configuration file: the search results returned for the provided query are limited by the limits.documentsFetchCount parameter.
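As a rough illustration of a configuration-driven limit, the sketch below caps an in-memory delete by query using a `limits.documentsFetchCount` value. Only the parameter name comes from this thread; the `config` object, its value of 10000, the `deleteByQuery` signature and the choice to delete at most that many documents (rather than reject the request) are all assumptions made for the example, not the PR's actual behavior.

```javascript
// Hypothetical config object mirroring the limits.documentsFetchCount key;
// the value 10000 is an assumption for this example.
const config = { limits: { documentsFetchCount: 10000 } };

// Deletes at most `documentsFetchCount` documents matching `predicate`
// from the in-memory `documents` array and returns the deleted ids.
// Assumption: documents beyond the limit are left in place.
function deleteByQuery(documents, predicate, cfg = config) {
  const limit = cfg.limits.documentsFetchCount;
  const deletedIds = [];
  let i = 0;
  while (i < documents.length && deletedIds.length < limit) {
    if (predicate(documents[i])) {
      deletedIds.push(documents[i].id);
      documents.splice(i, 1); // next document shifts into index i
    } else {
      i += 1;
    }
  }
  return deletedIds;
}
```

With a limit of 2 and five matching documents, one call in this sketch removes two of them and leaves three, so a caller would need to repeat the call to remove the rest.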