kuzzleio / kourou

The CLI that helps you manage your Kuzzle application

BUG: Scroll API failing on index:export and collection:export #159

Closed leonardgable closed 2 years ago

leonardgable commented 2 years ago

When I use kourou to dump a collection stored in Elasticsearch, the CLI fails on big collections. The commands that fail are kourou index:export and kourou collection:export.

Expected Behavior

I should be able to dump the whole collection directly into a JSONL file.

Current Behavior

The command kourou collection:export fails while exporting the data to a JSONL format file. It seems the scroll API closes itself before the whole collection has been retrieved.

Possible Solution

I suspect that sharding starts to fail in Elasticsearch on big collections.

Steps to Reproduce

Dump a collection that contains more than 17000 documents, exporting it to a JSONL file with the kourou collection:export command.

Context (Environment)

I am trying to extract my data, stored using the Kuzzle framework, into JSONL files.

Kuzzle version: 2.18.1
Node.js version: 16.15.1
SDK version: 7.10.1

Aschen commented 2 years ago

Hi @leonardgable

This error indicates an expired scroll search.

If the network is too slow to send you an entire page of results, then the current page TTL may expire.
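To make the expiry mechanism concrete, here is a minimal Python sketch of a scroll cursor whose TTL is renewed on every page request. It is a toy model, not Kuzzle's or Elasticsearch's implementation: the class, the function, and all numbers (6000 documents, pages of 2000, 30 seconds per page) are hypothetical and chosen only for illustration.

```python
class ScrollExpired(Exception):
    """Raised when a page is requested after the cursor's TTL has elapsed."""


class FakeScroll:
    """Toy scroll cursor (hypothetical): each successful fetch renews the TTL."""

    def __init__(self, docs, page_size, ttl_s):
        self.docs = docs
        self.page_size = page_size
        self.ttl_s = ttl_s
        self.offset = 0
        self.expires_at = None  # set by the first fetch

    def next_page(self, now_s):
        # Reject the request if the previous page's TTL has already run out.
        if self.expires_at is not None and now_s > self.expires_at:
            raise ScrollExpired(f"expired at t={self.expires_at}s, asked at t={now_s}s")
        page = self.docs[self.offset:self.offset + self.page_size]
        self.offset += len(page)
        self.expires_at = now_s + self.ttl_s  # renew the TTL
        return page


def export_all(total_docs, page_size, ttl_s, seconds_per_page):
    """Return how many documents were exported before completion or expiry."""
    scroll = FakeScroll(list(range(total_docs)), page_size, ttl_s)
    now_s, exported = 0, 0
    try:
        while True:
            page = scroll.next_page(now_s)
            if not page:
                break
            exported += len(page)
            now_s += seconds_per_page  # simulated time to receive and write a page
    except ScrollExpired:
        pass  # the export stops early, as in the reported failure
    return exported


# Each page takes 30 s to arrive: a 20 s TTL expires, a 60 s TTL does not.
print(export_all(6000, 2000, ttl_s=20, seconds_per_page=30))  # 2000
print(export_all(6000, 2000, ttl_s=60, seconds_per_page=30))  # 6000
```

In this model the export succeeds only when the TTL is longer than the time spent on a single page, which is why raising the TTL helps on slow links.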

You can use the scrollTTL option to increase the TTL with the kourou collection:export command

leonardgable commented 2 years ago

Hi @Aschen

Indeed, it solves the issue. It would be nice to extend the default scrollTTL. Is there any particular reason why it is set to such a short expiry?

Aschen commented 2 years ago

No particular reason, it will be extended in the next version :+1: Thanks for raising the issue.