codelibs / elasticsearch-dataformat

Excel/CSV/BulkJSON downloads on Elasticsearch.
Apache License 2.0

Scalability / Query Size limitations? #39

Open sephcoster opened 7 years ago

sephcoster commented 7 years ago

Hello!

Based on your previous experience with this plugin, can you speak to the kind of scale required to support somewhat sizable exports, and the size of data that it supports well?

Does it scale in a similar manner to elasticsearch based on nodes or are there different considerations / limitations I should think about? Is dumping 100,000+ items going to bring my cluster to a halt?

I'm considering dataformat as an option for a data export feature but want to make sure I am able to scale my infrastructure to support the number of requests we're expecting. We'll conduct some performance testing before going to production but appreciate any advice you might have to get us started.

Thank you so much for this awesome library and appreciate your thoughts!

Question

marevol commented 7 years ago

In the current version, the download size depends on the heap size of Elasticsearch. If you want to generate a file of unlimited size, use the file request parameter (e.g. file=/tmp/filename.csv).
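
As a rough illustration of the file parameter mentioned above, here is a minimal Python sketch that only builds the request URL and sends nothing. The `/{index}/_data` endpoint path, the `format` parameter, the host, and the index name `my_index` are assumptions for illustration and should be checked against the plugin README for your version; only `file=/tmp/filename.csv` comes from this thread.

```python
from urllib.parse import urlencode


def build_export_url(base_url, index, fmt="csv", server_file=None):
    """Build an export URL for the dataformat plugin.

    The "/{index}/_data" path and the "format" parameter are assumptions;
    "file" is the request parameter named in this thread.
    """
    params = {"format": fmt}
    if server_file is not None:
        # Path for the export file, e.g. /tmp/filename.csv
        params["file"] = server_file
    return f"{base_url.rstrip('/')}/{index}/_data?{urlencode(params)}"


# Hypothetical host and index name, for illustration only.
url = build_export_url(
    "http://localhost:9200", "my_index", server_file="/tmp/filename.csv"
)
print(url)
```

Without `server_file`, the helper builds a plain download URL, which per the comment above is bounded by the heap size. The resulting URL could then be requested with any HTTP client.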

anujtmr9 commented 7 years ago

I'm working on the same problem. We are exporting at least 100,000 items every time. What you need to take care of is memory allocation for the plugin, so that it can write the result on the server and dump it to the desired location.

sephcoster commented 7 years ago

This is great information @marevol and thank you for being so responsive. Also appreciate your insight @anujtmr9 and will check to make sure we have enough memory freed up.

sephcoster commented 7 years ago

This should give us enough to go on for now - please feel free to close this issue.