Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

Bulk export support for search backend #7354

Closed kroepke closed 4 years ago

kroepke commented 4 years ago

What?

The search backend (for example https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog/plugins/views/search/engine/QueryEngine.java) does not currently support bulk exports. Depending on the Elasticsearch version and the sorting requirements, the implementations might differ slightly.

Why?

Exporting results as CSV or in other formats (plain log view with format strings, etc.) requires iterating over large amounts of data, which cannot be retrieved with deep pagination. The current workaround is #7227, but that is not a good solution because of how Stream selection works now.

Moreover, the API layer should be able to support chunked transport to avoid buffering large amounts of results in the server process, so separate QueryEngine support seems necessary.
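To illustrate the chunked-transport idea, here is a minimal sketch in Python rather than in the server's Java. It is not the Graylog implementation: `csv_chunks`, its parameters, and the sample field names are hypothetical. It only shows how a response body can be produced chunk by chunk, so that at most `chunk_size` rows are held in memory at a time.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def csv_chunks(rows: Iterable[Dict[str, str]], fields: List[str],
               chunk_size: int = 2) -> Iterator[str]:
    """Yield CSV text in chunks instead of building the whole export first.

    Hypothetical helper: `rows` can be any lazy iterable of messages, so
    only the rows of the current chunk are buffered.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # the header travels with the first chunk
    pending = 0
    for row in rows:
        writer.writerow([row.get(field, "") for field in fields])
        pending += 1
        if pending == chunk_size:
            yield buffer.getvalue()
            # Reset the buffer so memory use stays bounded.
            buffer.seek(0)
            buffer.truncate(0)
            pending = 0
    if buffer.tell() > 0:
        # Flush the final partial chunk (or the header alone if no rows).
        yield buffer.getvalue()
```

An HTTP layer could write each yielded string to the response as soon as it is available, which is the behaviour chunked transfer encoding is meant to allow.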

Notes

The implementation should support sorting the data where possible. It does not have to support aggregations, as those are blocking on the Elasticsearch side anyway and are not supported in "scroll" requests.
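As a rough sketch of a sorted export without deep pagination, the Python below pages through an in-memory list with a cursor, similar in spirit to cursor-style paging such as Elasticsearch's `search_after`. Everything here is an assumption made for illustration: `fetch_page` is a hypothetical stand-in for one backend request, and the `_id` tiebreaker and the `timestamp` field are example choices, not taken from the Graylog code.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple

Message = Dict[str, Any]
Cursor = Tuple[Any, str]  # (sort value, tiebreaker id) of the last message seen


def fetch_page(messages: List[Message], sort_field: str, size: int,
               after: Optional[Cursor] = None) -> List[Message]:
    """Hypothetical stand-in for one backend request.

    Returns up to `size` messages ordered by (sort_field, _id) that sort
    strictly after the `after` cursor.
    """
    ordered = sorted(messages, key=lambda m: (m[sort_field], m["_id"]))
    if after is not None:
        ordered = [m for m in ordered if (m[sort_field], m["_id"]) > after]
    return ordered[:size]


def export_sorted(messages: List[Message], sort_field: str,
                  page_size: int = 2) -> Iterator[Message]:
    """Iterate over all messages in sort order, one page at a time."""
    after: Optional[Cursor] = None
    while True:
        page = fetch_page(messages, sort_field, page_size, after)
        if not page:
            return
        yield from page
        last = page[-1]
        # The tiebreaker keeps paging stable when sort values repeat.
        after = (last[sort_field], last["_id"])
```

The cursor replaces a growing offset, so the cost of each request does not depend on how far the export has progressed. The unique tiebreaker matters: without it, messages sharing a sort value at a page boundary could be skipped or repeated.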

kroepke commented 4 years ago

Related to some older issues:

Things that should go away after reimplementing:

alex-konn commented 4 years ago

Clarified with @kroepke:

dennisoelkers commented 4 years ago

Fixed in #7709.