Bulk export support for search backend

kroepke commented 4 years ago

What?

The search backend (https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog/plugins/views/search/engine/QueryEngine.java etc.) does not support bulk exports right now. Depending on the Elasticsearch version and sorting requirements, the implementations might look a bit different.

Why?

Exporting results as CSV or other formats (plain log view with format strings, etc.) requires iterating over large amounts of data, which cannot be retrieved with deep pagination. The current workaround is #7227, but that is not a good solution due to how Stream selection works now.
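To illustrate why a cursor is preferable to deep pagination here, the following is a minimal sketch of cursor-based iteration, in the spirit of Elasticsearch's scroll or search_after requests. It is not Graylog code: the in-memory list stands in for the search backend, and `sort_key`, `fetch_page` and `iterate_all` are hypothetical names chosen for the example. Each request resumes after the last sort key seen instead of skipping an ever-growing offset.

```python
from typing import Dict, Iterator, List, Optional, Tuple

SortKey = Tuple[int, str]


def sort_key(msg: Dict) -> SortKey:
    # Assumption: (timestamp, id) gives a total order, so the cursor is unambiguous.
    return (msg["timestamp"], msg["id"])


def fetch_page(index: List[Dict], after: Optional[SortKey], size: int) -> List[Dict]:
    # Stand-in for one backend request: return up to `size` messages
    # that sort strictly after the cursor `after`.
    ordered = sorted(index, key=sort_key)
    if after is not None:
        ordered = [m for m in ordered if sort_key(m) > after]
    return ordered[:size]


def iterate_all(index: List[Dict], page_size: int = 2) -> Iterator[Dict]:
    # Walk the whole result set page by page, holding only one page in memory.
    after: Optional[SortKey] = None
    while True:
        page = fetch_page(index, after, page_size)
        if not page:
            return
        yield from page
        after = sort_key(page[-1])
```

The tie-breaking `id` in the sort key matters: with a timestamp alone, messages sharing a timestamp across a page boundary could be skipped.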

Moreover, the API layer should be able to support chunked transport to avoid buffering large amounts of results in the server process, so separate QueryEngine support seems necessary.
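As a rough sketch of what chunked output could look like, the generator below turns an iterable of messages into CSV text chunks, so only one chunk is buffered at a time. This only shows the idea, not the actual API layer: `csv_chunks`, the field names and the chunk size are assumptions made for the example.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def csv_chunks(rows: Iterable[Dict], fields: List[str], chunk_size: int = 1000) -> Iterator[str]:
    # Write rows into a small buffer and hand it out every `chunk_size` rows.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(fields)  # the header travels with the first chunk
    pending = 0
    for row in rows:
        writer.writerow([row.get(field, "") for field in fields])
        pending += 1
        if pending >= chunk_size:
            yield buffer.getvalue()
            buffer.seek(0)
            buffer.truncate(0)
            pending = 0
    if buffer.tell() > 0:
        # Flush the remainder (or the header alone when there are no rows).
        yield buffer.getvalue()
```

A caller would write each yielded chunk to the response as it arrives, for example `for chunk in csv_chunks(messages, ["source", "message"]): response.write(chunk)`, where `response` is whatever streaming output the server provides.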

Notes

The implementation should support sorting the data, where possible. The implementation does not have to support aggregations, as those are blocking on the Elasticsearch side anyway, and are not even supported in "scroll" requests.

Your Environment

Graylog Version: 3.2.0
Elasticsearch Version: 5.6/6.8

kroepke commented 4 years ago

Related to some older issues:

"Export as CSV" should follow sorting of the gui #6343
Make export of CSV easier (add all_fields or/and !Fields) #5754
CSV Export not in User Timezone #4523

Things that should go away after reimplementing:

CSV export ignores limit parameter #582
CSV export with limits #1618

alex-konn commented 4 years ago

Clarified with @kroepke:

Ultimately we want to be able to export any search type/widget, but we focus on message tables for now
The export can be triggered by a button on the message table
It can also be triggered from the existing Export button, which will allow selecting from a list if multiple exportable widgets are available
The export dialog will have fields (including their order) and sorting settings copied from the message table
The user can override these in the dialog
The API should ideally not change when additional search types are made exportable. Requesting a non-exportable search type should return an appropriate error code
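A minimal sketch of the last point, assuming a registry of exportable search types and HTTP 400 as the error code. Both are assumptions for illustration; the type names and the `check_exportable` helper are hypothetical and not taken from the Graylog API.

```python
from typing import Tuple

# Assumption: only message tables are exportable for now; adding a type here
# would not change the request shape.
EXPORTABLE_TYPES = {"messages"}


def check_exportable(search_type: str) -> Tuple[int, str]:
    # Return an (HTTP status, message) pair for an export request.
    if search_type in EXPORTABLE_TYPES:
        return 200, "ok"
    return 400, f"Search type '{search_type}' cannot be exported"
```

Making another search type exportable then means extending the registry rather than adding a new endpoint.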

dennisoelkers commented 4 years ago

Fixed in #7709.
