Open kmerz opened 4 years ago
+1
I would argue against it. The ES documentation says exactly the following:
Use the text field type if:
- The content is human-readable, such as an email body or product description.
- You plan to search the field for individual words or phrases, such as the brown fox jumped, using [full text queries](https://www.elastic.co/guide/en/elasticsearch/reference/current/full-text-queries.html). Elasticsearch [analyzes](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis.html) text fields to return the most relevant results for these queries.

Use a keyword family field type if:
- The content is machine-generated, such as a log message or HTTP request information.
- You plan to search the field for exact full values, such as org.foo.bar, or partial character sequences, such as org.foo.*, using [term-level queries](https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html).
Graylog, by its name, is intended for machine-generated log messages. Why is message/full_message of type=text then? Can it at least be made configurable?
btw: To get a count of similar messages, there seem to be two options right now:
- add a new field in a pipeline containing a hash sum (crc/murmur) of the message field and aggregate on that new field
- use a custom index mapping with the ES multi-field functionality

I have not tested either of those yet.
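To make the first option more concrete, here is a minimal sketch of the idea in plain Python rather than in Graylog's pipeline rule language. The function names `message_hash` and `count_similar` are hypothetical, and CRC32 from the standard library stands in for whatever checksum the pipeline would actually compute. It only shows why aggregating on a short, exact-value hash field gives per-message counts.

```python
import zlib
from collections import Counter


def message_hash(message: str) -> str:
    """Return the CRC32 checksum of the message as 8 hex digits.

    Hypothetical helper: a pipeline would store this value in a new
    keyword-typed field next to the original message.
    """
    return format(zlib.crc32(message.encode("utf-8")), "08x")


def count_similar(messages):
    """Count messages per checksum, like a terms aggregation on the hash field."""
    return Counter(message_hash(m) for m in messages)


logs = [
    "disk full on /dev/sda1",
    "connection reset by peer",
    "disk full on /dev/sda1",
]
counts = count_similar(logs)
```

Note that this groups only byte-identical messages, and that CRC32 is short enough for unrelated messages to collide occasionally, so the counts are approximate at large volumes.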
Expected Behavior
The user should not be able to select the field message, or any other field of type text with fielddata disabled, since that only raises a query exception and prevents the event definition from working.
Current Behavior
A user can select message as a field for aggregation (group by or metric), and the event definition is then doomed to fail, since it will only throw an exception. This can lead to thousands of log messages in Elasticsearch and Graylog.
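As an illustration of the expected behavior, the field picker could be restricted to fields that Elasticsearch reports as aggregatable. The sketch below is not how Graylog implements field selection; it assumes a dictionary shaped like the response of the Elasticsearch field capabilities API (`_field_caps`), and the function name `aggregatable_fields` and the sample data are made up for the example.

```python
def aggregatable_fields(field_caps: dict) -> list:
    """Return the names of fields that are aggregatable under every mapped type.

    Assumes input shaped like an Elasticsearch `_field_caps` response:
    {"fields": {<name>: {<type>: {"aggregatable": bool, ...}}}}
    """
    result = []
    for name, by_type in field_caps.get("fields", {}).items():
        # A field mapped differently across indices has one entry per type;
        # offer it only if all of them can be aggregated on.
        if by_type and all(caps.get("aggregatable", False) for caps in by_type.values()):
            result.append(name)
    return sorted(result)


# Hypothetical sample: `message` is a text field without fielddata.
sample = {
    "fields": {
        "message": {"text": {"type": "text", "searchable": True, "aggregatable": False}},
        "source": {"keyword": {"type": "keyword", "searchable": True, "aggregatable": True}},
        "level": {"long": {"type": "long", "searchable": True, "aggregatable": True}},
    }
}
selectable = aggregatable_fields(sample)
```

With this filter, message would simply not appear among the fields offered for group by or metric.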
Steps to Reproduce (for bugs)
Use the message field in an aggregation, e.g. card(message).
Fielddata is disabled on text fields by default. Set fielddata=true on [message] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
ElasticsearchException{message=Unable to perform search query:
Fielddata is disabled on text fields by default. Set fielddata=true on [message] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead., errorDetails=[Fielddata is disabled on text fields by default. Set fielddata=true on [message] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]}
    at org.graylog.plugins.views.search.elasticsearch.ElasticsearchBackend.checkForFailedShards(ElasticsearchBackend.java:326)
    at org.graylog.plugins.views.search.elasticsearch.ElasticsearchBackend.doRun(ElasticsearchBackend.java:285)
    at org.graylog.plugins.views.search.elasticsearch.ElasticsearchBackend.doRun(ElasticsearchBackend.java:82)
    at org.graylog.plugins.views.search.engine.QueryBackend.run(QueryBackend.java:86)
    at org.graylog.plugins.views.search.engine.QueryEngine.prepareAndRun(QueryEngine.java:155)
    at org.graylog.plugins.views.search.engine.QueryEngine.lambda$execute$6(QueryEngine.java:95)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run$$$capture(CompletableFuture.java:1604)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
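The error message suggests using a keyword field instead. A rough sketch of what that could look like with an Elasticsearch multi-field follows, written as Python dictionaries that mirror the JSON bodies. The sub-field name `raw`, the `ignore_above` value, and the aggregation name `by_value` are assumptions for illustration, and how such a mapping is installed for Graylog-managed indices is not covered here.

```python
import json

# Hypothetical sub-field name; the query must use the same one.
SUBFIELD = "raw"

# Mapping fragment: `message` stays full-text searchable as `text`,
# while `message.raw` stores the exact value as `keyword`.
mapping = {
    "properties": {
        "message": {
            "type": "text",
            "fields": {
                # Assumed limit: longer values are not indexed in the sub-field.
                SUBFIELD: {"type": "keyword", "ignore_above": 8191}
            },
        }
    }
}


def terms_aggregation(field: str, size: int = 10) -> dict:
    """Build a search body that counts documents per distinct value of `field`."""
    return {"size": 0, "aggs": {"by_value": {"terms": {"field": field, "size": size}}}}


# Aggregating on the text field is the kind of request that fails as shown above.
failing = terms_aggregation("message")
# Aggregating on the keyword sub-field does not need fielddata.
working = terms_aggregation("message." + SUBFIELD)

print(json.dumps(mapping, indent=2))
print(json.dumps(working, indent=2))
```

Only documents indexed after the mapping change would carry the sub-field, so older indices would still fail on the same aggregation.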