Open elasticmachine opened 7 years ago
Original comment by @skearns64:
++, this will help the vast majority of users.
Original comment by @Harvey-Maddocks:
I have created a snapshot dataset called `it_ops_new_raw_snapshot` of an index called `it_ops_new_raw` that will help with testing this behaviour. It contains a type called `logs` (which is just the old `it_ops_app_logs` dataset). Its mapping for the `message` field has both a type `text` and a type `keyword`.
Original comment by @droberts195:
This came out of a Slack chat with @peteharverson. It was also something that was brought up on the IRC channel during the recent ML webinar.
We would expect categorization to be applied to log messages, and we would expect people to be storing log messages in `text` fields, because that's what you have to do to make use of Elasticsearch's text search.
Additionally, the reverse search terms we generate as an output of categorization can only be used to efficiently search `text` fields.
However, at present we make it very hard for people to use a `text` field as their `categorization_field_name` when feeding a job with a datafeed. They have to set the obscure `"_source": true` setting in the JSON.
I propose the following:
- If a field of type `text` is selected as the `categorization_field_name`, we automatically set `"_source": true` in the datafeed config
- If a field of a type other than `text` is selected as the `categorization_field_name`, we warn people that it's unlikely to work well with categorization
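To make the proposal concrete, here is a rough sketch of the two rules in Python. It is illustrative only, not the actual ML or UI code: the helper name `apply_categorization_defaults`, the flat `field_types` dict (field name to mapping type), and the shape of the datafeed config dict are all assumptions made for the example. It also assumes the warning is meant for fields whose type is anything other than `text`.

```python
import warnings


def apply_categorization_defaults(datafeed_config, categorization_field_name, field_types):
    """Return a copy of datafeed_config adjusted for the chosen categorization field.

    Hypothetical helper: field_types maps field names to their mapping type,
    e.g. {"message": "text"}. Not part of any real Elasticsearch API.
    """
    updated = dict(datafeed_config)  # leave the caller's dict untouched
    field_type = field_types.get(categorization_field_name)

    if field_type == "text":
        # Rule 1: text fields need _source, so switch it on automatically.
        updated["_source"] = True
    else:
        # Rule 2 (assumed reading): any other type gets a warning.
        warnings.warn(
            f"Field '{categorization_field_name}' has type {field_type!r}, not 'text'; "
            "it is unlikely to work well with categorization."
        )
    return updated
```

With something like this, choosing `message` when it is mapped as `text` would yield a config containing `"_source": true`, while choosing a `keyword` field would leave the config unchanged and surface the warning.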