#86166 added the option for `object` fields in mappings to have a `subobjects: false` setting. This allows field names with dots to be nested inside the object without the usual object/scalar clashes that arise when some scalar fields have more components than others with the same prefix.
For example, `subobjects: false` makes the following document possible:
Historically it was possible to store such a document, but only by completely disabling mappings for the `metrics` object. With `subobjects: false` the dotted fields under `metrics` can all have mappings and participate in searches and aggregations.
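To make the object/scalar clash concrete, here is a rough Python sketch of what goes wrong when dotted names are expanded into nested objects. It is a simplified model, not Elasticsearch's mapping code, and the field names (`time`, `time.min`, `time.max`, `cpu.user`) and the helper `expand_dots` are invented for illustration.

```python
def expand_dots(flat_fields):
    """Expand dotted field names into nested dicts (values assumed to be scalars).

    Simplified model of default object handling: raises ValueError when a
    name would have to be both a scalar and an object.
    """
    root = {}
    for name, value in flat_fields.items():
        node = root
        parts = name.split(".")
        for i, part in enumerate(parts[:-1]):
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                prefix = ".".join(parts[: i + 1])
                raise ValueError(
                    f"[{prefix}] is a scalar but [{name}] needs it to be an object"
                )
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            raise ValueError(
                f"[{name}] is a scalar but other fields need it to be an object"
            )
        node[leaf] = value
    return root


# Hypothetical contents of the metrics object.
metrics = {"time": 100, "time.min": 10, "time.max": 900}

try:
    expand_dots(metrics)
    clash = None
except ValueError as err:
    clash = str(err)

# Keeping the keys flat, which is the effect subobjects: false has inside
# the object, sidesteps the problem: every dotted name is just a leaf.
flat_leaves = sorted(metrics)
```

In this model `time` cannot be both the value `100` and the parent of `min` and `max`, so expansion fails, whereas the flat form keeps all three names as independent leaves.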
It is currently possible to create a job that analyses all these fields as the `field_name` of detector functions.
But suppose we also have dotted fields that we want to use as split fields for our job, for example `attributes.service` and `attributes.service.administrator`.
Job creation now fails if we try to reference multiple fields under `attributes` where one field name is a dotted prefix of another, for example:
{
  "statusCode": 400,
  "error": "Bad Request",
  "message": "[x_content_parse_exception: [status_exception] Reason: Fields [attributes.service] and [attributes.service.administrator] cannot both be used in the same analysis_config]: [1:359] [cluster:admin/xpack/ml/job/estimate_model_memory] failed to parse field [analysis_config]",
  "attributes": {
    "body": {
      "error": {
        "root_cause": [
          {
            "type": "status_exception",
            "reason": "Fields [attributes.service] and [attributes.service.administrator] cannot both be used in the same analysis_config"
          }
        ],
        "type": "x_content_parse_exception",
        "reason": "[1:359] [cluster:admin/xpack/ml/job/estimate_model_memory] failed to parse field [analysis_config]",
        "caused_by": {
          "type": "status_exception",
          "reason": "Fields [attributes.service] and [attributes.service.administrator] cannot both be used in the same analysis_config"
        }
      },
      "status": 400
    }
  }
}
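We prevent this so that the fields can be included in our anomaly records. Each record stores the values of its split fields under the fields' own names, so with the current results mappings `attributes.service` would have to be both a scalar and an object.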
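The check behind that error can be sketched roughly as follows. This is a Python approximation for discussion, not the actual Java validation in Elasticsearch. The function names are made up, and it assumes the check only needs to look at the split field names of a job.

```python
def find_field_clash(field_names):
    """Return the first (shorter, longer) pair where `shorter` is a dotted
    prefix of `longer`, or None if no such pair exists."""
    names = sorted(set(field_names))
    # A dotted prefix always sorts before the longer name, so it is enough
    # to compare each name with the names that follow it.
    for i, shorter in enumerate(names):
        for longer in names[i + 1:]:
            if longer.startswith(shorter + "."):
                return shorter, longer
    return None


def validate_split_fields(field_names):
    """Raise ValueError with a message shaped like the one in the response above."""
    clash = find_field_clash(field_names)
    if clash is not None:
        shorter, longer = clash
        raise ValueError(
            f"Fields [{shorter}] and [{longer}] cannot both be used "
            "in the same analysis_config"
        )


try:
    validate_split_fields(["attributes.service.administrator", "attributes.service"])
    message = None
except ValueError as err:
    message = str(err)
```

Sibling fields such as `attributes.service` and `attributes.host` pass this check. Only a pair where one name extends the other with a further `.component` is rejected.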
Alternatively, we could allow jobs to be created with fields like this and change the mappings on our results indices instead. However, there is a problem: because results indices can be shared, the results index may already exist with mappings that are incompatible with specifying `subobjects: false` in the results mappings.
It's tricky to incorporate this validation at the parsing stage, as the parser cannot be expected to check the mappings on an existing index.
We have two options:
1. Change nothing. `subobjects: false` will work with anomaly detection jobs if the dotted fields are used as metrics, and this was the intended use case, as seen in the PR title of #86166.
2. Change our `analysis_config` parser to permit field names that would clash in the results if adding `subobjects: false` as a results mapping is not possible. Then fail when actually creating the job if creating our desired mappings is not possible. There is already a precedent for failing at this time: if the latest job would push the number of mapped fields in the shared results index over 1000, we fail the job creation at the point of modifying the results index.
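Option 2 could look something like the sketch below. Everything here is hypothetical: the function names, the idea of representing the results index as a set of mapped field names plus a flag for whether its mappings already use `subobjects: false`, and the simplification that only the field names passed in count towards the limit. It is meant to show where the check would move to, not how it would be implemented.

```python
from typing import Iterable, Optional, Set

MAX_MAPPED_FIELDS = 1000  # the field-count limit mentioned in option 2


def has_prefix_clash(names: Iterable[str]) -> bool:
    """True if any name is a dotted prefix of another name."""
    ordered = sorted(set(names))
    return any(
        longer.startswith(shorter + ".")
        for i, shorter in enumerate(ordered)
        for longer in ordered[i + 1:]
    )


def check_at_job_creation(
    new_fields: Iterable[str],
    existing_fields: Optional[Set[str]],
    index_has_subobjects_false: bool,
) -> None:
    """Hypothetical creation-time check against the results index.

    existing_fields is None when the results index does not exist yet, in
    which case it is assumed it can be created with subobjects: false.
    """
    if existing_fields is None:
        existing_fields = set()
        index_has_subobjects_false = True

    combined = set(existing_fields) | set(new_fields)
    if len(combined) > MAX_MAPPED_FIELDS:
        raise ValueError("results index would exceed the mapped field limit")
    if has_prefix_clash(combined) and not index_has_subobjects_false:
        raise ValueError(
            "results index mappings cannot hold these dotted fields "
            "without subobjects: false"
        )


# The parser would accept these names; only job creation can reject them.
fields = ["attributes.service", "attributes.service.administrator"]
check_at_job_creation(fields, None, False)  # fresh results index: accepted
```

With an existing shared results index that was created without `subobjects: false`, the same call would raise. That is the failure mode option 2 defers to job creation time.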