elastic / package-spec

EPR package specifications

Allow `index.mapping.ignore_malformed` to be set inside index templates #730

Closed kcreddy closed 3 months ago

kcreddy commented 3 months ago

Currently, for elasticsearch_index_template, only a few properties are allowed to be set under index.mapping: https://github.com/elastic/package-spec/blob/main/spec/integration/data_stream/manifest.spec.yml#L211-L221. We need to allow index.mapping.ignore_malformed to be set inside index templates as well.

The issue without this property: when a transform is involved, the source data stream contains index.mapping.ignore_malformed: true by default, so malformed documents are ingested. For example, in https://github.com/elastic/integrations/issues/9360, a value of 178.21.14.0/23 is ingested into the source data stream's field threat.indicator.ip. However, the transform fails to index into the destination, because there index.mapping.ignore_malformed is false by default. It is not possible to override this value, as it follows the same index template definition.
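To illustrate the mismatch described above, here is a minimal sketch in Python. It is a toy model, not Elasticsearch code: `index_document` is a hypothetical helper, and it assumes the only mapped field of interest is threat.indicator.ip with type `ip`. It uses the standard `ipaddress` module to show that a CIDR such as 178.21.14.0/23 is not a valid single IP address, and then mimics the two behaviours: with ignore_malformed enabled the document is kept and the bad field is skipped (Elasticsearch records such fields in `_ignored`), and with it disabled the whole document is rejected.

```python
import ipaddress

IP_FIELD = "threat.indicator.ip"


def index_document(doc, ignore_malformed):
    """Toy model of indexing one document into an index where
    threat.indicator.ip is mapped as `ip` (hypothetical helper)."""
    value = doc.get(IP_FIELD)
    if value is None:
        return dict(doc)
    try:
        # A CIDR such as "178.21.14.0/23" is not a single IP address,
        # so this raises ValueError.
        ipaddress.ip_address(value)
    except ValueError:
        if ignore_malformed:
            # Keep the document, skip only the malformed field,
            # and note the field name under `_ignored`.
            indexed = {k: v for k, v in doc.items() if k != IP_FIELD}
            indexed["_ignored"] = [IP_FIELD]
            return indexed
        # Without ignore_malformed the whole document is rejected.
        raise ValueError(f"failed to parse field [{IP_FIELD}] of type [ip]: {value!r}")
    return dict(doc)
```

In this model, the source data stream corresponds to `index_document(doc, ignore_malformed=True)` and the transform destination to `index_document(doc, ignore_malformed=False)`: the same document is accepted by the first call and rejected by the second.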

jsoriano commented 3 months ago

@kcreddy thanks for opening the issue. To be sure I understand, the defaults for ignore_malformed in data streams and in transforms are different? If that's the case, maybe we want to always set index.mapping.ignore_malformed to true for integrations? Would there be any case where this is not wanted? If we want to always set ignore_malformed to true, we can do this from Fleet.

kcreddy commented 3 months ago

Hey @jsoriano

> To be sure I understand, the defaults for ignore_malformed in data streams and in transforms are different?

Yes, that seems to be the case. The source data stream and the destination index have different ignore_malformed values.

> If that's the case, maybe we want to always set index.mapping.ignore_malformed to true for integrations? Would there be any case where this is not wanted?

I think having the same value should help fix the issue. I agree with changing it to true by default for the transform's destination indices. @andrewkroh, do you see any issue with changing index.mapping.ignore_malformed to true on the transform's destination index? By default it is set to false, which is causing the transform to fail.

andrewkroh commented 3 months ago

> If we want to always set ignore_malformed to true [for data stream and transform indices], we can do this from Fleet.

I think that is what we should do. The least surprising thing from a developer standpoint would be for any data streams or indices that get created through Fleet integrations to have similar configurations including ignore_malformed.
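As a rough illustration of that consistency goal, the sketch below compares one setting across two nested settings dictionaries. Everything here is an assumption for illustration: `get_setting` and `settings_mismatch` are hypothetical helpers, the dictionaries are hand-written rather than fetched from a cluster, and values are modelled as Python booleans. The fallback of `False` reflects that ignore_malformed is disabled unless set.

```python
def get_setting(settings, dotted_key, default=None):
    """Look up a dotted key such as 'index.mapping.ignore_malformed'
    in a nested settings dict (hypothetical helper)."""
    node = settings
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def settings_mismatch(source, destination, key="index.mapping.ignore_malformed"):
    """Return True when the two settings dicts disagree on `key`.
    A missing value is treated as False."""
    return get_setting(source, key, False) != get_setting(destination, key, False)


# Hand-written examples: the source sets the flag, the destination does not.
source_settings = {"index": {"mapping": {"ignore_malformed": True}}}
destination_settings = {"index": {"mapping": {}}}
```

With these example inputs, `settings_mismatch(source_settings, destination_settings)` reports a mismatch, which is the situation that makes the transform fail in this issue.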

jsoriano commented 3 months ago

Agreed, I opened an issue in the Kibana repo: https://github.com/elastic/kibana/issues/179445. Closing this one.

Thanks!