LogsDB uses stored fields under the hood to handle malformed field values and as a fallback when doc values are not available. This behaviour favours user experience at the expense of performance, especially indexing performance (CPU and memory overhead). A significant factor is that supporting `ignore_malformed` requires copying some of the data structures involved in the parsing logic. These copies are made whenever `ignore_malformed` is enabled for complex field types, where the entire field content must be preserved to handle potential parsing issues with nested values. In such cases, the copied data structures are used to capture and store the malformed values.
We need to evaluate the performance penalty introduced by this logic and understand its impact, especially on indexing throughput. Ideally, we would like to run two tests using the same index mode, one with `ignore_malformed` enabled and one without. Note also that standard log templates enable `ignore_malformed` by default in order to avoid data loss.
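As a rough illustration of the comparison described above, the sketch below builds two index-creation bodies that differ only in the index-level `index.mapping.ignore_malformed` setting, plus a helper for expressing the throughput difference as a percentage. The helper names (`logsdb_index_body`, `throughput_overhead_pct`) are hypothetical and not part of any benchmark tooling, and the bodies are an assumption about how the two runs might be configured; a real benchmark would likely toggle the setting in the index or component template used by the track instead.

```python
import json


def logsdb_index_body(ignore_malformed: bool) -> dict:
    """Build a create-index body for a LogsDB index (hypothetical helper).

    Assumption: the only difference between the two benchmark runs is the
    index-level `index.mapping.ignore_malformed` setting.
    """
    return {
        "settings": {
            "index": {
                "mode": "logsdb",
                "mapping": {"ignore_malformed": ignore_malformed},
            }
        }
    }


def throughput_overhead_pct(baseline_docs_per_s: float,
                            candidate_docs_per_s: float) -> float:
    """Relative indexing-throughput loss of the candidate run, in percent.

    A positive value means the candidate (ignore_malformed enabled) indexed
    fewer documents per second than the baseline (ignore_malformed disabled).
    """
    if baseline_docs_per_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return (baseline_docs_per_s - candidate_docs_per_s) / baseline_docs_per_s * 100.0


# Same index mode for both runs; only ignore_malformed differs.
baseline_body = logsdb_index_body(ignore_malformed=False)
candidate_body = logsdb_index_body(ignore_malformed=True)

if __name__ == "__main__":
    print(json.dumps(baseline_body, indent=2))
    print(json.dumps(candidate_body, indent=2))
```

Keeping everything else identical (index mode, data set, hardware) means any difference reported by `throughput_overhead_pct` can be attributed to `ignore_malformed` handling rather than to other configuration changes.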
The outcome of this benchmarking activity is crucial for planning any necessary actions to mitigate performance issues in the event of unacceptable overhead.