opensearch-project / anomaly-detection

Identify atypical data and receive automatic notifications
https://opensearch.org/docs/latest/monitoring-plugins/ad/index/
Apache License 2.0

[FEATURE] Flatten result index mapping for visualizing nested objects in Dashboards #1306

Open jackiehanyang opened 2 months ago

jackiehanyang commented 2 months ago

Is your feature request related to a problem? Many values are not flattened, making it difficult to view them on the dashboard. For instance, entity values are nested objects, and features are arrays. The requirement is to reference a feature by name and apply conditions like f1 > 3. Additionally, there is a need to perform terms aggregation on categorical fields. This will require adjustments to the mapping and the addition of new fields in the result index.
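To make the request concrete, here is a minimal Python sketch of the transformation being asked for. The document shape (an `entity` list of name/value pairs and a `feature_data` list carrying `feature_name` and `data`) and the output field names `entity_<name>` and `feature_data_<name>` are assumptions for illustration, not the plugin's confirmed mapping.

```python
from typing import Any, Dict


def flatten_result(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc with one top-level field per entity and per feature."""
    flat = dict(doc)  # the original nested fields are kept alongside the new ones
    for entity in doc.get("entity", []):
        # Hypothetical naming scheme: entity_<name>
        flat[f"entity_{entity['name']}"] = entity["value"]
    for feature in doc.get("feature_data", []):
        # Hypothetical naming scheme: feature_data_<feature_name>
        flat[f"feature_data_{feature['feature_name']}"] = feature["data"]
    return flat


# Made-up sample result document, for illustration only.
sample = {
    "anomaly_grade": 0.8,
    "entity": [{"name": "host", "value": "server_1"}],
    "feature_data": [{"feature_id": "abc", "feature_name": "f1", "data": 4.2}],
}
flat = flatten_result(sample)
```

With top-level fields like these, a condition such as f1 > 3 becomes a plain range filter on `feature_data_f1`, and a terms aggregation can target `entity_host` directly.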

What solution would you like? Approach 1. Flattening the result index on the AD backend side, keeping both the original nested fields and the newly added flattened fields. This approach involves updating the index mapping everywhere we create or update the result index so that it includes the new flattened fields. You can find an example of the updated result index mapping in the GitHub issue linked above. Pros:

Cons:
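As a rough illustration of the mapping change in Approach 1, the sketch below adds flattened fields to an existing mapping while leaving the nested ones in place. The helper name, the `entity_<name>` / `feature_data_<name>` naming, and the choice of `keyword` and `double` types are assumptions, not the mapping from the linked example.

```python
import copy
from typing import Any, Dict, Iterable


def add_flattened_fields(
    mapping: Dict[str, Any],
    entity_names: Iterable[str],
    feature_names: Iterable[str],
) -> Dict[str, Any]:
    """Return a new mapping with flattened fields added next to the nested ones."""
    updated = copy.deepcopy(mapping)  # never mutate the caller's mapping
    props = updated.setdefault("properties", {})
    for name in entity_names:
        # keyword fields can be used in terms aggregations
        props.setdefault(f"entity_{name}", {"type": "keyword"})
    for name in feature_names:
        # numeric fields can be used in range conditions
        props.setdefault(f"feature_data_{name}", {"type": "double"})
    return updated


# Simplified stand-in for the existing result index mapping.
original = {
    "properties": {
        "entity": {"type": "nested"},
        "feature_data": {"type": "nested"},
    }
}
updated = add_flattened_fields(original, ["host"], ["f1"])
```

Because the flattened field names depend on each detector's categorical fields and feature names, a change like this would have to run wherever the result index is created or updated, which is the cost the approach describes.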

Approach 2. Flattening the result index on the AD backend side, but keeping only the flattened fields. This approach is similar to Approach 1, but instead of retaining both nested and flattened fields in the index mapping, we would keep only the flattened fields. Cons:

Approach 3: Flattening the result index mapping on the OpenSearch Dashboards side. The challenge is that, unlike the current dashboard, which lets customers access nested objects using standard dot path notation, we would need the nested field values themselves to be part of the dot path notation. However, these values are not indexed for fast lookup. After consulting with the dashboard team and researching the issue, it is unclear whether this solution can be implemented to support aggregation on nested fields. Pros:

Cons:

Approach 4 (proposed): Using ingest processors to update the custom result index with flattened fields. Ingest processors are the core components of ingest pipelines, which preprocess documents before indexing. Using an ingest processor would let us add flattened fields to custom result index documents before they are indexed. We can prepare the pipeline for you with a button click on the AD dashboard. After processing, the custom result index would contain both the existing nested fields and the flattened fields. Pros:

Cons:
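To illustrate Approach 4, the following Python sketch builds the body of an ingest pipeline with a single `script` processor that copies nested values into top-level fields. The Painless script, the document field names (`entity[].name`, `entity[].value`, `feature_data[].feature_name`, `feature_data[].data`), and the flattened naming scheme are assumptions for illustration; the pipeline the plugin would actually generate may differ.

```python
import json
from typing import Any, Dict

# Painless source run once per document before indexing.
# Field names are assumed, not taken from the plugin's code.
FLATTEN_SCRIPT = """
if (ctx.entity != null) {
  for (def item : ctx.entity) {
    ctx['entity_' + item.name] = item.value;
  }
}
if (ctx.feature_data != null) {
  for (def item : ctx.feature_data) {
    ctx['feature_data_' + item.feature_name] = item.data;
  }
}
"""


def build_flatten_pipeline(
    description: str = "Flatten AD result fields (illustrative)",
) -> Dict[str, Any]:
    """Build an ingest pipeline body containing one script processor."""
    return {
        "description": description,
        "processors": [
            {"script": {"lang": "painless", "source": FLATTEN_SCRIPT}}
        ],
    }


# JSON body that could be sent when registering the pipeline.
pipeline_body = json.dumps(build_flatten_pipeline(), indent=2)
```

A body like this would be registered with `PUT _ingest/pipeline/<pipeline-name>` and attached to the custom result index through the `index.default_pipeline` setting, so that each result document gains the flattened fields while keeping the nested ones.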

dblock commented 1 month ago

[Catch All Triage - 1, 2, 3, 4]

jackiehanyang commented 3 weeks ago

After setting up the ingest pipeline to flatten the nested fields, I can see the new flattened fields on the index pattern page. However, on the visualization side, the Field dropdown list does not load the newly added flattened fields. I have created an issue on the OSD side to track this: https://github.com/opensearch-project/OpenSearch-Dashboards/issues/8722