opensearch-project / OpenSearch-Dashboards

📊 Open source visualization dashboards for OpenSearch.
https://opensearch.org/docs/latest/dashboards/index/
Apache License 2.0
1.63k stars 848 forks source link

[BUG] avg and value_count in rollups fail with null_pointer_exception. #6890

Closed asgerjensen closed 2 months ago

asgerjensen commented 2 months ago

Describe the bug

When i include a metric in the rollup, with sum, and value_count and avg, and later on try to use those fields in search, i get an error back for value_count and avg. sum, min, and max seem to work.

To Reproduce Create rollup on populated index Try to query based on avg

Expected behavior Return the average value

OpenSearch Version 2.14.0

Dashboards Version

Plugins Rollup

Screenshots

{
    "rollup": {
        "source_index": "order-history",
        "target_index": "rollup-order-history-dashboard2",
        "schedule": {
            "interval": {
                "period": 5,
                "unit": "Minutes",
                "start_time": "1713420981074"
            }
        },
        "enabled": true,
        "page_size": 200,
        "delay": 0,
        "description": "Desc",
        "continuous": false,
        "dimensions": [
            {
                "date_histogram": {
                    "source_field": "orderDate",
                    "calendar_interval": "1M",
                    "timezone": "Europe/Copenhagen"
                }
            },
            {
                "terms": {
                    "source_field": "customerNumber"
                }
            }
    ],
   "metrics": [
            {
                "source_field": "grandTotal",
                "metrics": [
                    {
                        "avg": {}
                    },
                    {
                        "sum": {}
                    },
                    {
                        "max": {}
                    },
                    {
                        "min": {}
                    },
                    {
                        "value_count": {}
                    }
                ]
            }
      ]
}

(after indexing is complete search like)

{
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "customerNumber": "123456"
                    }
                },
                {
                    "range": {
                        "orderDate": {
                            "gte": "now-16M"
                        }
                    }
                }
            ]
        }
    },
    "aggregations": {
        "orderDashboardAggregations": {
            "date_histogram": {
                "field": "orderDate",
                "calendar_interval": "1M",
                "time_zone": "Europe/Copenhagen"
            },
            "aggs": {
                "orderTotalSum": {
                    "avg": {
                        "field": "grandTotal"
                    }
                }
            }
        }
    }
}

Returns:

{
    "error": {
        "root_cause": [],
        "type": "search_phase_execution_exception",
        "reason": "",
        "phase": "fetch",
        "grouped": true,
        "failed_shards": [],
        "caused_by": {
            "type": "script_exception",
            "reason": "runtime error",
            "script_stack": [
                "sum += a[0]; ",
                "^---- HERE"
            ],
            "script": "double sum = 0; double count = 0; for (a in states) { sum += a[0]; count += a[1]; } return sum/count",
            "lang": "painless",
            "position": {
                "offset": 54,
                "start": 54,
                "end": 67
            },
            "caused_by": {
                "type": "null_pointer_exception",
                "reason": "Cannot invoke \"Object.getClass()\" because \"receiver\" is null"
            }
        }
    },
    "status": 400
}

Host/Environment (please complete the following information):

Docker / Postman

Additional context

Seems related to: https://github.com/opendistro-for-elasticsearch/index-management/issues/451 which was never addressed, but does propose a fix.

In my data, its not entirely unlikely, that for a given customer there will be months where he has no orders, and some buckets will therefore be empty based on the range query for date.

asgerjensen commented 2 months ago

Closed, moved to index-management repo, sorry.