opensearch-project / index-management

🗃 Automate periodic data operations, such as deleting indices at a certain age or performing a rollover at a certain size
https://opensearch.org/docs/latest/im-plugin/index/
Apache License 2.0
52 stars 107 forks source link

[BUG] querying rollup on avg and value_count throws null_pointer_exception #1179

Open asgerjensen opened 1 month ago

asgerjensen commented 1 month ago

Describe the bug

When i include a metric in the rollup, with sum, and value_count and avg, and later on try to use those fields in search, i get an error back for value_count and avg. sum, min, and max seem to work.

To Reproduce Create rollup on populated index Try to query based on avg

Expected behavior Return the average value

OpenSearch Version 2.14.0

Dashboards Version

Plugins Rollup

Screenshots

{
    "rollup": {
        "source_index": "order-history",
        "target_index": "rollup-order-history-dashboard2",
        "schedule": {
            "interval": {
                "period": 5,
                "unit": "Minutes",
                "start_time": "1713420981074"
            }
        },
        "enabled": true,
        "page_size": 200,
        "delay": 0,
        "description": "Desc",
        "continuous": false,
        "dimensions": [
            {
                "date_histogram": {
                    "source_field": "orderDate",
                    "calendar_interval": "1M",
                    "timezone": "Europe/Copenhagen"
                }
            },
            {
                "terms": {
                    "source_field": "customerNumber"
                }
            }
    ],
   "metrics": [
            {
                "source_field": "grandTotal",
                "metrics": [
                    {
                        "avg": {}
                    },
                    {
                        "sum": {}
                    },
                    {
                        "max": {}
                    },
                    {
                        "min": {}
                    },
                    {
                        "value_count": {}
                    }
                ]
            }
      ]
}

(after indexing is complete search like)

{
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "customerNumber": "123456"
                    }
                },
                {
                    "range": {
                        "orderDate": {
                            "gte": "now-16M"
                        }
                    }
                }
            ]
        }
    },
    "aggregations": {
        "orderDashboardAggregations": {
            "date_histogram": {
                "field": "orderDate",
                "calendar_interval": "1M",
                "time_zone": "Europe/Copenhagen"
            },
            "aggs": {
                "orderTotalSum": {
                    "avg": {
                        "field": "grandTotal"
                    }
                }
            }
        }
    }
}

Returns:

{
    "error": {
        "root_cause": [],
        "type": "search_phase_execution_exception",
        "reason": "",
        "phase": "fetch",
        "grouped": true,
        "failed_shards": [],
        "caused_by": {
            "type": "script_exception",
            "reason": "runtime error",
            "script_stack": [
                "sum += a[0]; ",
                "^---- HERE"
            ],
            "script": "double sum = 0; double count = 0; for (a in states) { sum += a[0]; count += a[1]; } return sum/count",
            "lang": "painless",
            "position": {
                "offset": 54,
                "start": 54,
                "end": 67
            },
            "caused_by": {
                "type": "null_pointer_exception",
                "reason": "Cannot invoke \"Object.getClass()\" because \"receiver\" is null"
            }
        }
    },
    "status": 400
}

This seems to stem from RollupUtils.kt

Host/Environment (please complete the following information):

Docker / Postman

Additional context

Seems related to: https://github.com/opendistro-for-elasticsearch/index-management/issues/451 which was never addressed, but does propose a fix. If someone can nod that this fix is reasonable, i will happily do the PR work to get it in.

In my data, its not entirely unlikely, that for a given customer there will be months where he has no orders, and some buckets will therefore be empty based on the range query for date.

dblock commented 3 days ago

[Catch All Triage - Attendees 1, 2, 3, 4]

Looks like a bug, thanks for opening.