apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

BUG when sorting (realtime) topN results by postAggregation vs aggregation #6375

Open max-schmidt54321 opened 6 years ago

max-schmidt54321 commented 6 years ago

I'm running a simple query to get the trend of pageviews on recent data (middle manager):

{
   "queryType":"topN",
   "dataSource":"pageviews",
   "dimension":"onlineId",
   "metric":{
      "type":"numeric",
      "metric":"score"
   },
   "granularity": {"type": "duration", "duration": 1900000, "origin": "2018-09-25T09:12:00.000Z"},
   "threshold": 60,
   "intervals":[
      "2018-09-25T09:12:00.000Z/PT30M"
   ],
   "aggregations":[
      {
         "type":"filtered",
         "filter":{
            "type":"interval",
            "dimension":"__time",
            "intervals":[
               "2018-09-25T09:12:00.000Z/PT15M"
            ]
         },
         "aggregator":{
            "type":"longSum",
            "name":"total_old",
            "fieldName":"count"
         }
      },
      {
         "type":"filtered",
         "filter":{
            "type":"interval",
            "dimension":"__time",
            "intervals":[
               "2018-09-25T09:27:00.000Z/PT15M"
            ]
         },
         "aggregator":{
            "type":"longSum",
            "name":"total_new",
            "fieldName":"count"
         }
      }
   ],
   "postAggregations":[
      {
         "type":"arithmetic",
         "name":"score",
         "fn":"-",
         "fields":[
            {
               "type":"fieldAccess",
               "fieldName":"total_new"
            },
            {
               "type":"fieldAccess",
               "fieldName":"total_old"
            }
         ]
      }
   ]
}
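For anyone trying to reproduce this, the following is a minimal Python sketch that builds the query above with a configurable sort metric, so the two variants differ in nothing but `metric`. The Broker address (`localhost:8082`) and the helper names `interval_sum`, `build_query`, `build_request`, and `run` are assumptions made for illustration, not part of the original report.

```python
import json
import urllib.request

# Assumption: a Broker reachable here; adjust host and port for your cluster.
BROKER_URL = "http://localhost:8082/druid/v2/"


def interval_sum(name, interval):
    """Filtered longSum over "count", limited to one __time interval."""
    return {
        "type": "filtered",
        "filter": {
            "type": "interval",
            "dimension": "__time",
            "intervals": [interval],
        },
        "aggregator": {"type": "longSum", "name": name, "fieldName": "count"},
    }


def build_query(sort_metric):
    """Return the topN query from the report, sorted by `sort_metric`."""
    return {
        "queryType": "topN",
        "dataSource": "pageviews",
        "dimension": "onlineId",
        "metric": {"type": "numeric", "metric": sort_metric},
        "granularity": {
            "type": "duration",
            "duration": 1900000,
            "origin": "2018-09-25T09:12:00.000Z",
        },
        "threshold": 60,
        "intervals": ["2018-09-25T09:12:00.000Z/PT30M"],
        "aggregations": [
            interval_sum("total_old", "2018-09-25T09:12:00.000Z/PT15M"),
            interval_sum("total_new", "2018-09-25T09:27:00.000Z/PT15M"),
        ],
        "postAggregations": [
            {
                "type": "arithmetic",
                "name": "score",
                "fn": "-",
                "fields": [
                    {"type": "fieldAccess", "fieldName": "total_new"},
                    {"type": "fieldAccess", "fieldName": "total_old"},
                ],
            }
        ],
    }


def build_request(sort_metric):
    """Wrap the query in a POST request for the native query endpoint."""
    body = json.dumps(build_query(sort_metric)).encode("utf-8")
    return urllib.request.Request(
        BROKER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(sort_metric):
    """Send the query and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(sort_metric)) as resp:
        return json.load(resp)
```

Calling `run("score")` and `run("total_new")` back to back against the same realtime data should then give the two result sets to compare.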

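The results differ depending on which metric I sort by. Sorting by "score" returns wrong values, while sorting by "total_new" returns correct results. Note that the aggregated values themselves change, not just the order: for onlineId 10264391, total_old is 344 when sorted by "score" but 1250 when sorted by "total_new".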

Example Results:

Sorted by postAggregation "metric":"score"

"result": [
            {
                "total_old": 344,
                "onlineId": "10264391",
                "total_new": 1424,
                "score": 1080
            },
            {
                "total_old": 134,
                "onlineId": "6372612",
                "total_new": 606,
                "score": 472
            },
            {
                "total_old": 12,
                "onlineId": "10271038",
                "total_new": 263,
                "score": 251
            },
            {
                "total_old": 53,
                "onlineId": "10261042",
                "total_new": 285,
                "score": 232
            },
        ...
]

Sorted by aggregation "metric":"total_new"

"result": [
            {
                "total_old": 1250,
                "onlineId": "10264391",
                "total_new": 1424,
                "score": 174
            },
            {
                "total_old": 421,
                "onlineId": "6372612",
                "total_new": 606,
                "score": 185
            },
            {
                "total_old": 408,
                "onlineId": "10271038",
                "total_new": 360,
                "score": -48
            },
            {
                "total_old": 251,
                "onlineId": "10261042",
                "total_new": 285,
                "score": 34
            },
        ...
]
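A small Python sketch for comparing the two result sets above. It uses only the four rows shown (the elided rows are left out), and the helper names `score_is_consistent` and `diff_aggregates` are hypothetical. It checks whether `score` equals `total_new - total_old` within each listing, then reports which aggregate values differ per `onlineId` between the two listings.

```python
# Rows copied from the two result excerpts; elided rows are not included.
by_score = [
    {"onlineId": "10264391", "total_old": 344, "total_new": 1424, "score": 1080},
    {"onlineId": "6372612", "total_old": 134, "total_new": 606, "score": 472},
    {"onlineId": "10271038", "total_old": 12, "total_new": 263, "score": 251},
    {"onlineId": "10261042", "total_old": 53, "total_new": 285, "score": 232},
]
by_total_new = [
    {"onlineId": "10264391", "total_old": 1250, "total_new": 1424, "score": 174},
    {"onlineId": "6372612", "total_old": 421, "total_new": 606, "score": 185},
    {"onlineId": "10271038", "total_old": 408, "total_new": 360, "score": -48},
    {"onlineId": "10261042", "total_old": 251, "total_new": 285, "score": 34},
]


def score_is_consistent(rows):
    """True if every row satisfies score == total_new - total_old."""
    return all(r["score"] == r["total_new"] - r["total_old"] for r in rows)


def diff_aggregates(a, b, fields=("total_old", "total_new")):
    """Map onlineId -> {field: (value_in_a, value_in_b)} for differing fields."""
    b_by_id = {r["onlineId"]: r for r in b}
    diffs = {}
    for row in a:
        other = b_by_id.get(row["onlineId"])
        if other is None:
            continue  # id only present in one listing
        changed = {f: (row[f], other[f]) for f in fields if row[f] != other[f]}
        if changed:
            diffs[row["onlineId"]] = changed
    return diffs


if __name__ == "__main__":
    print(score_is_consistent(by_score), score_is_consistent(by_total_new))
    for online_id, changed in diff_aggregates(by_score, by_total_new).items():
        print(online_id, changed)
```

On these rows the subtraction holds in both listings, and the listings disagree on `total_old` for all four ids and on `total_new` for 10271038. So in this excerpt the discrepancy is in the filtered aggregates themselves rather than in the arithmetic applied to them.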

Notes:

max-schmidt54321 commented 6 years ago

UPDATE: The same query works with the other arithmetic functions (*, +, /). Only subtraction produces wrong results.

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.