elastic / elasticsearch-java

Official Elasticsearch Java Client
Apache License 2.0

Java client and direct CURL call do not return the same result #843

Open mbarbet opened 2 weeks ago

mbarbet commented 2 weeks ago

Java API client version

8.13.3

Java version

17.0.11

Elasticsearch Version

8.14.1

Problem description

We run a GeoTile Grid aggregation with two average aggregations. The curl call and the Java client do not return the same JSON result for the buckets. Below is the curl response:

{
    "key": "10/455/466",
    "doc_count": 26,
    "avg#avg:AOD565": {
        "value": 0.08212617407692308
    },
    "avg#avg:AOD566": {
        "value": null
    }
}

And the Java client response:

{
    "key": "10/455/466",
    "doc_count": 26,
    "avg#avg:AOD565": {
        "value": 0.08212617407692308
    },
    "avg#avg:AOD566": {
        "value": 0.0
    }
}

None of the documents in the 10/455/466 grid cell have a value for the AOD566 property. With curl, the average of this property is null; with the Java client, it is 0.0. Is there a way to obtain the curl result with the Java client?

l-trotta commented 2 weeks ago

Hello! The Java client defaults to zero when deserializing aggregation results. This is a design choice made to avoid boxing numbers, which could increase memory usage substantially; we currently don't plan to change this. To check whether the result is actually zero or the field is missing, as in this case, it's possible to use the other response fields (for example source).
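
To illustrate the trade-off described above, here is a minimal sketch in plain Java. It does not use the client's actual classes: `PrimitiveAvg` and `BoxedAvg` are hypothetical holders invented for this example. It only shows why a primitive `double` field cannot carry a JSON `null`, whereas a boxed `Double` can.

```java
public class PrimitiveDefaultDemo {

    // Hypothetical holder with a primitive field: there is no way to
    // represent "no value", so an unset field reads as 0.0.
    static final class PrimitiveAvg {
        double value;
    }

    // Hypothetical holder with a boxed field: null can signal "no value",
    // at the cost of one extra object per number.
    static final class BoxedAvg {
        Double value;
    }

    public static void main(String[] args) {
        PrimitiveAvg primitive = new PrimitiveAvg();
        BoxedAvg boxed = new BoxedAvg();

        System.out.println(primitive.value); // prints 0.0
        System.out.println(boxed.value);     // prints null
    }
}
```

With the primitive holder, a real average of 0.0 and a missing average are indistinguishable, which matches the behaviour reported for `avg#avg:AOD566`.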

l-trotta commented 1 week ago

Hello again! For most aggregation responses where the Java client defaults null values to 0.0, it's possible to disambiguate the correct value by checking the doc_count field. In this specific case it looks like this isn't possible, so we're trying to figure out a way to solve this issue without breaking the current implementation. Would a boolean isEmpty() method in the aggregation response class work for you? You could call it to tell whether an aggregation value of 0.0 is an actual 0 or was null in the response.
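
As a rough sketch of how such a method could behave, the plain-Java example below keeps the primitive `double` and records emptiness in a separate flag. Everything here is hypothetical: `AvgResult`, `fromJson`, and the way the flag is set are assumptions made for illustration, not the client's actual classes or its eventual design.

```java
public class AvgResultSketch {

    // Hypothetical response class: the value stays a primitive double,
    // and a boolean flag remembers whether the JSON value was null.
    static final class AvgResult {
        private final double value;
        private final boolean empty;

        private AvgResult(double value, boolean empty) {
            this.value = value;
            this.empty = empty;
        }

        // 'raw' stands in for the parsed JSON "value"; null means JSON null.
        static AvgResult fromJson(Double raw) {
            return raw == null
                    ? new AvgResult(0.0, true)
                    : new AvgResult(raw, false);
        }

        double value() {
            return value;
        }

        boolean isEmpty() {
            return empty;
        }
    }

    // Renders a result, distinguishing a real number from a missing one.
    static String describe(String name, AvgResult result) {
        return name + ": " + (result.isEmpty() ? "no value" : String.valueOf(result.value()));
    }

    public static void main(String[] args) {
        // Values taken from the bucket shown in the problem description.
        AvgResult aod565 = AvgResult.fromJson(0.08212617407692308);
        AvgResult aod566 = AvgResult.fromJson(null);

        System.out.println(describe("avg:AOD565", aod565));
        System.out.println(describe("avg:AOD566", aod566));
    }
}
```

In this sketch `value()` still returns 0.0 for the empty case, so existing callers keep working, and only callers that need the distinction check `isEmpty()`.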

mbarbet commented 1 week ago

Hello! Thank you for your feedback and your suggestion. What matters to us is to recover the empty information for each metric in the aggregation. It would be perfect if you added it!