elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Use Well Known Binary Format for `geo_point` synthetic source #108990

Open salvatore-campagna opened 1 month ago

salvatore-campagna commented 1 month ago

Description

Currently, geo_point doc values are quantized when they are stored. As a result, reconstructing a document in synthetic source from doc values loses accuracy, and the reconstructed document does not match the original source. We can fix the accuracy issue by using the Well Known Binary (WKB) format, and consider giving users a storage versus accuracy tradeoff by taking advantage of the store option.

The idea is to make store: true the default and use WKB for the stored field. When store: true is set, we can reconstruct the document in synthetic source from the stored field without accuracy loss. When store: false is used, we can fall back to the quantized representation in doc values, trading accuracy for reduced storage.
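
To illustrate the tradeoff described above, here is a minimal Python sketch (not Elasticsearch code). It assumes a quantization scheme that maps each coordinate onto a 32-bit integer grid, in the style of Lucene's geo encoding; the scale factors and all function names are hypothetical and chosen for illustration only. It compares that lossy round trip with a WKB point, which stores both coordinates as full 64-bit doubles.

```python
import math
import struct

# Assumption: 32 bits per coordinate, latitude spans 180 degrees and
# longitude spans 360 degrees. The real doc-values encoding may differ.
BITS = 32
LAT_SCALE = (1 << BITS) / 180.0
LON_SCALE = (1 << BITS) / 360.0


def quantize(lat, lon):
    """Snap a point onto the integer grid (lossy)."""
    return math.floor(lat * LAT_SCALE), math.floor(lon * LON_SCALE)


def dequantize(lat_q, lon_q):
    """Map grid integers back to degrees."""
    return lat_q / LAT_SCALE, lon_q / LON_SCALE


def to_wkb_point(lat, lon):
    """Encode a 2D point as little-endian WKB.

    Layout: 1 byte for byte order (1 = little-endian), a uint32 geometry
    type (1 = Point), then x (longitude) and y (latitude) as doubles.
    """
    return struct.pack("<BId" "d", 1, 1, lon, lat)


def from_wkb_point(buf):
    """Decode a little-endian WKB point back to (lat, lon)."""
    byte_order, geom_type, lon, lat = struct.unpack("<BIdd", buf)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("expected a little-endian WKB Point")
    return lat, lon


if __name__ == "__main__":
    original = (48.8566, 2.3522)  # example coordinates, (lat, lon)
    print("quantized round trip:", dequantize(*quantize(*original)))
    print("WKB round trip:      ", from_wkb_point(to_wkb_point(*original)))
    print("WKB size in bytes:   ", len(to_wkb_point(*original)))
```

Under these assumptions the quantized form needs 8 bytes per point but returns a slightly different value, while the WKB form needs 21 bytes and returns the original doubles unchanged. That is the storage versus accuracy choice the store option would expose.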

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-storage-engine (Team:StorageEngine)

lkts commented 1 month ago

Same as #108981 ?