Closed RevoluPowered closed 5 years ago
AFAIK, decimals are stored as doubles in the index. When they are deserialized, they are of course decimal again, so you only need to worry about storage precision (unless you have business logic in elastic update scripts, then I guess the precision is double)
Thanks for your reply, we need more precision in our storage than double could provide for our monetary calculations, I am curious is there a good way of implementing a way of storing decimal to two ints perhaps, instead of to the double type in es?
I would probably make two properties for the integers with a getter that gets the integer part on serialization. The setters, (only called on deserialization) just sets a backing field. I would mark them [Obsolete("Only for serialization/deserialization", error: true)] to ensure that they are only used to serialize and not from code.
The decimal getter should have some logic to tell whether it should be initialized from the integers (only on first use after deserialization). I would set [Ignore] to avoid any confusion by it being stored and indexed, unless it is needed for sorting where I guess double precision is ok?
Something similar to this?
public class Currency
{
public Currency(decimal value)
{
Value = value;
}
// stored in elastic
public int pence;
public int pounds;
// could be lazy loaded too, to prevent having to do math when returning the value
[Ignore]
public decimal Value{
get
{
// add fractions of a pence to pounds amount in decimal types.
return ((decimal)pence/100m)+ (decimal)pounds;
}
set {
// note: not tested this yet
pounds = Math.Truncate(value); // pounds / dollars
pence = value - Math.Truncate(value); // pence / cents value
}
}
}
Then I could maybe do some further work on this so instead of having to declare 3 vars, I could have a Currency type? which serializes into elastic safely with the two int's?
So in future it uses something nested like this:,
public class Car
{
public Currency price = new Currency( 2500.50m );
}
Shame there isn't a tidier way of doing this, I guess though it's better knowing things are all POCO and no special cases exist, then it can be planned for appropriately.
If you only are storing two decimals (pence), then I am pretty sure that double precision for storage is enough. At least for int pounds. Eg.:
public class Car
{
[Ignore] // do not store without rounding
public decimal Price { get; set; }
[Obsolete("Only for serialization/deserialization, use Price", error: true)]
[Number("Price")] // named Price in elastic for transparency
public decimal StoredPrice {
get { return Math.Round(Price, 2); }
set { Price = Math.Round(value, 2); }
}
}
Even simpler, only store the price in pence, if the floating point representation isn't needed in elastic
public class Car
{
[Ignore] // do not store without rounding
public decimal Price { get; set; }
[Obsolete("Only for serialization/deserialization, use Price", error: true)]
public long PencePrice {
get { return Price*100L; }
set { Price = value/100m; }
}
}
Love the solutions on this thread, we inherit our decimal
serialization behaviour from Json.NET here which has no lossless support:
https://github.com/JamesNK/Newtonsoft.Json/issues/1726
Another solution would be to create a custom json.net serializer that reads and writes longs while also mapping the Price
field as scaled_float
in Elasticsearch:
https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html
This could be something we oughta be considering shipping out of the box support for, thoughts @codebrain @russcam ?
Currency storage is a little bit of a minefield. In most domains you very quickly get into the situation where you need to store both the currency representation and the ISO code.
https://github.com/dgg/nmoneys is a pretty decent resource on how to do this in .NET - I believe the value type used is decimal
(https://github.com/dgg/nmoneys/blob/b9da020206cbdff51ff68bc53f599f68ee1e13df/src/NMoneys/Serialization/Data.net.cs#L39)
I terms of storing in ES... difficult if the data store does not support precise decimal values (128 bit).
I think if we try to work around this in the client we'd very quickly get into situations like:
decimal
value as a double
in ES in order to use numeric-based functions (ranges / aggregations etc).decimal
value as a string
in order to support lossless parsing when exact value is needed or Store decimal
as two integer
values to represent whole units and fractional units - both versions require some kind of parsing after retrieval.I think we run the risk of being a little over-opinionated on our default implementation if we place this inside the client code - however, its a perfect candidate for an example serialiser.
We store huge fractions of a pence unfortunately for what we're doing so 2dp is not enough. we use carbon per kWh, so fractions of a pence is extremely common in the smallest calculation, plus we do the calculation across literally millions of docs. @Henr1k80
I'd love to see elastic fully support currency, obviously it's a headache but probably worth it if you develop a good solution as the code examples above have some overhead which across 1m+ docs would be slow, especially if you're using multiple fields.
@codebrain Perhaps the ISO code wouldn't matter in some systems, could disable it by default and only enable it if someone specifies that the field requires it?
for example in my system we only use GBP for now, anything in USD etc would just be using a currency conversion lookup.
I would envisage currency support in two stages:
decimal
types in ES core.string
and decimal
.I'll ping our ES core developers and see where (and if) decimal
support is on the roadmap.
@RevoluPowered your best option at the moment, if you want exact precision per individual value, is to store as a long
or int
where something like 1000000
= $1.000000
and convert accordingly.
This issue has some context: https://github.com/elastic/elasticsearch/issues/17006. In short aggregations use doubles internally, which makes supporting any data type more accurate than double
probably deceptive in practice since values would be casted to the nearest double when aggregating. So we decided against supporting a decimal
type.
@codebrain yeah I will probably go with the long solution as a workaround for now, it would be nice to see something official though, this wrapper we have above is definitely not following the KISS principle.
@jpountz yeah, I get what you're saying but it might be the case that it needs a re-think, I'm fairly sure clients are storing calculations in C# using decimal, and blindly overlooking that its storing as a double in ES core, I think that could potentially go quite wrong.
Overall though, it's a lot of work so I'll leave it to you guys! Thanks for the feedback on this.
There is a section in the mapping documentation about mapping of .NET types to Elasticsearch field datatypes, with an important admonition about decimal
type. This came about from discussion on https://github.com/elastic/elasticsearch-net/issues/2463
Just had a double precision issue in our aggregations and one of our configurations for our devices.
Is ES Core going to put decimal on the roadmap? if not I might have to make something down the custom serializer route for decimal.
If that is the case is there any way to handle anything other than double in aggregations, I assume it handles long, int but would a custom serializer function in aggs?
@RevoluPowered there isn't a way to plug your own serialization into aggregations in NEST; you can control the complete serialization/deserialization with the low level client, Elasticsearch.Net. Here's an example
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var settings = new ConnectionConfiguration(pool);
var client = new ElasticLowLevelClient(settings);
var response = client.Search<StringResponse>(PostData.Serializable(
new {
size = 0,
aggs = new
{
genres = new
{
terms = new
{
field = "genre"
}
}
}
}
));
var jObject = JsonConvert.DeserializeObject<JObject>(response.Body);
This returns the response from Elasticsearch as a string, then deserializes to a Json.NET JObject
.
I'm going to close this issue for now as there isn't anything further actionable in the client that we can do for decimal serialization.
In C# we use decimal to represent money, I am just wondering if decimal is stored in elastic as two integers or in a decimal format which is okay for use with currency?
Or alternatively, how is the decimal type stored inside ES?
What's the error rate? etc