michalmuskala / jason

A blazing fast JSON parser and generator in pure Elixir.
Other
1.6k stars 170 forks source link

Adds float to decimal decoding option #115

Closed tiagonbotelho closed 2 years ago

tiagonbotelho commented 3 years ago

Floats in elixir/erlang are notoriously known for being prone to precision errors, which can be dangerous when dealing with currencies, money, etc.

All issues that Decimal aims to mitigate.

This commit adds the option for floats to be immediately converted to Decimals without going through float conversion.

The approach was heavily inspired on the decoding options for strings and keys.

TODO:

tiagonbotelho commented 3 years ago

Hey everyone 👋 This is still very much a work in progress, but it'd be good to get some validation as to if this is the right approach/path forward 😊 Thanks in advance!

tiagonbotelho commented 3 years ago

@michalmuskala it would be great to get your feedback if possible 😊 Thanks in advance!

tiagonbotelho commented 3 years ago

Thanks for the feedback @michalmuskala 🙌 I agree that this needs some benchmarking as well, I'll be sure to add them ASAP

tiagonbotelho commented 3 years ago

@michalmuskala I've used the Benchmark guidelines to generate some of the requested benchmarks.

The datasets don't have a ton of data that can be converted into floats aside from the JSON Generator one which contains a few latitude/longitude, so I'd use that one as a better indicator

bench_float_decode_options.txt - This benchmark was run running on this current branch but without using the float to decimal decoding option

bench_master.txt - As the name suggests this one was run over the master branch

[bench_using_float_decode_options.txt] - (https://github.com/michalmuskala/jason/files/5404799/bench_using_float_decode_options.txt) - Run the same benchmark but this time using the float: :decimal option

As we can observe from the JSON generator there is around a 1ms slowdown on the 99th percentile when using the conversion to decimal.

Let me know if you have any additional questions or tests you'd like me to run 😊

tiagonbotelho commented 3 years ago

Also I'm not quite sure what to do regarding the failing step in the GitHub checks 🤔 it looks like Decimal isn't available in that environment?

michalmuskala commented 3 years ago

I pushed a commit in 6074b0866b4aac7f17451104cf1381642508ee23 that should fix the dialyzer issue

tiagonbotelho commented 3 years ago

Thanks @michalmuskala but it seems like there might be yet another issue with it? 🤔 Or am I doing something wrong myself? 😅

EDIT: I was definitely doing something wrong myself and I clearly didn't read the error message close enough! Apologies

tiagonbotelho commented 3 years ago

@michalmuskala I believe this PR is ready for another review 😊 Thanks in advance!

klittlepage commented 3 years ago

@tiagonbotelho I created a branch based off on your work to generalize both float and int handling, and addresses #130. If you and @michalmuskala are interested in incorporating that work, I can open a PR against your fork or this one.

tiagonbotelho commented 3 years ago

Sorry for taking so long to respond @klittlepage 😅 I'd absolutely would be up to combine our work together.

This PR has been sitting inactive for quite a bit of time now but it'd be good to see if we can get it reviewed sometime soon

klittlepage commented 3 years ago

Hey @tiagonbotelho I just opened a PR against your branch. I'll close my PR against this repo after that's merged. Thanks!

teamon commented 2 years ago

👋 What is missing to get this merged? How can I help? 😼

bryanhuntesl commented 2 years ago

Anything blocking getting this closed/merged?

michalmuskala commented 2 years ago

The main concern was performance. We'll have to re-run those benchmarks on newer OTP versions since a lot of the compiler and runtime changed. The main limiting factor, though, was me not looking into it, I'll try to set some time aside this week.

bryanhuntesl commented 2 years ago

The main concern was performance. We'll have to re-run those benchmarks on newer OTP versions since a lot of the compiler and runtime changed. The main limiting factor, though, was me not looking into it, I'll try to set some time aside this week.

If you need a (big) box or there's some script that I can just run (I'm searching for it right now) we're happy to help out.

teamon commented 2 years ago

Is there any smart way to run a proper bench against two different versions of the same lib? 🤔

I did a quick run on master and tiagonbotelho:float-decode-options branches and it's definitely inconlusive - https://gist.github.com/teamon/e804e2cc74aeaeb56fcf816b1f9192f7/revisions

michalmuskala commented 2 years ago

Merged manually in 1ffe009b6d9117de05b61e1fbf9ecc6744052588