Open yozlet opened 8 years ago
I just discovered the same thing. For quick reference, here's a sample query:
@siruguri would you have time to take a look at what's going on here?
I have to re-install elastic search because I got a new laptop, and remind myself how to load up some dummy data... I can take a whirl at that this afternoon, and hopefully at least point out what the deal is... not sure if I can get around to fixing/testing it in the next few days...
On Wed, Feb 17, 2016 at 2:19 PM, Sarah Allen notifications@github.com wrote:
@siruguri https://github.com/siruguri would you have time to take a look at what's going on here?
— Reply to this email directly or view it on GitHub https://github.com/18F/open-data-maker/issues/278#issuecomment-185432915 .
in controllers.rb, line 75, the .to_json
calls .to_s
on each item in
the hash data
. Floats are BigDecimals; so to_s makes them strings which I
think is the behavior of BigDecimal
, whereas the integers are Fixnums
(maybe?).
So the solution would be either to intercept the to_s
if possible in
JSON
, or post-process the output.
I am not too sure when I can get to this, maybe next week? I know I also have a failing test request on another issue.
On Wed, Feb 17, 2016 at 2:27 PM, Sameer Siruguri siruguri@gmail.com wrote:
I have to re-install elastic search because I got a new laptop, and remind myself how to load up some dummy data... I can take a whirl at that this afternoon, and hopefully at least point out what the deal is... not sure if I can get around to fixing/testing it in the next few days...
On Wed, Feb 17, 2016 at 2:19 PM, Sarah Allen notifications@github.com wrote:
@siruguri https://github.com/siruguri would you have time to take a look at what's going on here?
— Reply to this email directly or view it on GitHub https://github.com/18F/open-data-maker/issues/278#issuecomment-185432915 .
So the writer of BigDecimal
actually wanted the default implementation to be that floating points are rendered as strings by the to_json
method, to reduce the possibility of (small) errors during de/re-serialization of data. As far as I could figure, it seems like monkey-patching BigDecimal
is the only solution, with the caveat that it will produce this corner-case error.
Should I do that? Should we instead let the numbers be strings so that the client system can decide how to re-cast to their version of the floating-point type?
Sorry to be so slow, @siruguri! Ugh, this is a nasty one, and I'm really torn. Providing floats as strings feels like giving a worse user/developer experience in return for a level of accuracy that - in all the existing use cases, anyway - really isn't needed. On the other hand, Open-data-maker is meant to be domain-agnostic, and I can't speak for the needs of future data sets and their users. (On the other other hand, I don't know what ES uses for its floats internally; the accuracy may already have been compromised before the stats were emitted.)
My inclination is to monkey-patch in favour of a better user experience. What does everyone else think?
I'm open to either solution though I think I'd rather un-string the value in the open-data-maker code than monkey-patch BigDecimal. I'll work on this today.
On Fri, Mar 4, 2016 at 12:18 PM, Yoz Grahame notifications@github.com wrote:
Sorry to be so slow, @siruguri https://github.com/siruguri! Ugh, this is a nasty one, and I'm really torn. Providing floats as strings feels like giving a worse user/developer experience in return for a level of accuracy that - in all the existing use cases, anyway - really isn't needed. On the other hand, Open-data-maker is meant to be domain-agnostic, and I can't speak for the needs of future data sets and their users. (On the other other hand, I don't know what ES uses for its floats internally; the accuracy may already have been compromised before the stats were emitted.)
My inclination is to monkey-patch in favour of a better user experience. What does everyone else think?
— Reply to this email directly or view it on GitHub https://github.com/18F/open-data-maker/issues/278#issuecomment-192448370 .
In fact, now that I think about it,
Providing floats as strings feels like giving a worse user/developer experience in return for a level of accuracy that - in all the existing use cases, anyway - really isn't needed.
sounds equiv to, "let's cast the BigDecimal
object via .to_f
" Should I do that instead? Thoughts? Then .to_json
will continue to produce non-stringified numbers.
@siruguri Yes, that sounds like the right option to me. Thank you!
Example:
... whereas what we get when querying ES directly is:
I initially thought this was something to do with Ruby mis-parsing scientific notation, but now I suspect it's something in our type-detection output code.