rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com

arrayToStream() changes input values supplied via expr() #93

Closed charl closed 11 years ago

charl commented 11 years ago

If you run the following via the Data Explorer:

r.expr([{"id": 267932277667934208},{"id": 267822627970756608}]).arrayToStream().run()

You get:

[ { "id": 267932277667934200 } , { "id": 267822627970756600 } ]

When you'd actually expect to get :

[ { "id": 267932277667934208 } , { "id": 267822627970756608 } ]

Looks like the last digit in both cases was changed to 0.

I am running rethinkdb 1.2.6-0ubuntu1~oneiric.

charl commented 11 years ago

FWIW, I get the expected results when I use the ruby gem:

r.expr([{:id => 267932277667934208},{:id => 267822627970756608}]).array_to_stream.run().map {|m| puts m.inspect}
{"id"=>267932277667934208}
{"id"=>267822627970756608}
=> [nil, nil]

srh commented 11 years ago

Javascript uses double precision arithmetic, which means that not all integers greater than 2 to the 53rd power can be exactly represented. The numeric literals in the example above get converted to the nearest double precision representation before any Javascript code can touch them.

If there is a bug here, it's in our Ruby drivers.

srh commented 11 years ago

> If there is a bug here, it's in our Ruby drivers.

Specifically, the representation of values is supposed to be JSON-equivalent, so the Ruby drivers are behaving incorrectly if they just let values pass through. The Python drivers might have the same problem, too.

srh commented 11 years ago

Well, JSON numeric representations might be a bit more flexible than Javascript, but the server representation is double-precision, to match what Javascript does. The drivers should behave like they would if you round-tripped the query through the server.
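
To make the 2^53 limit concrete, here is a minimal JavaScript sketch. It is not taken from the RethinkDB server or drivers, and the names `limit` and `checks` are only illustrative. It checks where integers stop being exactly representable in a double and whether one of the IDs from this issue falls in the "safe" range:

```javascript
// Every integer up to 2^53 is exactly representable in a double.
const limit = 2 ** 53; // 9007199254740992

const checks = {
  // The largest "safe" integer is 2^53 - 1.
  maxSafe: Number.MAX_SAFE_INTEGER === limit - 1,
  // 2^53 + 1 has no double of its own, so it rounds back to 2^53.
  collapses: limit + 1 === limit,
  // The first ID from this issue is far above the safe range.
  idIsSafe: Number.isSafeInteger(267932277667934208),
};

console.log(checks);
// { maxSafe: true, collapses: true, idIsSafe: false }
```

Being outside the safe range does not mean a particular value is stored inexactly, only that exactness is no longer guaranteed for every integer of that size.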

charl commented 11 years ago

Those numbers are my primary keys, generated by a service like snowflake (https://dev.twitter.com/docs/twitter-ids-json-and-snowflake).

So if I am correct, I would need to use a string representation instead when using them with RethinkDB?

Does using a string instead of an integer like this incur a large (space/speed) penalty?

srh commented 11 years ago

Yes, I would recommend using a string instead of an integer. Right now, using a string will not incur a large penalty. (Also, you can expect us to be concerned about their performance in the future, because our own autogenerated keys are strings.)
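
As a rough sketch of the string approach in plain JavaScript (no RethinkDB calls are used; `byValue` is a hypothetical helper and 267932277667934209 is a made-up neighbouring ID, not one from this issue), string IDs pass through JSON untouched, while the same kind of value sent as a JSON number is turned into a double:

```javascript
// The two IDs from this issue, kept as strings.
const ids = ["267932277667934208", "267822627970756608"];

// Strings survive a JSON round trip unchanged.
const docs = ids.map((id) => ({ id }));
const roundTripped = JSON.parse(JSON.stringify(docs)).map((doc) => doc.id);

// Hypothetical neighbouring ID sent as a JSON number: it is parsed into
// the nearest double and then printed in shortened form.
const lossy = String(JSON.parse('{"id": 267932277667934209}').id);

// If numeric ordering is needed, compare the strings via BigInt.
const byValue = (a, b) => {
  const x = BigInt(a);
  const y = BigInt(b);
  return x < y ? -1 : x > y ? 1 : 0;
};
const sorted = [...ids].sort(byValue);

console.log(roundTripped); // [ '267932277667934208', '267822627970756608' ]
console.log(lossy);        // 267932277667934200
console.log(sorted);       // [ '267822627970756608', '267932277667934208' ]
```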

mlucy commented 11 years ago

The Ruby client does send the numbers to the server in the example he gave, where they're converted to a double. A double seems to be able to hold this particular value accurately:

printf("%lf\n", (double)267932277667934208LLU);
// => 267932277667934208.000000

So I think the error is somewhere in javascript-land.

srh commented 11 years ago

The problem is in Javascript's number printing code.

> var x = 267932277667934208
undefined
> x
267932277667934200
> x % 10
8

This behavior's exhibited in Chrome's and Firefox's JS consoles.

srh commented 11 years ago

> x.toFixed()
"267932277667934208"

So this will be closed.
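
For anyone who wants to reproduce this outside a browser console, a small sketch (variable names are illustrative) comparing the default number-to-string conversion with `toFixed()` and a `BigInt` conversion. The stored value is exact; only the default printing shortens it:

```javascript
const x = 267932277667934208; // this literal is exactly representable as a double

// Default conversion: the shortest digit string that still parses back to x.
const shortest = String(x);
// toFixed() with no argument prints the stored value with zero decimals.
const exact = x.toFixed();
// Converting the double to a BigInt also exposes the exact stored integer.
const viaBigInt = BigInt(x).toString();

console.log(shortest);               // 267932277667934200
console.log(exact);                  // 267932277667934208
console.log(viaBigInt);              // 267932277667934208
console.log(Number(shortest) === x); // true
```

The last line shows why the shortened output is still a valid representation of the same double, even though it looks like a different integer.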

@charl: I noticed that both the IDs you provided happen to be multiples of 4096. This is the sort of thing that will happen if you take an integer randomly chosen in the [0, 2^64) space and round it to the nearest double precision floating point representation. If these IDs are supposed to be generated as such, it's possible that you might have collapsed distinct IDs into the same double precision value.
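
A quick way to check both points, sketched with `BigInt` so the arithmetic is exact. The IDs `a` and `b` below are hypothetical neighbours made up for illustration, not values from this issue:

```javascript
// The two IDs from this issue, as exact integers.
const ids = [267932277667934208n, 267822627970756608n];

// Both are multiples of 4096 (2^12).
const multiplesOf4096 = ids.map((id) => id % 4096n === 0n);

// Two distinct hypothetical IDs next to the first one.
const a = 267932277667934209n;
const b = 267932277667934210n;

// Converted to doubles, they land on the same value.
const collapsed = Number(a) === Number(b);

// Gap between the first ID and the next double above it.
const gap = Number(ids[0] + 17n) - Number(ids[0]);

console.log(multiplesOf4096); // [ true, true ]
console.log(collapsed);       // true
console.log(gap);             // 32
```

At this magnitude (between 2^57 and 2^58) adjacent doubles are 32 apart, so any IDs that differ only within such a gap would end up as the same stored number.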