Closed redwheelbarrow closed 8 months ago
Not sure this is something that can be addressed, but I will drop some notes from my research on this topic.
First of all, there's no 64-bit signed integer in JSON. The JSON RFC doesn't specify a limit on number precision, but it suggests that implementations stay within what a 64-bit float can represent exactly, for interoperability. That puts the range of safely interoperable integers at [-(2^53)+1, (2^53)-1].
9_223_372_036_854_775_807 (max signed int64)
9_007_199_254_740_991 (2**53-1)
2**53-1 also matches Number.MAX_SAFE_INTEGER in JavaScript, so one can argue that Fauxton supports the "safe" range for JSON numbers as per the RFC.
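The two limits quoted above can be checked in any JavaScript engine with BigInt support. This is a minimal sketch using only built-ins (`BigInt`, `Number.MAX_SAFE_INTEGER`, `Number.isSafeInteger`); the variable names are made up for illustration.

```javascript
// Largest integer a 64-bit float can hold without gaps between neighbours: 2**53 - 1.
const maxSafe = 2n ** 53n - 1n;
// Largest signed 64-bit integer: 2**63 - 1.
const maxInt64 = 2n ** 63n - 1n;

console.log(maxSafe.toString());  // 9007199254740991
console.log(maxInt64.toString()); // 9223372036854775807
console.log(maxSafe === BigInt(Number.MAX_SAFE_INTEGER)); // true

// Just above the safe range, neighbouring integers collapse to the same float.
const aboveSafe = Number.MAX_SAFE_INTEGER + 1;
console.log(aboveSafe === aboveSafe + 1);     // true
console.log(Number.isSafeInteger(aboveSafe)); // false
```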
I can't speak for the CouchDB implementation but it doesn't seem to impose a limit on number values as I was able to add the ridiculously long value below. That means it's treating the JSON as a sequence of characters which is exactly what it is, without converting to any specific machine representation for numbers.
{"_id":"doc9","_rev":"1-25ce58eaae037addc94b1fcb06385f5e","n":92233720368547769995425325439238473892174983721894738921478392714893721892}
But there's a catch: if you use a map/reduce view, you hit the same loss of precision, since the value goes through the JS engine. For instance, using the view below
function (doc) {
  if (doc.n) {
    emit(doc._id, doc.n);
  }
}
I get these values:
{"total_rows":5,"offset":0,"rows":[
{"id":"doc2","key":"doc2","value":2132143},
{"id":"doc6","key":"doc6","value":9223372036854776000},
{"id":"doc7","key":"doc7","value":9223372036854778000},
{"id":"doc8","key":"doc8","value":9007199254740991},
{"id":"doc9","key":"doc9","value":9.223372036854777e+73}
]}
where
{"_id":"doc6","_rev":"1-6292f628aa9e691f51018f0cf1953e37","n":9223372036854776807}
{"_id":"doc9","_rev":"1-25ce58eaae037addc94b1fcb06385f5e","n":92233720368547769995425325439238473892174983721894738921478392714893721892}
So my take is that there's an implicit limit to what numbers you can store in CouchDB so you don't run into loss of precision, and that limit matches what Fauxton supports.
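The rounding seen for doc6 can be reproduced outside CouchDB with a plain parse/serialize round trip, which is roughly what any JavaScript path does to a JSON number. This is only a sketch: the `_rev` field is omitted for brevity, and the variable names are illustrative.

```javascript
// Keep the document as raw JSON text so nothing is rounded before parsing.
const rawDoc = '{"_id":"doc6","n":9223372036854776807}';

// JSON.parse turns every JSON number into a 64-bit float.
const parsed = JSON.parse(rawDoc);

// Serializing again writes out the rounded float, not the original digits.
const roundTripped = JSON.stringify(parsed);

console.log(roundTripped);                   // {"_id":"doc6","n":9223372036854776000}
console.log(Number.isSafeInteger(parsed.n)); // false
console.log(roundTripped === rawDoc);        // false
```

`Number.isSafeInteger` is a cheap way to detect, after parsing, that a value may already have been altered.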
All that said, the potential for data loss is still there. In theory, Fauxton could be updated to treat JSON as a string only and never parse the value as an object, but the same would have to be true for any JS dependencies in use. All in all, it's a hard ask, and I'll leave it at that for now.
Agree with @Antonio-Maranhao. Any JSON numbers passed through a JS environment, whether the browser or the indexing JS engine (currently SpiderMonkey), will have these issues.
To maintain precision at those sizes, try storing numbers as strings.
Agreeing with both: this is a consequence of the number passing through a JavaScript engine, rather than anything inherent to CouchDB itself. You can store an integer field in a CouchDB document that goes well beyond that (as Erlang supports multi-precision ints), but avoiding all JS paths is a bit trickier. A JSON index would avoid it, as would a built-in reduce.
Also agreeing that if you want numeric precision beyond 64-bit floating point within Javascript you'll need to store them as something other than a JSON number (strings, say), and then use a math library to manipulate them (noting that math.js has a serialization format that's a JSON object with string values).
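As a minimal sketch of the store-as-string approach: the example below uses the built-in `BigInt` in place of a math library such as math.js, which is a substitution for illustration and not what the comments above prescribe. It assumes a runtime with BigInt support (a current browser or Node.js); whether the JS engine bundled with a given CouchDB release supports it is not confirmed here. The document shape is hypothetical.

```javascript
// Hypothetical document: the large value is stored as a JSON string, not a JSON number.
const rawDoc = '{"_id":"doc6","n":"9223372036854776807"}';
const doc = JSON.parse(rawDoc);

// Convert to BigInt only where arithmetic is actually needed.
const incremented = BigInt(doc.n) + 1n;

// Convert back to a string before serializing: JSON.stringify throws on BigInt values.
const updated = JSON.stringify({ ...doc, n: incremented.toString() });

console.log(updated); // {"_id":"doc6","n":"9223372036854776808"}
```

Because the value never exists as a JavaScript number, every digit survives parsing and saving.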
"That means it's treating the JSON as a sequence of characters which is exactly what it is, without converting to any specific machine representation for numbers."
This isn't true. It's just that Jiffy (CouchDB's JSON parser) is capable of handling bignums easily, since Erlang has bignum support built in.
This works by detecting when it successfully parses a number (i.e., we found a string that matches JSON grammar) but isn't capable of storing the value in a native type. When that happens, we set a flag and wrap the sub-binary with a tagged tuple that is then processed in Erlang.
The logic in C can be found here:
And then when the parsed JSON value is passed back to Erlang we run this function over it to get real bignums:
https://github.com/davisp/jiffy/blob/master/src/jiffy.erl#L111-L142
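The two-phase idea described above (tag the raw text when a number does not fit a native type, then convert the tagged values in a second pass) can be sketched in JavaScript. This is not Jiffy's code: the function names, the `{ tag, text }` shape, and the use of the safe-integer range as the "native" limit are all assumptions made for illustration.

```javascript
// Phase 1 (the C side in Jiffy's case): the token already matched the JSON number
// grammar. If it is an integer that a native number cannot hold exactly, keep the
// raw text and tag it instead of converting it.
function decodeNumberToken(token) {
  const isInteger = /^-?\d+$/.test(token);
  if (isInteger && !Number.isSafeInteger(Number(token))) {
    return { tag: "bignum", text: token }; // hypothetical tag shape
  }
  return Number(token);
}

// Phase 2 (the Erlang side in Jiffy's case): walk the decoded value and replace
// every tagged placeholder with a real arbitrary-precision integer.
function finishDecode(value) {
  if (Array.isArray(value)) return value.map(finishDecode);
  if (value !== null && typeof value === "object") {
    if (value.tag === "bignum") return BigInt(value.text);
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, finishDecode(inner)])
    );
  }
  return value;
}

const decoded = finishDecode({
  _id: "doc6",
  small: decodeNumberToken("2132143"),
  n: decodeNumberToken("9223372036854776807"),
});
console.log(typeof decoded.small, typeof decoded.n); // number bigint
```

A real parser would need a tag that cannot collide with user data; a plain object is used here only to keep the sketch short.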
That said, if you're working with large numbers you'll want to either ensure that your JSON parser has bignum support, or follow @nickva's advice and store your numbers as strings to be interpreted at the application level.
So in summary:
Gonna go ahead and close this, thank you
In Fauxton, the JavaScript engine is your browser. For CouchDB indexes (map/reduce, search, mango) or other endpoints (like _update or validate_doc_update) it will be SpiderMonkey (the JS engine that Firefox uses).
Swapping out the JS engines won't help (and isn't possible within your browser, afaik), but we never suggested that. Instead, store your larger-than-natively-supported numbers in some format that JavaScript won't break (like strings) and use other JavaScript to manipulate them (e.g., https://mathjs.org/).
Fauxton displays 64-bit numbers from CouchDB incorrectly. Saving the document changes the stored number to the incorrectly displayed value.
Expected Behavior
The displayed number should be what is in the database
Current Behavior
When saving documents with larger 64-bit numbers, the value displayed when viewing the document in Fauxton is usually rounded to the next 1000. Querying CouchDB directly shows the original number. Saving the document in Fauxton changes the number in CouchDB to the incorrectly displayed one.
Possible Solution
Not familiar with code base
Steps to Reproduce (for bugs)
Context
Your Environment