Closed ORBAT closed 6 years ago
So, actually the serialization of Noms' number type is currently arbitrary-precision, so it can represent integers. The only thing is that the API uses float64. What we should do instead is change the API to use some bignum abstraction (maybe optionally?).
I think my original thought was that you could marshal a Noms Number
on to any of the builtin Go numeric types, or some other bignum type.
Thanks for the answer, not to mention the patience.
So, actually the serialization of Noms' number type is currently arbitrary-precision, so it can represent integers. The only thing is that the API uses float64. What we should do instead is change the API to use some bignum abstraction (maybe optionally?).
Ah, OK. So, I've got a few further questions and comments: looking at writeNumber
, the way I take it is that the mantissa of Number
is effectively limited by PutVarInt
(so int64
) if the serialization is to be backwards-compatible (?).
Number
's serialization and it being arbitrarily precise (with a max mantissa of math.MaxInt64
)? How are integers with a width of over the mantissa represented without loss?Number
be backed by big.Float
(plus modifying float64ToIntExp
et al in number_util.go
), + needed changes to the number-related decoders&encoders in marshal
? How would this approach deal with serializing integers over math.MaxInt64
?Personally I'd still like to have a separate integer type (although the serialization could be the same as with Number
), as it'd allow you to be explicit about whether you expect floating point numbers or integers, leading to fewer surprises and runtime problems (with eg. truncation) or outright crashes as the types in noms would map more closely to types used by the application, and something that you decode to an integer will always be "backed" by an integer.
Having a separate type while allowing to have guarantees about truncation could be avoided somewhat with e.g. struct tags (unmarshaling could havenotrunc
,integer
or the like to make truncation an error, for example). What are your views on providing an explicit Int
type vs. some other solution with just Number
?
Ah, OK. So, I've got a few further questions and comments: looking at writeNumber, the way I take it is that the mantissa of Number is effectively limited by PutVarInt (so int64) if the serialization is to be backwards-compatible (?).
My recollection is that the encoding used by PutVarint
can be extended to arbitrary length (it's just that the api uses int64
).
would a better way to approach this be having Number be backed by big.Float (plus modifying float64ToIntExp et al in number_util.go), + needed changes to the number-related decoders&encoders in marshal? How would this approach deal with serializing integers over math.MaxInt64?
There are a number of layers in the design to consider independently:
1/ What happens at the serialization level. How do we serialize numbers? Are there different types for different classes of numbers in Noms, or just one?
2/ What happens at the types.Value
layer? What Golang types represent the Noms number types? Right now it is always float64
. The sort of natural thing in the current design would be to change this to big.Float
. We could extend types.Number
with various conveniences to get in and out of the standard Go numeric types.
3/ What happens at the marshaling layer. Not sure if you're even familiar with this API layer, but there's the marshal
package, which we used extensively in our own applications of Noms and was very convenient high-level API to work with Noms. Currently (I think) you can marshal Noms Number
type onto any Golang numeric type, and it panics if the conversion can't be done cleanly.
Personally I'd still like to have a separate integer type (although the serialization could be the same as with Number)
At what layer are you thinking? We could introduce better support at higher layers for different kinds of numeric types without having to have explicit distinction at the serialization level.
, as it'd allow you to be explicit about whether you expect floating point numbers or integers, leading to fewer surprises and runtime problems (with eg. truncation) or outright crashes as the types in noms would map more closely to types used by the application, and something that you decode to an integer will always be "backed" by an integer.
In our usages, we used the marshal package to deal with this problem and other related ones ("what shape struct am I expecting at this location", etc).
Looking at this all again I'm wondering why we didn't choose to serialize as a rational number, since it can represent more values accurately :-/. Bummer.
@ORBAT I'm going to go ahead and close this in favor of eventually implementing something like: #3806
Hi folks!
To fix #3801, I've started work on adding an integer type backed by
int64
. This PR is more of an RFC, so the feature should not be considered finished.What I'd like is some feedback on whether what I'm doing is
My current idea is to not do
uint*
support, at least for the time being. It might be a good idea to add that while I'm at it, but this has turned out to be a, uh, non-trivial change.Still to do (at least):
TestMapSuite4KStructs/TestChunkCountAndType
. Why did chunk count go from expected 11 to actual 14 without any changes to the test‽TestMapMutationReadWriteCount
. Why did read count go from expected 105 to 206 without any changes to the test‽DateTimeType
?