ipld / specs

Content-addressed, authenticated, immutable data structures

Specify encoding of numbers in DAG-CBOR #80

Open vmx opened 5 years ago

vmx commented 5 years ago

Numbers can be represented in different ways in CBOR. We should specify how they are encoded so that all implementations produce the same binary representation, and hence the same hash.

For integers we can follow Canonical CBOR as specified in the CBOR RFC and use the smallest possible representation.
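As a rough illustration of what "smallest possible representation" means for integers, here is a minimal Python sketch. The function name `encode_int` is hypothetical and not taken from any IPLD library; it only follows the CBOR rules for major types 0 (unsigned) and 1 (negative), picking the shortest argument width that fits.

```python
import struct

def encode_int(n: int) -> bytes:
    """Encode an integer using the shortest CBOR form (hypothetical helper)."""
    if n >= 0:
        major, arg = 0, n          # major type 0: unsigned integer
    else:
        major, arg = 1, -1 - n     # major type 1: negative integer, stores -1 - n
    if arg < 24:
        # Values 0..23 fit directly into the initial byte.
        return bytes([(major << 5) | arg])
    # Otherwise pick the narrowest of the 1, 2, 4, or 8 byte arguments.
    widths = (
        (24, ">B", 0xFF),
        (25, ">H", 0xFFFF),
        (26, ">I", 0xFFFFFFFF),
        (27, ">Q", 0xFFFFFFFFFFFFFFFF),
    )
    for info, fmt, limit in widths:
        if arg <= limit:
            return bytes([(major << 5) | info]) + struct.pack(fmt, arg)
    raise OverflowError("integer does not fit into a 64-bit CBOR argument")
```

For example, `encode_int(23)` takes one byte, while `encode_int(24)` needs two; an encoder that emitted 23 with a one-byte argument would produce valid CBOR but a different hash.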

For floats I'd also follow the Canonical CBOR recommendation and use the shortest encoding that preserves the value.
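A sketch of how the float rule could look, assuming the approach of trying the 16-bit, then 32-bit, then 64-bit IEEE 754 encodings and keeping the first one that round-trips exactly. `encode_float` is a hypothetical name, and the fixed half-precision NaN is an assumption made here so that NaN has a single byte representation.

```python
import math
import struct

def encode_float(x: float) -> bytes:
    """Encode a float in the shortest CBOR form that preserves its value."""
    if math.isnan(x):
        # NaN never compares equal to itself, so the round-trip test below
        # cannot be used. Assumption: always emit the half-precision quiet NaN.
        return b"\xf9\x7e\x00"
    # 0xF9 = half precision, 0xFA = single precision.
    for head, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, x)
        except OverflowError:
            continue  # value is too large for this width
        if struct.unpack(fmt, packed)[0] == x:
            return bytes([head]) + packed
    # 0xFB = double precision, which always preserves a Python float.
    return b"\xfb" + struct.pack(">d", x)
```

With this rule `1.5` fits in half precision, `100000.0` needs single precision, and `1.1` can only be represented exactly as a double.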

Then there's the last case, floats without a fraction, which actually triggered this issue (see https://github.com/ipld/interface-ipld-format/issues/9#issuecomment-431029329).

I would do it the js-ipfs way and use an integer if possible, which aligns well with the Converting from JSON to CBOR section of the CBOR RFC. I'm not sure whether "without fractional parts" only means numbers without a decimal point or also includes numbers that have only zeros after the decimal point. I would store numbers with a decimal point followed only by zeros as integers.
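To make the proposal concrete, here is a minimal sketch of the normalization step that would run before encoding. `normalize_number` is a hypothetical helper, and the 64-bit range check is an assumption (CBOR major types 0 and 1 cannot hold anything larger); neither comes from js-ipfs or the spec.

```python
import math

# Assumed limits: the range representable by CBOR major types 0 and 1.
CBOR_INT_MIN = -(2 ** 64)
CBOR_INT_MAX = 2 ** 64 - 1

def normalize_number(x):
    """Turn floats without a fractional part into integers, if they fit."""
    if isinstance(x, float) and math.isfinite(x) and x.is_integer():
        n = int(x)
        if CBOR_INT_MIN <= n <= CBOR_INT_MAX:
            # Note: -0.0 becomes 0 here, so the sign of zero is lost.
            return n
    return x
```

Under this rule `2.0` and `2` end up with the same bytes, while `2.5`, infinities, and NaN stay floats. Whether dropping the sign of `-0.0` is acceptable is one of the details the spec would have to decide.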

Stebalien commented 5 years ago

Based on https://github.com/ipld/specs/blob/master/IPLD-Data-Model-v1.md, we should be using the canonical encoding. The data model doesn't distinguish between different float/int types at the encoding level.

mikeal commented 5 years ago

Can we get this moved/assigned to the proper implementation repo in Go?

vmx commented 5 years ago

@mikeal Wouldn't it make sense to first spec this out in https://github.com/ipld/specs/blob/2a5c3575f28b9393998ec121b6c8b239df4df9f2/block-layer/codecs/DAG-CBOR.md? People reading the dag-cbor spec should be able to implement it easily in a compatible way.

rvagg commented 5 years ago

@vmx thoughts on what's needed to make it a more fully formed spec?

vmx commented 5 years ago

@rvagg I would describe how numbers need to be encoded, e.g. that floats without a fractional part (or with only zeros after the decimal point) are encoded as integers (or not, depending on what we decide to do).

I'll work on this next week.