ipld / specs

Content-addressed, authenticated, immutable data structures

Best practices for IEEE-754 non-finite numbers (NaN, Infinity) #346

Open vmx opened 3 years ago

vmx commented 3 years ago

Do we have a FAQ-like/best practices document? I'd like to see the question: "How do I represent IEEE-754 non-finite numbers like NaN or Infinity in IPLD?"

I guess the answer is something along the lines of:

This is application specific. If your application uses decimal numbers, it is almost certainly using IEEE-754 floating-point arithmetic. You might not have given any thought to values like Infinity or NaN, and hence don't really support them. In that case the right approach can be to error if you encounter those values. If you don't want to fail hard on them, you could use null instead (if you're using IPLD Schemas, use the nullable type modifier). If you want to preserve those values, use strings (if you're using IPLD Schemas, use an enum), e.g.:

type Ieee754NonFinite enum {
  | PositiveZero ("+0")
  | NegativeZero ("-0")
  | SignallingNan ("sNaN")
  | QuietNan ("qNaN")
  | PositiveInfinity ("+Inf")
  | NegativeInfinity ("-Inf")
}
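
The three options described above (fail, fall back to null, or preserve the value as a string token) could be sketched roughly as follows. This is only an illustration in Python and not part of any IPLD library: the function name `encode_float` and its `policy` parameter are made up for the example, and the token strings are borrowed from the enum above. Python has no portable way to tell signalling NaNs from quiet ones, so this sketch maps every NaN to "qNaN". Negative zero is a finite value, so it passes through unchanged here.

```python
import math


def encode_float(value: float, policy: str = "error"):
    """Return a value that is safe to hand to an IPLD encoder.

    `policy` is a hypothetical switch: "error", "null", or "string".
    """
    if math.isfinite(value):
        # Ordinary floats (including +0.0 and -0.0) pass through untouched.
        return value
    if policy == "error":
        raise ValueError(f"non-finite float not supported: {value!r}")
    if policy == "null":
        # Lossy: the caller can no longer tell NaN from Infinity.
        return None
    if policy == "string":
        if math.isnan(value):
            return "qNaN"
        return "+Inf" if value > 0 else "-Inf"
    raise ValueError(f"unknown policy: {policy!r}")
```

With the "string" policy the field can hold either a float or a string, so a reader of the data has to expect both kinds in that position.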

rvagg commented 3 years ago

Yeah, this is good. I started writing something just now in #344 to cover some of this but it felt like there's already enough in that PR for now and I'm not even sure data-model.md is the best place for this (I don't have a better suggestion though).

My sentiment went something like this:

NaN and the Infinity values are just special programmatic tokens and should be treated as such in serialized data. When you serialize a token with special meaning, you do it in a clear and consistent way (which IEEE 754 doesn't do). Options for encoding include strings, bytes or integers, where the enumeration of possible forms is limited and pre-defined, e.g. "Infinity", "NaN", etc., or simply 1 for Infinity, 2 for NaN, etc. This can be expressed with an IPLD Schema enum for clarity.
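
The integer-token variant mentioned above might look something like the sketch below. It is written in Python purely for illustration. The names `NonFinite`, `to_token` and `from_token` are hypothetical, and the value 3 for negative infinity is an assumption added here; the comment above only suggests 1 for Infinity and 2 for NaN.

```python
import math
from enum import IntEnum


class NonFinite(IntEnum):
    # The numbering is arbitrary; it only has to be fixed and documented.
    INFINITY = 1
    NAN = 2
    NEGATIVE_INFINITY = 3  # assumed value, not from the discussion


def to_token(value: float) -> NonFinite:
    """Map a non-finite float to its pre-defined integer token."""
    if math.isnan(value):
        return NonFinite.NAN
    if math.isinf(value):
        return NonFinite.INFINITY if value > 0 else NonFinite.NEGATIVE_INFINITY
    raise ValueError(f"{value!r} is finite and has no token")


_TOKEN_TO_FLOAT = {
    NonFinite.INFINITY: math.inf,
    NonFinite.NAN: math.nan,
    NonFinite.NEGATIVE_INFINITY: -math.inf,
}


def from_token(token: int) -> float:
    """Map a stored integer back to a float; unknown integers raise ValueError."""
    return _TOKEN_TO_FLOAT[NonFinite(token)]
```

Because the set of tokens is closed, a decoder can reject anything outside it, which is the "limited and pre-defined" property described above.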