kdl-org / kdl

the kdl document language specifications
https://kdl.dev
Other
1.12k stars 62 forks source link

Add support for NaN, +Infinity, and -Infinity #374

Closed yerke closed 7 months ago

yerke commented 7 months ago

Hello,

i know this was already discussed before in #121, but is it possible to consider again adding support for NaN, +Infitnity, -Infinity?

Protobuf standard essentially has to use non-standard JSON because they needed to be able represent all possible floating values, and those 3 values are not representable at all in standard JSON.

Since you are finalizing v2, I think it’s a good time to revisit this question.

Thanks!

zkat commented 7 months ago

I guess now that we have # for keywords, we could use it for special floats, so: #infinity, #neg-infinity, and #nan.

One weird bit is that we don't really specify right now what format numbers should be in. Using ieee float terms might be a break from that, but maybe it's ok.

@tabatkins @larsgw @IceDragon200 what do y'all think?

tabatkins commented 7 months ago

I'm reasonably in favor of this. In theory a host lang might not have these concepts, but IEEE 754 floats are fundamentally ubiquitous. And I mean, in theory a host lang might not have a value matching #null either.

Meanwhile, yeah, the fact that you can't represent these values in JSON is sometimes annoying.

I think we should do it.

IceDragon200 commented 7 months ago

I see no problem with that you could also just do:

But do we also disallow these as identifiers now?

zkat commented 7 months ago

@IceDragon200 # is illegal in identifiers now, so anything with a # is disallowed unless it's one of the pre-specified keywords. We're pretty free to add to that namespace, keeping in mind that I also want to use it for KQL pseudo-classes too.

zkat commented 7 months ago

Some quick bikeshedding, since it looks like we're generally agreed about adding this. Should we do:

I'm kinda meh about having both #infinity and #+infinity, since I feel like they're kinda redundant and we should just pick one, but I'm not strongly opposed to having both.

IceDragon200 commented 7 months ago

Ah I meant more of disallowing their non-prefixed variants, sorry about the lack of words there

So basically these:

Would no longer be allowed as regular idents (since we gave true, false and null the same treatment).

That aside I didn't even thinking of shortening it, I like:

zkat commented 7 months ago

ah I see what you mean.

Yeah, I don't mind banning those too.

tabatkins commented 7 months ago

I don't think we necessarily need to exclude the ident-string variants of them. It kinda defeats the purpose of using # as a safe namespacing mechanism if we still have to deal with compat in making previously-valid ident strings illegal. The big reason we exclude true/false/null is to minimize upgrade pain from v1 and to avoid the pitfall of those being the ways to spell those values in most other data langs so it's easy to accidentally misspell.

Note as well that v1 already made it illegal to use true/false/null as a node name or property name, so there were no idents like that in the first place. But v1 docs could have nodes or properties named nan or inf. I suspect compat pain is minimal, but still.

Ultimately my opposition to banning them from ident strings is weak; if others feel it's a good idea I won't fight it. But I do want to push back against it if there's no strong argument in favor.


Fwiw, I like the names #inf and #-inf just because they're so nice and short (and I think the shortness makes the - more noticeable in the negative one). I'd be fine with #infinity and #-infinity too, I just prefer them slightly less.

Mildly opposed to adding a #+infinity; these are keywords, not numbers, so we don't need to follow all the numeric variants. Better to spell each thing in one way unless there's a great reason to allow multiple. (Like, it's okay that you can spell a string in up to three different ways; it would be bad to force all single-word strings to use ident-string syntax.)

zkat commented 7 months ago

While I agree that the main reason for the true/false/null thing is is to prevent footgunning on v1 updates, I think it's still worth banning the numeric keywords if only to prevent regular footgunning when people just happen to forget the #. I can see someone writing foo infinity and then getting very surprised that they're getting a string. idk. That does seem like a fairly rare string to be writing into documents, though, and I very much imagine libraries to fail appropriately in these cases.

zkat commented 7 months ago

Alright, this is added to v2 now.

yerke commented 7 months ago

@zkat Thanks for making the changes so quickly! I think you might want to update Full Grammar section of SPEC.md as well.

zkat commented 7 months ago

haha dang. I tried to go so fast that I forgot to do that. Fixed. :) Thanks