Closed rvagg closed 3 years ago
I went digging for our discussion notes (and merged some stale PRs) and found them here:
I've addressed the reviews so far and I think this is ready for landing. The reserved namespace rules are hard to be clear about but I've done the shuffle recommended by @vmx and separated things out a little more to hopefully make it easier to digest.
I went with the .0
rule for floats with no fractional component - although this is no help to JavaScript but Go can implement it.
I also have a working JavaScript version that I'm pretty much ready to release. It uses the same backend as the DAG-CBOR codec but with JSON semantics. We've lost some speed in the process but we get all the reserved namespace rules and all the strictness.
@warpfork care to review the changes?
As per a brief discussion today:
"bytes"
implemented @ https://github.com/ipld/go-ipld-prime/pull/166 but this was done without the multibase prefix as per the current version of the specSo either we should fix go-ipld-prime to match spec or we just bite the bullet and remove it and get it done everywhere—I can do this in JavaScript pretty quickly after merging this.
@warpfork @vmx @willscott (and anyone else) 👍 for just remove multibase prefix and 👎 for leave the spec alone darn it!.
My reasons for having the multibase prefix:
All that said, I don't have a strong opinion and can live with either way.
at first I couldn't get them making the same bytes and thought there must be a multibase incompatibility (the horror!) but it's []byte('deadbeef')
being deceptive cause that's actually a string, not a hex value in spite of the nice hex characters..
also, sorting, ipld-prime don't sort and that engages hostilities in @warpfork and is not fun for compatibility fixtures :facepalm: and I kind of don't want to have that argument over and over so will skip it for now.
But basically, it's not a big deal to do it in go-ipld-prime. It's also not a big deal to strip it in @ipld/dag-json
, but I'm also a soft -1 to changing the spec and leaving the prefix in there for reasons stated in this thread and restated by @vmx. But, it's not a big enough deal if someone else has particularly strong feelings and justification.
Arguments against inserting multibase in this bytes field, just to have them written:
fine, you appear to have a stronger opinion than either of us (although I'm not sure I'm convinced it's all that strong)
have pushed a new version that removes it and will do likewise in the JS impl
(PTAL & merge if :thumbsup:)
I've been working ok figuring out how to codify the rules that we discussed sometime last meeting .. during some meeting. I can't find the particular meeting by looking at the minutes in https://github.com/ipld/team-mgmt so I'm doing this from memory. But the main points were:
{"/":...}
namespace so that tokenizing parsers didn't have to buffer and unroll too much data but can quickly reject form without getting too deep.Remove the multibase prefix from the encoded bytes.After working through some of this, I've flipped on my opinion of the second point and would rather it stay as proper multibase, even if it's still strictly base64. It adds clarity and there's scope for using this for further validation if we want to. It's a single byte and I think it increases the clarity and reduces the number of edge cases. So I haven't included that change here, but discussion is welcome if you feel strongly!
The bit to pay most attention to below is "The Reserved Namespace" where I've tried to enumerate the rules. It turns out to be quite hard to explain it with crystal clarity, but that could be me so suggestions welcome!
Along with this, as exploration, I've been working on a version of JS DAG-JSON that does tokenized decoding and uses the same base encoder/decoder as the new JS DAG-CBOR codec, just by swapping out the backend. There's some consistency benefits here in having the same code paths and same affordances for applying data model strictness. Still uncertain whether I'll actually propose that we merge this, although I did uncover some bugs in the current implementation in the process, but the code is here: https://github.com/ipld/js-dag-json/compare/rvagg/cborg (see the innards of
DagJsonTokenizer
for the token parsing rules that arise from the reserved space rules).