Open Vap0r1ze opened 3 months ago
As I recall the intent was that those characters are invalid nostr characters, so we don't need encodings for them.
Unicode escape codes are an aberration from a distant past that should be forgotten.
As long as you're not doing anything super weird this problem won't happen and most default JSON encoders will do the right thing.
After looking into this, there's more than just 0x00-0x1F that this "problem" exists for. That section of NIP-01 is essentially trying to restate the ECMAScript spec's `QuoteJSONString` (how `JSON.stringify` handles strings) to try to ensure determinism. There are two more ranges that `QuoteJSONString` uses `\uXXXX` escapes for, but those don't matter much, since they only exist to cope with the fact that JavaScript strings don't need to be valid in any encoding.
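For anyone who wants to see the rule spelled out, here is a rough Python sketch of what `QuoteJSONString` does as I read the ECMAScript spec. The function name `quote_json_string` and the table `_SHORT_ESCAPES` are made up for illustration, and I'm assuming the "two more ranges" are the lone leading and trailing surrogates (U+D800 to U+DFFF). Python strings iterate by code point, so a surrogate that shows up here is always a lone one.

```python
# Hypothetical sketch of ECMAScript's QuoteJSONString, not an official implementation.
# Two-character escapes used for specific code points.
_SHORT_ESCAPES = {
    0x08: "\\b",
    0x09: "\\t",
    0x0A: "\\n",
    0x0C: "\\f",
    0x0D: "\\r",
    0x22: '\\"',
    0x5C: "\\\\",
}


def quote_json_string(value: str) -> str:
    """Serialize a string the way JSON.stringify is specified to."""
    out = ['"']
    for ch in value:
        cp = ord(ch)
        if cp in _SHORT_ESCAPES:
            out.append(_SHORT_ESCAPES[cp])
        elif cp < 0x20 or 0xD800 <= cp <= 0xDFFF:
            # Remaining control characters and lone surrogates get a
            # lowercase \uXXXX escape.
            out.append(f"\\u{cp:04x}")
        else:
            # Everything else is emitted verbatim.
            out.append(ch)
    out.append('"')
    return "".join(out)
```

With this, `quote_json_string("a\x01b")` gives the ten-character text `"a\u0001b"` with a literal backslash, while U+007F and other characters above 0x1F pass through unchanged.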
I think to prevent headaches for someone who decides to implement their own JSON (de)serializer, NIP-01 could:

1. `event.content` (like those inside `event.tags`)
2. `QuoteJSONString` for deterministic string serialization.

As much as I would like the ability to send raw control codes, given that terminals are very much not "a distant past", I do think that option 1 is more ideal, so that the string values are ensured to be valid UTF-8, making compliant parsing easy for both `JSON.parse` users (no encoding required) and `serde_json` users (must be valid UTF-8 since it uses `std::string::String`).
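To illustrate the UTF-8 point: a Python `str`, like a JavaScript string, can hold a lone surrogate, but such a value has no UTF-8 encoding, which is why a UTF-8-only string type cannot represent it. The helper name `is_utf8_encodable` is hypothetical; this is only a sketch of the check, not something NIP-01 prescribes.

```python
def is_utf8_encodable(value: str) -> bool:
    """Return True if the string can be encoded as UTF-8."""
    try:
        value.encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Raised for lone surrogates (U+D800 to U+DFFF).
        return False


# A raw control character is still valid UTF-8.
print(is_utf8_encodable("bell: \x07"))
# A lone surrogate is not.
print(is_utf8_encodable("\ud800"))
```

So control characters are not an encoding problem by themselves; only values like lone surrogates fall outside what UTF-8 can carry.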
Very good points. I agree.
The base protocol (NIP-01) draft currently says this:

It says "all other characters must be included verbatim", but the JSON standard (see Section 9 "String") requires that "the control characters U+0000 to U+001F" are escaped using `\uXXXX` unicode escapes.

An example of a `"content"` value that is valid in NIP-01 but invalid in JSON:

At this point it's probably not feasible to change the draft to use valid JSON, but the draft should probably mention that you must deviate from the JSON standard to produce NIP-01 compliant event IDs.
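As a rough illustration of why this matters for event IDs, the Python sketch below serializes a made-up string containing U+0001 two ways: with the standard library's `json.dumps`, which escapes the control character, and with the character included verbatim, as the quoted wording would require. The sample value and the helper `is_valid_json` are assumptions for the demo, and only the string itself is hashed, not a full event.

```python
import hashlib
import json

# Hypothetical content value containing the control character U+0001.
content = "a\x01b"

# A standards-compliant encoder escapes U+0001 as \u0001.
escaped = json.dumps(content, ensure_ascii=False)

# "All other characters must be included verbatim" leaves it raw.
verbatim = '"' + content + '"'

escaped_hash = hashlib.sha256(escaped.encode("utf-8")).hexdigest()
verbatim_hash = hashlib.sha256(verbatim.encode("utf-8")).hexdigest()


def is_valid_json(text: str) -> bool:
    """Check whether Python's strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(escaped), is_valid_json(verbatim))
print(escaped_hash == verbatim_hash)
```

The two serializations differ in their bytes, so their SHA-256 hashes differ as well, and only the escaped form is accepted by the parser.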