Feedback to rfc8927 - Githubissues

moander commented 3 years ago

Not sure where to post feedback to the spec so I try here.

I just discovered JTD and think it's a good idea to have a spec designed to produce POCO objects in any language. After trying to write a simple schema builder in Vue I have a few questions/comments to the spec.

Why optionalProperties instead of an optional bool like the nullable option?

What is the psudo code to figure out if a form is primitive or complex type? I ended up with

if (properties || optionalProperties || discriminator || elements || items) // object
if (enum || type == "string" || type == "timestamp") // string

Why no int64 / uint64?
The uint8 is nice to have for buffers/byte arrays but I think int8, int16, and uint16 should be removed from the spec.

ucarion commented 3 years ago

Thanks for the feedback! It's very much appreciated. If you find typos or what seem like errors, please do let me know. These can be issued as errata to RFC 8927. Otherwise though, RFCs are immutable documents. So to actually take action on any of these suggestions would require creating a new RFC. This is sort of out of my hands, it's IETF procedure.

On (1), this is a reasonable suggestion. Ultimately, the spec is as-is because a optional keyword would only be valid/meaningful if the parent schema is of the "properties" form. This would be an extra thing that every implementation would need to check for, and users of JTD could be confused or misled into thinking optional works in other places. For instance, users might be confused as to what this schema means:

{ "elements": { "type": "string" }, "optional": true }

To your comparison with nullable, the difference is that nullable is valid almost everywhere -- the only place it's not is in the children of mapping. But optional would be invalid almost everywhere, so it seemed to make more sense to not introduce it.

On (2), by "complex" type you mean arrays/objects? Something else? What would you call primitive, and what would you call complex?

On (4), does this section in the RFC answer your question: https://tools.ietf.org/html/rfc8927#appendix-A.1 ?

On (5), basically there was relatively little downside to including these additional primitive types. I do take your point that they are less often used. int8 is useful for describing a lot of existing Java systems that use byte, for what it's worth. Generally speaking, the stakes are somewhat low here because for most use-cases, if you're supporting one or two of the integer types, you can usually support all of them.

moander commented 3 years ago

Thanks @ucarion. My current understanding is that an optional property means that it's optional to include it in the json. Does it serve another purpose? It was the examples in the RFC that made me question the benefits of having two places to configure props.

 {
   "properties": {
     "a": { "type": "string" },
     "b": { "type": "string" }
   },
   "optionalProperties": {
     "c": { "type": "string" },
     "d": { "type": "string" }
   }
 }

VS

 {
   "properties": {
     "a": { "type": "string" },
     "b": { "type": "string" }
     "c": { "type": "string", "optional": true },
     "d": { "type": "string", "optional": true }
   }
 }

I would argue that the latter one is less confusing. The user must understand what optional means in both scenarios but the second example will require less knowledge. Properties can be made optional without affecting the error indicators and there is no need for implementors to know how to handle duplicate/conflicting keys.

The above statements is based on my understanding of the optionalProperties so it may be totally wrong if it turns out that I have misunderstood what optional really means :)

(2) In this context I consider all types that are serialized to json object or array for complex. Date, int, string, guid, null and so on are primitive.

(4) My concern is that by not having the required number types people will make up their own solutions. JTD can define these types saying it will be serialized as json strings. They can be validated by parsers using regex and code generators can map the values to supported types in C#, Java and so on. For unsupported languages the code-gen will keep the original string values.

ucarion commented 3 years ago

Thanks @ucarion. My current understanding is that an optional property means that it's optional to include it in the json. Does it serve another purpose? It was the examples in the RFC that made me question the benefits of having two places to configure props.

That understanding is correct. What I was saying is that your alternative design proposal could add confusion. Would

{ "type": "string", "optional": true }

Be a valid schema, in and of itself? If it's not, then what are the new rules for what's a valid spec or not?

 {
   "properties": {
     "a": { "type": "string" },
     "b": { "type": "string" }
   },
   "optionalProperties": {
     "c": { "type": "string" },
     "d": { "type": "string" }
   }
 }
VS
 {
   "properties": {
     "a": { "type": "string" },
     "b": { "type": "string" }
     "c": { "type": "string", "optional": true },
     "d": { "type": "string", "optional": true }
   }
 }
I would argue that the latter one is less confusing. The user must understand what optional means in both scenarios but the second example will require less knowledge. Properties can be made optional without affecting the error indicators and there is no need for implementors to know how to handle duplicate/conflicting keys.

I agree that once you accept the added complexity for all other JTD forms, the properties form in particular is easier to understand and implement, and its error indicators are a bit less all over the place. The reason this design wasn't chosen is because of how optional affects the other forms.

Your design proposal, to be clear, is very reasonable. At the end of the day, this one was a very close call.

(2) In this context I consider all types that are serialized to json object or array for complex. Date, int, string, guid, null and so on are primitive.

I'm afraid I've lost you a bit. In your original question, you asked if a form is complex or primitive. By that, I assume you meant a JTD schema form. But in your clarification, you mention dates and GUIDs, which aren't JTD forms, they're concepts in various programming languages.

I suspect we're playing a bit of cat-and-mouse with the specific words here. Let me try to rephrase your question, let me know if this is equivalent to what you originally meant:

What is the pseudo-code to tell if a JTD schema form accepts data that is an array or object, versus one that accepts other kinds of data?

If that's what you're asking, then the answer is:

Any schema with nullable: true will accept null. Whether's that's "primitive" or "complex" is up to you.
The empty form accepts both complex (array and object) data, as well as primitive data (everything else in JSON)
The ref form accepts whatever the schema it refers to accepts, so it's sort of a wildcard. It's valid to have a self-referential schema, so you sort of get a weird case when trying to figure out what kinds of data that sort of schema accepts. For example: { "definitions": { "loop": { "ref": "loop" }}} is valid. What does schema at /definitions/loop accept? There's no good answer, it's sort of undecidable.
The type and enum forms accept "primitive" data. All enum schemas, and type schemas for type string and timestamp accept JSON strings. Type schemas for type boolean accept JSON true or false. All other values of type accept JSON numbers.
All other schema forms will either accept arrays or objects.

(4) My concern is that by not having the required number types people will make up their own solutions. JTD can define these types saying it will be serialized as json strings.

Problem is, there are a few different ways you can write a number as a JSON string. For instance, will "1e10" be a valid way to write a uint64? Or are we going to say that uint64 numbers have to be written within a JSON string using a grammar that is different from the grammar from ordinary JSON numbers?

Plus, it's not reasonably easy for a JavaScript implementation to do bounds checking on uint64 values anyway, since it doesn't have numbers of sufficient precision. If it's not easy to implement the spec in JavaScript, then the spec is problematic.

The RFC's answer to sending uint64s over the wire is: if you're writing it as a string in your JSON, then just say it's a string at the JSON Type Definition schema level.

They can be validated by parsers using regex and code generators can map the values to supported types in C#, Java and so on. For unsupported languages the code-gen will keep the original string values.

It's not really possible to validate a uint64's value range with a regex. I do agree that codegen could figure this out in some, but not all, languages. It's just not quite interoperable enough to be put in the specification. JavaScript's design is the limiting factor in the JTD RFC, just as it limits the I-JSON RFC.

ucarion commented 3 years ago

One further point on adding uint64: there aren't actually that many JSON implementations out there that support encoding 64-bit numbers as strings over the JSON wire.

xli commented 2 years ago

It's not really possible to validate a uint64's value range with a regex. I do agree that codegen could figure this out in some, but not all, languages. It's just not quite interoperable enough to be put in the specification. JavaScript's design is the limiting factor in the JTD RFC, just as it limits the I-JSON RFC. One further point on adding uint64: there aren't actually that many JSON implementations out there that support encoding 64-bit numbers as strings over the JSON wire.

Regardless how to implement it, it's a pain point using JSON / JSON schema for data type uint64. In blockchain domain, uint64 and uint128 are common data types.

xli commented 2 years ago

I think a specific example is jtd split out uint8, and other int types, which provides better codegen result.

jsontypedef / json-typedef-spec

Feedback to rfc8927 #8