json-schema-org / json-schema-spec

The JSON Schema specification
http://json-schema.org/

Split `items` into `items` and `tupleItems` #864

Closed jdesrosiers closed 4 years ago

jdesrosiers commented 4 years ago

The items keyword has an object form and an array form. The object form is used for constraining the items in an array while the array form is for constraining an array that is used like a tuple. This is something that trips up people new to JSON Schema.

It also causes confusion with the additionalItems keyword. additionalItems is only defined for use with the array form of items, but I often see people throwing in "additionalItems": false when they are using the object form. I'm not sure what they think that means, but it's not uncommon to see.
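
To make the two forms concrete, here is a minimal Python sketch of how a validator might treat items and additionalItems under the rules described above. The function names `check` and `validate_array`, and the tiny keyword subset they support, are assumptions made for illustration and are not taken from any real library.

```python
def check(schema, value):
    """Validate value against a tiny schema subset: booleans and {"type": ...}."""
    if schema is True:
        return True
    if schema is False:
        return False
    expected = schema.get("type")
    if expected == "string":
        return isinstance(value, str)
    if expected == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return True  # no recognised constraint in this sketch


def validate_array(schema, instance):
    """Apply items/additionalItems the way the two-form keyword behaves."""
    items = schema.get("items", True)
    if isinstance(items, list):
        # Array form: schemas apply position by position (tuple-like).
        for sub, value in zip(items, instance):
            if not check(sub, value):
                return False
        # additionalItems only matters here, for elements past the listed positions.
        extra = schema.get("additionalItems", True)
        return all(check(extra, v) for v in instance[len(items):])
    # Object (or boolean) form: one schema for every element.
    # additionalItems is never consulted on this path.
    return all(check(items, v) for v in instance)


tuple_schema = {"items": [{"type": "string"}, {"type": "number"}], "additionalItems": False}
list_schema = {"items": {"type": "string"}, "additionalItems": False}

print(validate_array(tuple_schema, ["a", 1]))        # True
print(validate_array(tuple_schema, ["a", 1, "x"]))   # False: third element is not allowed
print(validate_array(list_schema, ["a", "b", "c"]))  # True: additionalItems has no effect
```

The last call is the confusing case: "additionalItems": false next to object-form items changes nothing.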

Proposal

Pros

Cons

handrews commented 4 years ago

This was previously discussed in #209 specifically starting at https://github.com/json-schema-org/json-schema-spec/issues/209#issuecomment-269651878

I'm obviously not personally dead-set against it as I brought it up last time, but it did not gather support.

This would also be something where I'd want input from the OpenAPI folks (@webron @darrelmiller @mikeralphson) because it impacts them as of OAS 3.1. Since OAS 3.0 doesn't allow tuple items to begin with, it would probably be good to put this into 2020-03 if we want it so that the change doesn't show up in OAS 3 at all. On the other hand, the OAS folks have been skeptical of additional changes at this stage.

handrews commented 4 years ago

Paging @darrelmiller again as I misspelled his id at first in the previous comment and I'm not sure GitHub emails people if you tag them in a comment edit

jdesrosiers commented 4 years ago

I didn't realize it had come up before. I'll read through that issue.

Sounds like an ideal change for better OAS integration. That way they don't have to redefine items, they just need to drop support for tupleItems.

darrelmiller commented 4 years ago

As someone who has never seen Items used as an array, I would not be particularly concerned about a change like this. Obviously getting this in before OAS 3.1 takes a dependency on the updated JSON Schema would be a good thing. Having said this, I'm not a heavy enough JSON Schema user to have too much weight in this conversation.

webron commented 4 years ago

While generally not a fan of creating more keywords, @handrews is correct that in this case it looks like this is not going to break anything.

I imagine that if both items and tupleItems are defined, they both need to be validated.

Not a huge fan of shoving tuple into all the new keywords as it just makes them longer, but I can't think of a better way of doing it.

awwright commented 4 years ago

I would think if we want to remain analogous to properties/additionalProperties, we should deprecate the object form of "items". This seems more straightforward to me:

Is this any less sufficient?

MikeRalphson commented 4 years ago

Possibly use tuples alone in place of tupleItems in the keywords, but otherwise I'm neutral on this. (Not dropping / deprecating the object form of items, that seems like it would break OAS 3.0 to 3.1 compatibility).

darrelmiller commented 4 years ago

@awwright I'm not sure that the symmetry of that approach outweighs breaking 90% of the current usages of items.

handrews commented 4 years ago

My preference would be to add tupleItems that behaves exactly like the array form and leave the others unchanged.

In my view, as long as we keep "Items" in tupleItems, the names of the other keywords work fine, and I prefer minimizing the thrash on those keyword names. I'm not dead set on this but that is the direction in which I would lean.

Regarding @awwright 's suggestion, it would be more symmetric but I think I'm with @darrelmiller on the fact that items with a schema is the only form of these keywords that is really widely used, and it's the only form that has long been supported in OAS. The idea here is to maybe break some obscure compatibility in JSON Schema, but not break compatibility for OAS.

@Relequestual @gregsdennis @Julian @johandorland thoughts on this?

jdesrosiers commented 4 years ago

I prefer minimizing the thrash on those keyword names

The keyword thrash is unfortunate, but if we don't change it, people are going to continue to use additionalItems incorrectly.

karenetheridge commented 4 years ago

I like the idea of splitting the keywords, but "tuple" is unintuitive. I don't want to bikeshed the wording, but what about "itemElements" for the array form?

notEthan commented 4 years ago

splitting items is good. it makes more sense to me as a schema author and implementer of schema tooling to have separate keywords for the significantly different behaviors.

adding tupleItems makes sense to me, keeping items as one schema for all array items. I do see @awwright's point that array items + schema additionalItems is a closer parallel to the handling of properties/additionalProperties. but in practice, people mostly use arrays of the same sort of item, rather than tuples, and having this more common use case on the items keyword makes more intuitive sense to me, and also will break everybody's schemas less.

for additionalItems, I'm not certain the benefit of a rename; tupleItems + additionalItems + unevaluatedItems seems sufficient to me. it's a little weird that the plain items has no interaction with any of these other Items suffixed keywords, but I think it's okay.

I'm not as fond of the itemElements name suggestion - 'items' and 'elements' seem like synonyms to me. tuple is a good description of the data structure the array form expresses. as for tuples, I'd say keeping the Items suffix helps keep it in line with additionalItems (I don't think additionalTuples makes sense; not sure what else you might call items beyond tuples).

gregsdennis commented 4 years ago

I'm with @karenetheridge in that "tuple" is unintuitive. If you don't have the history of why the array format was created, it makes no sense. I think there are use cases for this form outside of tuples, and we shouldn't pigeon-hole the keyword to a single use case.

If itemElements is unsuitable, something else should be suggested. Maybe sequencedItems.

handrews commented 4 years ago

Before we rathole too much on names, is anyone objecting to at least splitting the tuple (by whatever name) form of items into its own keyword? So far it seems impressively uncontroversial.

awwright commented 4 years ago

It seems reasonable to me to have separate keywords for this job.

The exact behavior, maybe less so, however.

It seems like we might want three different keywords:

(1) Schema to apply to every item of an instance array — that can exist concurrently with (2), and may include keywords factored out of each schema in (2)

(2) Array of schemas to apply to respective items in the instance array

(3) Schema to apply to items in the instance array beyond the end of (2)

Now, what do we name each of the walls of this bike shed? It sounds to me like English is missing names for "each item in a homogeneous array" and "respective items in a heterogeneous array" and instead we're just stuck with "items"
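
The three roles listed above can be sketched as a small lookup that reports which subschemas would apply at a given array index. This is only an illustration: `schemas_for_index` and its parameter names `every_item`, `positional`, and `beyond` are hypothetical placeholders for roles (1), (2), and (3), not proposed keyword names.

```python
def schemas_for_index(index, every_item=None, positional=(), beyond=None):
    """Return (role, subschema) pairs that would apply to the element at `index`.

    every_item -> role (1): applies to every element
    positional -> role (2): applies to the element at the same position
    beyond     -> role (3): applies only past the end of `positional`
    """
    applicable = []
    if every_item is not None:
        applicable.append(("every_item", every_item))
    if index < len(positional):
        applicable.append(("positional", positional[index]))
    elif beyond is not None:
        applicable.append(("beyond", beyond))
    return applicable


common = {"type": "object"}
by_position = [{"required": ["bar"]}, {"required": ["baz"]}]

# Index 0 is covered by roles (1) and (2); index 2 by roles (1) and (3).
print([role for role, _ in schemas_for_index(0, common, by_position, False)])
print([role for role, _ in schemas_for_index(2, common, by_position, False)])
```

Roles (2) and (3) never apply to the same element, while role (1) can coexist with either.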

handrews commented 4 years ago

fixedItems? positionItems? I actually like tupleItems but I see the reasons for not using it. It's probably just that I say "tuple-form items" a lot when talking about this so obviously it makes sense to me 😛

notEthan commented 4 years ago

I do like tupleItems still - tuple seems to me to perfectly describe a fixed-length array of dissimilar items. but it's not perfect. when used with additionalItems the described data structure isn't really a tuple any more (this falls under @gregsdennis's comment "I think there are use cases for this form outside of tuples, and we shouldn't pigeon-hole the keyword to a single use case").

I think positionalItems resonates best for me so far. (you suggested positionItems @handrews but the adjective feels more correct to me.)

Relequestual commented 4 years ago

Affirmative on this change. I'd say we go for tupleItems, additionalTupleItems, and unevaluatedTupleItems.

I may be wrong but I didn't see anyone suggest that, and it seems like the logical naming method to me.

But, I'm open to positionalItems, assuming additional/unevaluated... positionalItems also.

handrews commented 4 years ago

Why change additionalItems and unevaluatedItems? They're nicely symmetrical with additionalProperties. They're also already really long keywords.

jdesrosiers commented 4 years ago

@Relequestual

I may be wrong but I didn't see anyone suggest that

That was the original proposal, so I agree completely :wink:

I recognize that not everyone works in languages that have a concept of a tuple and therefore find it less intuitive to think in those terms. It will be a lot more intuitive to an Elixir developer than it would be to a Java developer. However, I think it's the best description of the data structure. "Positional" is not bad, but I'd rather not invent a new term for something that already has a well defined term.

handrews commented 4 years ago

@notEthan I almost wrote positionalItems instead.

Also, I think positionalItems + additionalItems == items works nicely.

I really don't want to make additionalItems and unevaluatedItems longer than they are now. I'll prefer any name for the tuple/positional form of items that makes people more comfortable leaving those names unchanged.

jdesrosiers commented 4 years ago

@handrews

Why change additionalItems and unevaluatedItems?

additionalItems only applies to the array form of items. If we split out the array form to tupleItems, then additionalItems only applies to tupleItems, not items. additionalItems sounds like it applies to items. I often see people use additionalItems with the object form of items. If we don't change additionalItems to match the keyword it works with, it makes it even more confusing.

I definitely agree that it makes those names uncomfortably long, but I think a better name is worth it for our users.

handrews commented 4 years ago

OK, let's sort this out. There are several conflicting goals here:

  1. Stop having a keyword with two forms
  2. Minimize change for existing users
  3. Maximize analogy with the object keywords
  4. Avoid the ability to write the same thing with two keywords
  5. Minimize confusion for new users

In all options, unevaluatedItems (or its renamed equivalent) is simply the sees-through-in-place-applicators version of additionalItems (or its renamed equivalent), so I'm not going to mention it further.

Currently we have the following (which is not very analogous to the object keywords):

The possible keyword behaviors are (with the current keywords listed where relevant)

I included the groups of children keywords here for completeness, but we won't discuss them further here. Nor will we discuss unevaluated* which will always be the dynamic sees-through-in-place-applicators version of additional*, under whatever names we end up with.


So let's look at options.

To be perfectly analogous to the object keywords, we'd need to both split the current items and either add an unconditional all-properties keyword (so objects match arrays), or remove the unconditional all-items keyword (so arrays match objects).

Let's look at removing the unconditional all-items keyword, as it produces a minimal set of keywords with no duplicate behaviors:

This handles goals 1, 3, 4, and arguably 5. If I were starting JSON Schema over today, this is what I would do. But, we're not starting JSON Schema over, and this violates goal 2 (minimize change for existing users). The change for existing users would impact the vast majority of array schemas, and require OAS to go to OAS 4 rather than OAS 3.1 for compatibility reasons.

We could come closer to meeting goal 2 at the expense of goal 5 (minimize confusion for new users) by renaming additionalItems to items:

This retains enough compatibility for OAS 3.1, as they do not use additionalItems or the tuple form of items at all. But the asymmetry with objects is pretty glaring.

We could try to rationalize the whole thing into a minimal set of consistent, reasonably named keywords:

but that would break what is probably the most commonly used applicator keyword in all of JSON Schema, properties, in a horribly confusing way for existing users. And require OAS 4. Or possibly OAS 400. Is there a Semantic Versioning guideline for "we reversed your keywords"?

We could give up on goal 4 (no apparent duplicate keywords; note that using additionalItems on its own doesn't actually do anything, so in practice they are not really duplicates):

This is the absolute minimal impact to existing users (goal 2), and is reasonably decent for new users (goal 5), except that violating goal 4 does leave some confusion around for new users because the apparent duplication is weird.


Nothing will be ideal. NOTHING. We are a project with a long history and a large user base, and that limits what we can do. The fact that we're still nominally a "draft" ignores the reality of the deployment of JSON Schema-based solutions. Pissing off our user base throws away the most valuable asset of the project: the concrete evidence that it is actually useful, as demonstrated by people using it.

My preference would be:

In this approach:

This definitely solves goals 1 and 4, and strikes a reasonable balance between 2 and 5. It mostly gives up on goal 3 (analogy with objects), although it does improve that situation in one way as there is no longer an "extra" array keyword compared to the object keywords. It makes it worse in another way, by breaking the additionalProperties/additionalItems symmetry.

In an ideal world, we'd solve all five goals, but as noted earlier, we are not redesigning JSON Schema from the ground up. We have legacy commitments, and I think the right thing to do is to take the legacy situation into account, work on our documentation and education to mitigate it, and solve all of the other problems as best we can.

Note that in this proposal, the name prefixItems is critically important. Neither tupleItems nor positionalItems provides the necessary implications that items will apply to the items beyond the last position. Only prefixItems does that, which is the only way this becomes reasonably intuitive.
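
As a rough sketch of the two-keyword behavior described in this comment, the Python below treats prefixItems as positional schemas and items as the schema for everything after the prefix. The helper names `matches` and `failing_indices` and the minimal type checks are assumptions for illustration only.

```python
def matches(schema, value):
    """Tiny stand-in for subschema validation: booleans and {"type": ...} only."""
    if isinstance(schema, bool):
        return schema
    expected = schema.get("type")
    if expected == "string":
        return isinstance(value, str)
    if expected == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return True


def failing_indices(schema, instance):
    """Return the indices of elements that fail under prefixItems + items."""
    prefix = schema.get("prefixItems", [])
    rest = schema.get("items", True)
    failures = []
    for index, value in enumerate(instance):
        # Elements inside the prefix use their positional schema;
        # everything after the prefix uses `items`.
        sub = prefix[index] if index < len(prefix) else rest
        if not matches(sub, value):
            failures.append(index)
    return failures


closed_tuple = {"prefixItems": [{"type": "string"}, {"type": "number"}], "items": False}
header_then_rows = {"prefixItems": [{"type": "string"}], "items": {"type": "number"}}
plain_list = {"items": {"type": "string"}}

print(failing_indices(closed_tuple, ["a", 1]))               # []
print(failing_indices(closed_tuple, ["a", 1, "x"]))          # [2]
print(failing_indices(header_then_rows, ["h", 1, 2, "no"]))  # [3]
print(failing_indices(plain_list, ["a", 2]))                 # [1]
```

When prefixItems is absent the prefix is empty, so a schema that only uses items behaves exactly as schema-form items does today.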

awwright commented 4 years ago

  • Stop having a keyword with two forms
  • Minimize change for existing users
  • Maximize analogy with the object keywords
  • Avoid the ability to write the same thing with two keywords
  • Minimize confusion for new users

Agree on every front.

Another idea: we add "itemValues" and "propertyValues" — these cover the values of every item, whether they're also specified in "items" and "properties".

Then, we make "items" indexed-only (array form only), the same way "properties" is object-only (except as necessary for reverse compatibility—most implementations will want to preserve reverse compatibility, of course). Alternatively, we say something like "As an authoring convenience and for historical reasons, if neither additionalItems nor itemValues is being used, items may be used with an object as a shorthand for itemValues".

Optionally, we permit "additionalItems" to work even when "items" is absent—the same way "additionalProperties" works without "properties".

And note, this is symmetrical with "propertyNames" — this covers every property's value and items' value, the same way "propertyNames" covers every property's name.

And it makes it symmetrical with properties/additionalProperties.

afaict, this is a win all around.

Relequestual commented 4 years ago

I feel @handrews suggestion of prefixItems makes the most sense.

@awwright are you suggesting we have prefixItems, items, and additionalItems?

As far as I can work out, @handrews proposal with prefixItems includes removing additionalItems, given that items with an array value is now prefixItems and additionalItems now effectively becomes items but limited to an object value.

Am I reading this right?

jdesrosiers commented 4 years ago

I thought it might help to see an example of each proposal in one place. I've associated a reaction with each proposal if people want to show their support (this is not a vote). If there are other proposals I should add, let me know and I'll add them.

  1. Tuple @jdesrosiers :tada:
    {
      "type": "array",
      "tupleItems": [{ "type": "string" }, { "type": "number" }],
      "additionalTupleItems": false
    }
  2. Positional @notEthan :heart:
    {
      "type": "array",
      "positionalItems": [{ "type": "string" }, { "type": "number" }],
      "additionalItems": false
    }
  3. Prefix @handrews :rocket:
    {
      "type": "array",
      "prefixItems": [{ "type": "string" }, { "type": "number" }],
      "items": false
    }
  4. Values @awwright :eyes:
    {
      "type": "array",
      "items": [{ "type": "string" }, { "type": "number" }],
      "additionalItems": false
    }

jdesrosiers commented 4 years ago

The attempts at symmetry are quite clever, and I find that very tempting. But, ultimately, it feels a little forced. I still think the "Tuples" proposal is the most intuitive and has the least impact on existing schema developers.

With "Positional", I'm against not modifying additionalItems to match positionalItems. The name "positional" is not bad, but tuple is better. It's best not to invent a new term for something that already has an accepted term.

"Prefix" is my second favorite. It's a very clever way to get something close to symmetry. But, ultimately, I think it's less intuitive than "Tuple" and I think that outweighs the desire for symmetry. It also changes the behavior of items which could cause confusion.

Despite also being very clever, I can't support "Values". I think allowing items to have different forms depending on what other keywords are present is not a good idea. I'm also not a fan of items taking the array form. It's probably fair to say that more than 99% of the time items is used with the object form. If we split tupleItems out of items, less than one percent of schemas would be affected, but splitting itemValues out of items means almost every place items is used would have to change.

gregsdennis commented 4 years ago

The name "positional" is not bad, but tuple is better.

My argument against this is that tuples aren't the only use case. It may have started that way, but the usage has evolved, like using JSON Schema not only for validation but for form generation.

awwright commented 4 years ago

Despite also being very clever, I can't support "Values".

I get the feeling it's not a bad idea per se; it just doesn't solve this issue well.

It may still be worth looking into in addition, I think.

I think allowing items to have different forms depending on what other keywords are present is not a good idea. I'm also not a fan of items taking the array form. It's probably fair to say that more than 99% of the time items is used with the object form. If we split tupleItems out of items, less than one percent of schemas would be affected, but splitting itemValues out of items means almost every place items is used would have to change.

I was also going to suggest something like

As an authoring convenience and for historical reasons, "items" may take an object as a shorthand for "itemValues" if neither "itemValues" nor "additionalItems" are specified.

As for array-form "items", let me throw more ideas out:

handrews commented 4 years ago

@Relequestual yes, you are correct.

@jdesrosiers

"Prefix" is my second favorite. It's a very clever way to get something close to symmetry. But, ultimately, I think it's less intuitive than "Tuple" and I think that outweighs the desire for symmetry.

prefixItems is not about symmetry, it's about naming it around the behavior. prefixItems provides schemas for a prefix of the items in the array, after which items is used. It's not symmetrical with objects at all. Given that others such as @gregsdennis don't find tupleItems intuitive, I'm afraid your assertion that that is the best name is not clearly supported.

It also changes the behavior of items which could cause confusion.

Sort-of, but it doesn't change the behavior of items in the case where it is a single schema and prefixItems is absent. Since prefixItems doesn't currently exist, and you can't have both forms of items in the same JSON Schema object anyway, that means that all existing uses of items behave exactly the same as they always did.

As for the other keywords, to migrate a schema, just rename array-form items to prefixItems and additionalItems to items. If additionalItems was present alongside schema-form items, it is ignored, so in that case additionalItems should just be dropped.
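
The migration rule just described can be sketched as a small Python function. The name `migrate_array_keywords` is hypothetical, and the sketch only rewrites the top level of a schema; a real migration would also need to walk nested subschemas.

```python
def migrate_array_keywords(schema):
    """Return a copy of `schema` rewritten for the prefixItems/items proposal.

    Only the top level is handled; nested subschemas are left untouched.
    """
    migrated = dict(schema)
    old_items = migrated.pop("items", None)
    additional = migrated.pop("additionalItems", None)
    if isinstance(old_items, list):
        # Array-form items becomes prefixItems, additionalItems becomes items.
        migrated["prefixItems"] = old_items
        if additional is not None:
            migrated["items"] = additional
    elif old_items is not None:
        # Schema-form items is kept as-is; any additionalItems was ignored
        # before, so it is simply dropped.
        migrated["items"] = old_items
    # With no items at all, additionalItems was also ignored and is dropped.
    return migrated


old_tuple = {
    "type": "array",
    "items": [{"type": "string"}, {"type": "number"}],
    "additionalItems": False,
}
old_list = {"items": {"type": "string"}, "additionalItems": False}

print(migrate_array_keywords(old_tuple))
print(migrate_array_keywords(old_list))  # {'items': {'type': 'string'}}
```

The function returns a new dictionary and leaves the input schema unmodified.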

@awwright if we were starting from scratch, what you suggest with *Values would make lots of sense. But if I understand your proposal correctly it would be:

This provides three different possible ways to provide a schema for all items in an array, and leaves items with two different forms. It's symmetric, but with all of the compatibility things it's even more confusing than what we have now. Let me know if I'm getting this wrong, but I don't see how this really solves the problem, which is to simplify things.


Let me emphasize that we have consensus on only two things:

  1. items should keep its more common form, and have the other form split out
  2. having overlapping behavior between items and additionalItems is undesirable, and our current workaround of declaring that additionalItems is ignored when schema-form items is present is unsatisfactory

My proposal is the only one that preserves the existing behavior of schema-form items and eliminates additionalItems. Can we agree that that is the best option, regardless of the name of the array-form keyword? If so, then we can keep arguing over the name for a bit, but I'd like to get the number and behavior of the keywords nailed down first.

awwright commented 4 years ago

@handrews Are we trying to simplify things, or not break things?


I think we mostly agree on the behavior. The only behavior differences are related to naming to maximize reverse compatibility (like my suggestion to allow an "authoring convenience" form of "items").

Here's my new suggestion:

Note we also relax the limitation that "additionalItems" is ignored if "items" is missing, because that doesn't make any sense anymore.

handrews commented 4 years ago

@awwright we're trying to simplify things and not break broadly used things

I don't see much need to preserve additionalItems as it is for two reasons:

  1. It's only possible to use it with tuple items, so every place that it is used will already need to be changed to rename the tuple-form items to whatever we rename it to. Once you're changing one keyword, changing two in the exact same object is not significantly more disruptive.
  2. tuple-form items and additionalItems are rarely used in the first place, and have always been forbidden in OpenAPI in particular. That's why we think it's OK to do any of this at all.

awwright commented 4 years ago

@handrews Sure, but we also have to convince users: "The array form of items is now called sequenceItems" —done, that's it, that's all there is. If it was much different I might think it's a new feature.

gregsdennis commented 4 years ago

You know, watching you two argue for the same thing is quite amusing.

You both want three keywords:

The only difference is that @handrews asserted that somethingItems should be processed before items, to which no one said anything. Honestly I don't think it makes a difference since items will hit all elements anyway.

It seems to me that we only need to agree on a name. My votes are for sequence or index/indexed.

Symmetry with properties isn't important to me.

handrews commented 4 years ago

@gregsdennis no, my proposal only needs two keywords, prefixItems and items. There is no longer need for additionalItems because items fills its role as well as the current schema-form items.

The only difference is that @handrews asserted that somethingItems should be processed before items, to which no one said anything.

That's because items takes on the behavior of additionalItems, which means we no longer have two keywords that appear to be able to do the same thing. We don't need the extra keyword.

gregsdennis commented 4 years ago

I still think there's a use case for having somethingItems alongside items where items validates all elements.

If I want all items to have a foo property, but the first two have additional requirements, then I could do this:

{
  "somethingItems": [
    {
      "properties": {
        "foo": { ... },
        "bar": { ... }
      }
    },
    {
      "properties": {
        "foo": { ... },
        "baz": { ... }
      }
    }
  ],
  "additionalItems": {
    "properties": {
      "foo": { ... }
    }
  }
}

or I could do this:

{
  "somethingItems": [
    {
      "properties": {
        "bar": { ... }
      }
    },
    {
      "properties": {
        "baz": { ... }
      }
    }
  ],
  "items": {
    "properties": {
      "foo": { ... }
    }
  }
}

I think the second is simpler. If we change items to work like additionalItems currently does, then the second isn't possible, and authors would be forced to do the first ($ref can help with the duplication, but doesn't really reduce the complexity).

Additionally, that's how schema-form items works currently.


The other option that I haven't seen here is treating this like we did dependencies. We deprecate items and go with somethingItems/additionalItems and allItems.

  1. The new keywords are explicit in what they do
  2. items remains in its current state, supported but deprecated
  3. Stuff like the above is supported.

gregsdennis commented 4 years ago

As a side note, this example also illustrates non-tuple uses for array-form items as well as some small degree of polymorphism in the elements.

handrews commented 4 years ago

@gregsdennis Or you could do:

{
  "somethingItems": [
    {
      "properties": {
        "bar": { ... }
      }
    },
    {
      "properties": {
        "baz": { ... }
      }
    }
  ],
  "allOf": [
    {
      "items": {
        "properties": {
          "foo": { ... }
        }
      }
    }
  ]
}

Which is exactly analogous to what you'd do today:

{
  "items": [
    {
      "properties": {
        "bar": { ... }
      }
    },
    {
      "properties": {
        "baz": { ... }
      }
    }
  ],
  "allOf": [
    {
      "items": {
        "properties": {
          "foo": { ... }
        }
      }
    }
  ]
}

or what you would do for objects:

{
  "properties": {
    "br": {
      "properties": {
        "bar": { ... }
      }
    },
    "bz": {
      "properties": {
        "baz": { ... }
      }
    }
  },
  "allOf": [
    {
      "additionalProperties": {
        "properties": {
          "foo": { ... }
        }
      }
    }
  ]
}
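
A small Python sketch can show why the allOf wrapping matters under the two-keyword proposal. The `valid` function and its keyword subset are assumptions for illustration, and `required` stands in for the elided property schemas above so that the difference is observable.

```python
def valid(schema, instance):
    """Validate against a tiny subset: booleans, required, prefixItems, items, allOf."""
    if isinstance(schema, bool):
        return schema
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
    if isinstance(instance, list):
        prefix = schema.get("prefixItems", [])
        rest = schema.get("items", True)
        for index, value in enumerate(instance):
            sub = prefix[index] if index < len(prefix) else rest
            if not valid(sub, value):
                return False
    # Each allOf branch is evaluated as its own schema object, so an `items`
    # inside a branch has no sibling prefixItems and applies to every element.
    return all(valid(branch, instance) for branch in schema.get("allOf", []))


prefix = [{"required": ["bar"]}, {"required": ["baz"]}]
wrapped = {"prefixItems": prefix, "allOf": [{"items": {"required": ["foo"]}}]}
sibling = {"prefixItems": prefix, "items": {"required": ["foo"]}}

missing_foo = [{"bar": 2}, {"foo": 1, "baz": 3}]

print(valid(wrapped, missing_foo))  # False: the first element lacks foo
print(valid(sibling, missing_foo))  # True: sibling items only covers elements after the prefix
```

In this sketch the sibling form never checks foo on the first two elements, which is the behavior difference being debated in this exchange.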
notEthan commented 4 years ago

prefixItems / additionalItems

I do quite like the simplicity of @handrews' 2-keyword proposal. I would be behind it, but the naming throws me way off. it's all right when there's something in items, and the fixed-length prefix is actually a prefix.

{
  "prefixItems": [{"title": "header row"}],
  "items": {"title": "data row"}
}

but for a tuple

{
  "prefixItems": [{"title": "one"}, {"title": "two"}],
  "items": false
}

that doesn't read right to me. there's two items which are a prefix for ... ? and there are no items. it doesn't make sense to read, to me, without the rest of the items being called additionalItems. but then we've left items behind and broken everything.

so I'm still in favor of items being all items, and (positional|fixed|tuple|prefix|corresponding)Items + additionalItems also being all items.

handrews commented 4 years ago

@notEthan there are a lot of things that don't make perfect intuitive sense in JSON Schema. People find the fact that both items and additionalItems exist confusing. We can't make everyone happy. "items": false is not such a huge use case, and is a hell of a lot easier to explain than "additionalItems is ignored when prefixItems is not present".

We're not going to make everyone happy.

@jdesrosiers has not been able to get consensus that tupleItems is broadly preferable, and there is an actual rationale behind prefixItems that helps intuition more than the other names, so I'm still sticking with it.

@Relequestual can you make a call here? I spoke with @webron and he's in favor of the 2-keyword items + prefixItems (we can give him until Monday to comment if he has changed his mind :-)

awwright commented 4 years ago

Ok, so to summarize the two different behaviors we're considering:

My proposed solution is to split the keywords into three (what I understand the original issue to be calling for), so we have one keyword for the object form of "items", and two keywords for tuples: the array form of "items" and "additionalItems". i.e. the 3-keyword solution.

It seems the other solution is the 2-keyword solution to re-combine the behaviors of the existing two keywords, so that additionalItems always works (named as "items" which is how "items" works right now anyways), that optionally may be "prefixed" by an array of schemas. (This seems more in line with how "properties"/"additionalProperties" works now.)

Am I summarizing this right?

handrews commented 4 years ago

@awwright yes, that is essentially correct. The 3 vs 2 aspect did appear in the original comment of this issue in the form of:

but I often see people throwing in "additionalItems": false when they are using the object form. I'm not sure what they think that means, but it's not uncommon to see.

observing that the overlapping nature of object-items and additionalItems is confusing. I don't recall if it's mentioned in this issue, but somewhere it's also come up more than once that people try to just use additionalItems when they mean object-form items, which is why there's that language in the spec that additionalItems is outright ignored if items is not present, or if it is present and an object. It's also been noted that implementations don't all get that right.

It's objectively unnecessary to have three keywords for this; the only question is whether it's worth preserving 3 vs 2 because historically items behaved like two different keywords, so we've kind of implicitly had 3.

jdesrosiers commented 4 years ago

@handrews

@jdesrosiers has not been able to get consensus that tupleItems is broadly preferable

True, but why single out my proposal? This is clearly true of ALL alternate proposals including yours.

, and there is an actual rationale behind prefixItems

This is insulting and disingenuous. All of the proposals given have a well considered rationale. Just because you like yours better doesn't mean the others are not equally well thought out. It's disrespectful and frankly bullying behavior to suggest otherwise.

that helps intuition more than the other names

That's your opinion. You've had multiple people comment that prefixItems isn't as intuitive as you think it is. I can only hope that you take those comments as seriously as similar comments about other proposals.

, so I'm still sticking with it.

That's fine. As per usual, I'm offering my opinions as: take it, leave it, modify it, it's up to you. (Edit: I removed a comment I feel was out of line. Sorry)

it's also come up more than once that people try to just use additionalItems when they mean object-form items

I've been following the jsonshema tag on StackOverflow for the last five years and have been the top answerer for years, so I like to think I have a pretty good understanding of what noobs are getting wrong. I've never seen anyone make this mistake. If I've never encountered this, I don't think this is a problem that needs solving. I honestly don't see why it's a problem anyway. Lots of keywords in JSON Schema have overlapping functionality including enum/const, patternProperties/additionalProperties, and many other examples.

The problem I do see frequently with noobs is using additionalProperties with object-form-items. Even the spec team gets tripped up by this one sometimes. That's the problem that needs solving.

gregsdennis commented 4 years ago

Lots of keywords in JSON Schema have overlapping functionality

The overlapping functionality isn't a concern so much as having two forms of items that function differently.

Personally, I prefer sequenceItems.

"Sequence" suggests the functionality of the keyword better and in a more general way.

"Tuple" refers specifically to the original use case of the array form, but there are other use cases.

While "prefix" suggests "these come before the rest," it doesn't convey the idea that they are ordered in any way.

jdesrosiers commented 4 years ago

The overlapping functionality isn't a concern so much as having two forms of items that function differently.

Agreed. I often see people confused because they chose the wrong form. I forgot to mention that problem when I wrote up the issue.

I won't argue for "tuple" anymore. Clearly no one likes that term. I can live with "sequence" or "positional". But, I'm still a fan of having separate all-items and additional-something-items keywords. There's some overlap in functionality, but I'd rather have that focused keyword that I know won't be influenced by another keyword. I could get what I want by wrapping the additionalItems (of whatever name) in an allOf, but I shouldn't have to jump through such hoops to get that safety.

handrews commented 4 years ago

@jdesrosiers that really was not meant in as hostile of a way as you took it, I'm sorry that I didn't calibrate it better.

Why single yours out? Because you keep advocating for it while no one else seems to have particularly strong opinions. I did not mean to imply that you had no real thought behind it or anything of that sort. A better way would have been for me to point out that prefixItems is intended to provide an intuition for the role it and items play with respect to each other, while all of the other proposals are about what kind of thing is described (is it a tuple, or a sequence, or whatever).

I think it is important to convey usage rather than data types, especially because there is not a consensus on the data type that is most desirable (neither you nor @gregsdennis seem likely to budge on this). positionalItems is a little more functional, but does not help explain the relationship to items, really. prefixItems does.

@gregsdennis there are no words that will perfectly express everything. The fact that the value is a list and that the keyword after all will be documented should cover the ordered-ness of it.

At this point it looks like (based on comments and on offlist conversations), here is the current support. Note that this is not intended to imply a vote or that there will be a majority-rule vote. OpenAPI's opinion carries additional weight, for one thing. But this is just to summarize what seems to be the two viable options:

I know that @Relequestual and @webron agree with my rationale on the 2-keyword proposal. I still feel like the 2-keyword version is the least ambiguous, and am bothered that the 3-keyword proposal leaves the confusing items vs additionalItems problem in place, but let's give everyone one more chance to weigh in. I will attempt to get @Relequestual and @webron to actually comment again instead of just talking to me on slack. AHEM.

jdesrosiers commented 4 years ago

@handrews Thanks for your clarifications. This feels like a more productive discussion already. I'll write up a final summary of my thoughts so they are all in one place then I'll leave it up to the spec team to make a decision.

OpenAPI's opinion carries additional weight

I totally get why that is, but in this case I would expect the opposite because this change is entirely limited to functionality that OpenAPI explicitly doesn't support. It doesn't affect them at all.

handrews commented 4 years ago

@jdesrosiers

in this case I would expect the opposite because this change is entirely limited to functionality that OpenAPI explicitly doesn't support. It doesn't affect them at all.

Of course it does. JSON Schema questions often surface there, and the question of "is there a confusing kinda-sorta-overlap between items and additionalItems or is there just one keyword covering that entire set of functionality" will be something they have to deal with.

To me, it is unquestionable that an ideally named two-keyword solution is superior to an ideally named three-keyword solution. I'm not sure which is worse: when one keyword actually makes another superfluous (the case if we don't carve a weird exception out for additionalItems, which was how things were at some point in the past) or when it appears to make it superfluous but in fact doesn't work at all (which is how things are right now). Clearly, avoiding the overlap altogether is better.

However, no set of keyword names is perfect. We must choose which imperfection we can best work with, and OpenAPI spec and tooling people (as well as JSON Schema implementors outside of the OpenAPI ecosystem) will be impacted by our choice.
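For comparison, a rough Python sketch of how the two-keyword approach could behave. It assumes prefixItems covers the leading positions and items covers everything after the prefix. The helper names `check` and `validate_two_keyword` are hypothetical, the `type` handling is deliberately minimal, and none of this is settled spec text.

```python
def check(schema, value):
    """Tiny stand-in for subschema validation: boolean schemas and "type" only."""
    if isinstance(schema, bool):
        return schema
    expected = schema.get("type")
    if expected == "string":
        return isinstance(value, str)
    if expected == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return True  # nothing this sketch knows how to check


def validate_two_keyword(schema, instance):
    """Assumed semantics: prefixItems by position, items for the rest."""
    prefix = schema.get("prefixItems", [])
    rest = schema.get("items", True)

    # Leading elements are checked position by position.
    for sub, value in zip(prefix, instance):
        if not check(sub, value):
            return False

    # items always applies to whatever follows the prefix, so with no
    # prefixItems it simply applies to every element.
    return all(check(rest, v) for v in instance[len(prefix):])


closed_tuple = {
    "prefixItems": [{"type": "string"}, {"type": "number"}],
    "items": False,
}
uniform = {"items": {"type": "string"}}

print(validate_two_keyword(closed_tuple, ["a", 1]))     # True
print(validate_two_keyword(closed_tuple, ["a", 1, 2]))  # False: nothing allowed after the prefix
print(validate_two_keyword(uniform, ["a", "b"]))        # True
print(validate_two_keyword(uniform, ["a", 1]))          # False
```

Under these assumed semantics there is no keyword that can be silently ignored: items means "everything after the prefix" whether or not a prefix is present.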

karenetheridge commented 4 years ago

The use of the word "sequence" in the array form of items is good. Grammatically, sequentialItems is better there.

As for additionalItems being erroneously (or at least nonsensically) combined with items (the object form), why not simply make it illegal? It is possible, with some combination of not and if/then/else, to outright forbid using items together with additionalItems. At the very least, there are other combinations of schema constructs which make no sense, so either they should all be illegal, or none of them are and their combined use is simply ignored.

(Or perhaps that would be better relegated to a linter tool, and/or a "strict" variant of the schema specification, e.g. where additionalProperties is false.)
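As a sketch of what the linter option might look like, here is a small Python function that warns when additionalItems appears without array-form items. The function name `lint_additional_items` and the warning text are invented for illustration. The walk is intentionally naive: it treats every nested object as a schema, so it would also flag a property that happens to be named additionalItems, or objects inside enum, const, or default.

```python
def lint_additional_items(schema, path="#"):
    """Collect warnings for additionalItems used without array-form items."""
    warnings = []
    if isinstance(schema, dict):
        has_tuple_items = isinstance(schema.get("items"), list)
        if "additionalItems" in schema and not has_tuple_items:
            warnings.append(
                f"{path}: additionalItems has no effect without array-form items"
            )
        # Naive recursion: every nested value is treated as a possible schema.
        for key, value in schema.items():
            warnings.extend(lint_additional_items(value, f"{path}/{key}"))
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            warnings.extend(lint_additional_items(value, f"{path}/{index}"))
    return warnings


example = {
    "properties": {
        "tags": {"items": {"type": "string"}, "additionalItems": False},
        "pair": {"items": [{"type": "string"}], "additionalItems": False},
    }
}

for warning in lint_additional_items(example):
    print(warning)
# #/properties/tags: additionalItems has no effect without array-form items
```

A check like this keeps the combination legal in the schema language itself while still surfacing it to authors who want the stricter behavior.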

handrews commented 4 years ago

@karenetheridge we avoid making keyword combinations illegal. They may be nonsensical, but you can write them without causing errors. We do not want to mandate that all schemas be validated against their meta-schema, nor do we want to require implementations to maintain a list of error conditions to check manually. Also, there are endless possible nonsensical combinations, and trying to enforce them all in the meta-schema would very quickly get out of hand.