json-schema-org / json-schema-spec

The JSON Schema specification
http://json-schema.org/

Localization, Multilingual annotations (including enum), localized schema $id and their relation to $ref and $merge preprocessing #278

Closed ruifortes closed 6 years ago

ruifortes commented 7 years ago

Localization, Multilingual annotations (including enum) and their relation to $ref and $merge preprocessing

Please bear with me here because I'm still kinda lurking around this topic.

About schema localization

Since JSON Schema already has title and description, I think there should be a localized version of every schema. Maybe its $id URI would just have a lang or locale variable, like {$id: 'http://example.com/someschema.json?lang=pt'}, or even be multilingual: ...?lang=en&lang=pt or, even better, ...?lang=en,pt.
Of course, multilingual annotation values would be Language Tags.
So basically I'm proposing that schemas be fetched already localized from the server, so no extra ui-schema or translation object would be needed, since titles and descriptions should already be there to help understand the schemas. Also, I think their text content should read as well in a client UI as it does for developers reading the schemas.

About $ref templating

I think $ref templating should be the dereferencer's business. JSON Schema shouldn't even have to deal with $ref; it should only be regarded as a preprocessing step before validation. URI templates in links should only deal with instance data. External data could be passed to $ref in preprocessing, but after that all $refs would be variable-free.
Is there any use case for $ref templates or links that use data outside the instance?

Using $merge preprocessing as a localization mechanism

My current method for translation is to use a separate folder for each language. Each folder contains files with the same names as the original schema files, holding only titles and descriptions in the same relative positions, so they can simply be merged into the original schema. To facilitate things, I have a special folder that contains the entry point for each localized schema (and sub-schema), which merges it with the original. These files use $ref templates to reference the appropriate language translation files. This way the original schema doesn't have to deal with referencing language annotations.

{
    $merge: {
        source: {$ref: '{baseFolder}/someSchema.json'},
        input: {$ref: '{baseFolder}/translations/{lang}/someSchema.json'}
    }
}
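As a rough sketch of the preprocessing step this entry point relies on, the placeholders could be expanded before any dereferencing happens. The helper name expandRefTemplates and the variable values below are made up for illustration; nothing in JSON Schema defines this behaviour.

```javascript
// Hypothetical helper: replaces {name} placeholders in every $ref string.
function expandRefTemplates(node, vars) {
  if (Array.isArray(node)) {
    return node.map((item) => expandRefTemplates(item, vars));
  }
  if (node !== null && typeof node === "object") {
    const out = {};
    for (const [key, value] of Object.entries(node)) {
      if (key === "$ref" && typeof value === "string") {
        // Unknown placeholders are left untouched rather than guessed.
        out[key] = value.replace(/\{(\w+)\}/g, (match, name) =>
          Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match
        );
      } else {
        out[key] = expandRefTemplates(value, vars);
      }
    }
    return out;
  }
  return node;
}

const entryPoint = {
  $merge: {
    source: { $ref: "{baseFolder}/someSchema.json" },
    input: { $ref: "{baseFolder}/translations/{lang}/someSchema.json" },
  },
};

// Assumed values for the two template variables.
const expanded = expandRefTemplates(entryPoint, { baseFolder: "schemas", lang: "pt" });
console.log(JSON.stringify(expanded, null, 2));
```

After this step every $ref is a plain, variable-free reference, which is what the dereferencer would then resolve.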

Translations would also need to include $refs to other translations in the same place as the original schema and sub-schemas.

If the original schema is something like:

{
  $id: 'www.example.com/baseSchema.json',
  title: 'This is an example base schema',
  properties: {
    prop1: {type: 'string', title: 'prop1', description: 'This is the number one property'},
    sub1: {$ref: '#/definitions/subschema1'},
    sub2: {$ref: '#/definitions/subschema2'}
  },
  definitions: {
    subschema1: {$ref: 'subschema1.json'},
    subschema2: {$ref: 'subschema2.json'},
  }
}

the translation file would be:

{
  $id: 'www.example.com/baseSchema.json?lang=pt',
  title: 'Este é um exemplo de um esquema base',
  properties: {
    prop1: {description: 'Esta é a propriedade numero 1'}
  },
  definitions: {
    subschema1: {$ref: 'subschema1.json'},
    // subschema2: {$ref: 'subschema2.json'},  // in JSON5 you could comment this out if there is no translation for subschema2 yet
  }
}

The $merge would have to be performed after external refs are expanded. Of course, dereferencing on the server adds to the message payload, but this is a general problem that should be solved by sending a cache with the original un-dereferenced schemas and doing the dereferencing and merging in the client.
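To make the merge step concrete, here is a minimal sketch of a JSON Merge Patch in the style of RFC 7386, assuming $merge would use those semantics. The function name mergePatch and the trimmed-down schemas are only for illustration.

```javascript
// JSON Merge Patch following the algorithm in RFC 7386, section 2.
function mergePatch(target, patch) {
  const isObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);
  if (!isObject(patch)) return patch; // a non-object patch replaces the target
  const result = isObject(target) ? { ...target } : {};
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) {
      delete result[key]; // null removes the member
    } else {
      result[key] = mergePatch(result[key], value);
    }
  }
  return result;
}

// Trimmed-down versions of the base schema and its Portuguese translation.
const base = {
  title: "This is an example base schema",
  properties: {
    prop1: { type: "string", title: "prop1", description: "This is the number one property" },
  },
};
const pt = {
  title: "Este é um exemplo de um esquema base",
  properties: { prop1: { description: "Esta é a propriedade numero 1" } },
};

const localized = mergePatch(base, pt);
console.log(JSON.stringify(localized, null, 2));
```

The translation only overrides the annotations it contains; type and the untranslated title of prop1 come through from the base schema unchanged.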

Multilingual support would require more than just variable substitution in $ref. It would require the loader to retrieve each translation file, wrap all annotation text in a language tag, merge them together, and then merge the result with the schema.

Also, in a single-language schema it would be trivial to wrap the base schema annotations in a language tag, so that when the required language annotation is missing, the value would be a language tag instead of a string in the original language. This way it would be easy to identify items that are not yet translated.
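A possible sketch of the wrapping step follows. It assumes a "language tag" value is an object keyed by the language code, such as {en: '...'}; that shape, the helper name tagAnnotations, and the choice of title and description as the only annotations are all assumptions made here for illustration.

```javascript
// Assumption: only these two keywords carry translatable text.
const ANNOTATIONS = new Set(["title", "description"]);

// Hypothetical helper: wraps every string-valued annotation as { [lang]: text }.
function tagAnnotations(node, lang) {
  if (Array.isArray(node)) return node.map((item) => tagAnnotations(item, lang));
  if (node === null || typeof node !== "object") return node;
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    out[key] =
      ANNOTATIONS.has(key) && typeof value === "string"
        ? { [lang]: value }
        : tagAnnotations(value, lang);
  }
  return out;
}

const tagged = tagAnnotations(
  {
    title: "This is an example base schema",
    properties: { prop1: { type: "string", description: "This is the number one property" } },
  },
  "en"
);
console.log(JSON.stringify(tagged, null, 2));
```

If a single-language translation with plain strings is then merged over the tagged base, any annotation that is still an object rather than a string is one that has not been translated yet.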

About named enumerations (#57)

I think any item in the enum array that is an object should be considered to have the shape {value, title, description}. If an item is an array, it should just be merged in.
Also, using an object instead of an array could be permitted when order is not important or is handled by other means (ui-schema, maintaining order when parsing, or including some "order field"). For simple string enumerations, property values could be the "title" and property names the value. Otherwise each property value should have the same shape as the array version.
Translating enum annotations using the specified merge will only work with this latter object version, as RFC 7386 states: "If the patch is anything other than an object, the result will always be to replace the entire target with the entire patch. Also, it is not possible to patch part of a target that is not an object, such as to replace just some of the values in an array." Do you think it's reasonable to propose that if the merge target is an array but the input is an object in which property names are integer string representations, it would merge into the array item with that integer index? In JavaScript this is almost implicit, as arrays are objects, but I also think it makes sense in the global merging context.
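The extension asked about here could look roughly like the sketch below. It is not RFC 7386 behaviour: the array branch is the proposed addition, and the function name mergeWithArrayIndexes and the colour data are invented for the example.

```javascript
const isPlainObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);

// Merge patch with one proposed extension: an object patch whose keys are all
// non-negative integer strings is applied index by index to an array target.
function mergeWithArrayIndexes(target, patch) {
  if (!isPlainObject(patch)) return patch; // standard rule: replace the target
  const keys = Object.keys(patch);
  const allIndexes = keys.length > 0 && keys.every((k) => /^(0|[1-9]\d*)$/.test(k));
  if (Array.isArray(target) && allIndexes) {
    const result = target.slice();
    for (const k of keys) {
      const i = Number(k);
      // Note: a null item replaces the element with null in this sketch.
      result[i] = mergeWithArrayIndexes(result[i], patch[k]);
    }
    return result;
  }
  const result = isPlainObject(target) ? { ...target } : {};
  for (const k of keys) {
    if (patch[k] === null) delete result[k];
    else result[k] = mergeWithArrayIndexes(result[k], patch[k]);
  }
  return result;
}

const baseEnum = [
  { value: "r", title: "Red" },
  { value: "g", title: "Green" },
];
const ptEnum = { "1": { title: "Verde" } }; // translate only the second item

const merged = mergeWithArrayIndexes({ enum: baseEnum }, { enum: ptEnum });
console.log(JSON.stringify(merged));
```

One consequence of this design is that a patch object with integer-looking keys can no longer replace an array wholesale, so the extension changes the meaning of some patches that are already valid under RFC 7386.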

I posted all this in a single issue because I think all this is related to the localization issue.

Am I completely off here?

handrews commented 6 years ago

@ruifortes there is so much here that I'm really not sure what to do with this. I'll try to address a few points:

On localization, since your proposal is that it all happens out of sight on the server, there is nothing to do in the specification. JSON Schema does not put constraints on how you design your schema $id URIs.

I think $ref templating should be the dereferencer's business. JSON Schema shouldn't even have to deal with $ref; it should only be regarded as a preprocessing step before validation.

$ref MUST be evaluated lazily in order for circular references to work, which is required for JSON Schema's own meta-schema to be possible. And what we discovered in past drafts is that if JSON Schema is not aware of $ref enough to describe where and how it can be used, then it becomes impossible (or at least awkward) for instances to use $ref.
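To illustrate the point about lazy evaluation, here is a toy validator over a self-referencing schema. It is a sketch, not a conforming implementation: it handles only type, properties, and same-document $ref, and the schema and function names are invented for the example.

```javascript
// A schema whose "node" definition refers to itself (a linked list of numbers).
const listSchema = {
  definitions: {
    node: {
      type: "object",
      properties: {
        value: { type: "number" },
        next: { $ref: "#/definitions/node" },
      },
    },
  },
  type: "object",
  properties: { head: { $ref: "#/definitions/node" } },
};

// Resolves a same-document JSON Pointer fragment such as "#/definitions/node".
function resolvePointer(root, ref) {
  if (!ref.startsWith("#")) throw new Error("only same-document refs are handled here");
  return ref
    .slice(1)
    .split("/")
    .slice(1)
    .map((part) => part.replace(/~1/g, "/").replace(/~0/g, "~"))
    .reduce((node, part) => node[part], root);
}

function validate(schema, instance, root = schema) {
  if (schema.$ref !== undefined) {
    // Lazy: the reference is followed only when an instance value reaches it.
    return validate(resolvePointer(root, schema.$ref), instance, root);
  }
  const isObj = instance !== null && typeof instance === "object" && !Array.isArray(instance);
  if (schema.type === "number" && typeof instance !== "number") return false;
  if (schema.type === "object" && !isObj) return false;
  for (const [name, sub] of Object.entries(schema.properties || {})) {
    if (isObj && name in instance && !validate(sub, instance[name], root)) return false;
  }
  return true;
}

console.log(validate(listSchema, { head: { value: 1, next: { value: 2 } } }));
console.log(validate(listSchema, { head: { value: 1, next: { value: "x" } } }));
```

Recursion stops when the instance runs out of next members. A preprocessor that tried to inline every $ref up front would never finish on this schema, because each expansion of node contains another reference to node.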

Uri templates in links should only deal with instance data.

This is in your $ref section but I have no idea what it has to do with $ref at all. URI Templates are only used in hyper-schema.

External data could be passed to $ref in preprocessing, but after that all $refs would be variable-free. Is there any use case for $ref templates or links that use data outside the instance?

I have no idea what you mean here at all. How does $ref have variables now? What do link templates have to do with anything? What data outside the instance are you talking about: user input? What does user input have to do with $ref?

I'm not going to address $merge as it remains deeply controversial and will not be advanced in this draft.

The named enumerations proposal was rejected in favor of alternate, more flexible approaches, in particular oneOf + const: https://github.com/json-schema-org/json-schema-spec/issues/57#issuecomment-247861695
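For readers unfamiliar with the oneOf + const pattern, a small sketch: each allowed value becomes its own subschema, which can carry a title and description. The colour values and the enumLabels helper are made up for illustration.

```javascript
// Illustrative schema: each permitted value is a subschema with annotations.
const colorSchema = {
  oneOf: [
    { const: "r", title: "Red", description: "The color red" },
    { const: "g", title: "Green", description: "The color green" },
  ],
};

// Hypothetical helper: builds a value -> title map, e.g. for UI labels.
function enumLabels(schema) {
  return Object.fromEntries(schema.oneOf.map((sub) => [sub.const, sub.title]));
}

const labels = enumLabels(colorSchema);
console.log(labels);
```

Because every entry is an ordinary object, annotations on individual values can be patched by an object-based merge without any special handling of enum arrays, although the oneOf array itself is still an array.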


If you want any of this to move forward, you need to break it up into clear, actionable, focused ideas. Or if you have a goal that needs several things to happen to meet it, start from the clearly defined goal before throwing up a whole bunch of semi-related feature ideas.

If you can clarify what's going on here and make it actionable, the discussion can continue in this issue. Otherwise I'm going to close this and encourage you to file proposals individually.

handrews commented 6 years ago

Closing per my last comment three weeks ago, given the lack of response. You are still encouraged to file clearly focused individual topics separately.

wanderingstan commented 6 years ago

For future reference: A more detailed discussion around JSON Schema Localization can be found here: https://github.com/json-schema-org/json-schema-spec/issues/53

(This issue is currently the first Google search result for json schema localization, and it took some more hunting to find the "real" issue.)