openownership / data-standard

The Beneficial Ownership Data Standard (BODS) is an open standard providing a specification for modelling and publishing information on the beneficial ownership and control of corporate vehicles
http://standard.openownership.org

Decide how multiple applicable schema values should be handled #621

Open kd-ods opened 5 months ago

kd-ods commented 5 months ago

Summary of the bug or issue

We have properties like this one in the entity-record.json schema file:

    "jurisdiction": {
      "title": "Jurisdiction",
      "description": "The jurisdiction in which this entity was registered or created (for legal and registered entities, and arrangements). Or the state's jurisdiction (for states and state bodies).",
      "propertyOrder": 15,
      "$ref": "urn:components#/$defs/Jurisdiction"
    }

When the ref is resolved, there are multiple values for description:

"The jurisdiction in which this entity was registered or created (for legal and registered entities, and arrangements). Or the state's jurisdiction (for states and state bodies)."

and

"A Jurisdiction MUST have a name. A jurisdiction SHOULD have a 2-letter country code (ISO 3166-1) or a subdivision code (ISO 3166-2)."

This is fine, but the 2020-12 spec is clear that it is up to individual applications to decide how to handle these multiple values.

I'm not entirely sure how the Sphinx directives like '.. json-value::' on the references.rst page work, but somewhere along the line a decision is being made about how to handle multiple description values. Likewise for docson.

We've also come across handling of multiple values wrt an enum, with this bit of schema in entity-record.json:

      "items": {
        "$ref": "urn:components#/$defs/Address",
        "properties":{
          "type":{
            "enum":[
              "registered",
              "business",
              "alternative"
            ]
          }
        }
      },

Since this schema 'works' against the test data, do we have to assume that there is an implicit and default decision within the Draft202012Validator constructor about how to approach multiple values?
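
For what it's worth, a sketch of why the enum case 'works': as far as I read the 2020-12 spec, `$ref` is an ordinary keyword that may sit next to other keywords, and an instance has to satisfy the referenced schema and the sibling keywords together. So for validation there is no choice to make between the two enums; the effective set is their intersection. The toy evaluator below is not the jsonschema library and covers only `$ref`, `properties`, `enum` and string `type`. The same-document ref (`#/$defs/Address`) and the Address enum values are stand-ins I made up for illustration, not the real components file.

```python
# Toy evaluator for a tiny subset of JSON Schema 2020-12.
# NOT the jsonschema library: it only illustrates that "$ref" and its
# sibling keywords are all applied to the same instance.

def resolve_pointer(root, ref):
    """Resolve a same-document reference such as '#/$defs/Address'.

    Simplification: no '~0'/'~1' unescaping and no cross-document refs.
    """
    if not ref.startswith("#/"):
        raise ValueError("only same-document references are supported here")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def is_valid(instance, schema, root):
    """Return True only if every keyword present in `schema` passes."""
    # "$ref" is one more constraint, not a replacement for its siblings.
    if "$ref" in schema:
        target = resolve_pointer(root, schema["$ref"])
        if not is_valid(instance, target, root):
            return False
    if "enum" in schema and instance not in schema["enum"]:
        return False
    if schema.get("type") == "string" and not isinstance(instance, str):
        return False
    if isinstance(instance, dict):
        for name, subschema in schema.get("properties", {}).items():
            if name in instance and not is_valid(instance[name], subschema, root):
                return False
    return True


# Hypothetical, cut-down stand-in for the Address definition.
schema = {
    "$defs": {
        "Address": {
            "properties": {
                "type": {"enum": ["registered", "business", "alternative",
                                  "residence", "service"]},
                "address": {"type": "string"},
            }
        }
    },
    # The shape used in entity-record.json: a $ref with a narrower sibling enum.
    "$ref": "#/$defs/Address",
    "properties": {
        "type": {"enum": ["registered", "business", "alternative"]}
    },
}

ok = is_valid({"type": "registered"}, schema, schema)
rejected_by_sibling = is_valid({"type": "residence"}, schema, schema)
rejected_by_ref = is_valid({"type": "registered", "address": 5}, schema, schema)
```

Here "residence" is allowed by the referenced Address definition but rejected by the narrower sibling enum, while the non-string address is rejected by the referenced definition alone. Annotations such as description are a different matter: those are collected rather than intersected, which is where the application has to decide.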

My primary motivation comes from editing the schema descriptions: I want to be clear about how multiple descriptions are handled. In version 0.3 and prior, we overwrote the more generally applicable description (i.e. the one provided via a $ref) with the more specific description. So in the above example, the first description for jurisdiction would be used.

Suggested resolution

Wrt descriptions, I actually quite like the idea of appending the more generally applicable description to the more particular description. So for the jurisdiction field in the entity record details you'd get:

"The jurisdiction in which this entity was registered or created (for legal and registered entities, and arrangements). Or the state's jurisdiction (for states and state bodies). A Jurisdiction MUST have a name. A jurisdiction SHOULD have a 2-letter country code (ISO 3166-1) or a subdivision code (ISO 3166-2)."

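To make the two options concrete (overriding, as in 0.3 and prior, versus appending as suggested above), here is a minimal sketch of how a documentation tool could pick between them. The function names, the `strategy` parameter and the same-document `#/$defs/Jurisdiction` ref are my own assumptions for illustration; this is not how the Sphinx directives or docson actually work.

```python
# Hypothetical helper for choosing how to present multiple descriptions.
# Names and the same-document ref are illustrative assumptions only.

def resolve_pointer(root, ref):
    """Resolve a same-document reference such as '#/$defs/Jurisdiction'."""
    if not ref.startswith("#/"):
        raise ValueError("only same-document references are supported here")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def collect_descriptions(schema, root):
    """Return every applicable description, most specific first."""
    found = []
    if "description" in schema:
        found.append(schema["description"])
    if "$ref" in schema:
        target = resolve_pointer(root, schema["$ref"])
        found.extend(collect_descriptions(target, root))
    return found


def describe(schema, root, strategy="append"):
    """Combine descriptions using 'override' or 'append'."""
    descriptions = collect_descriptions(schema, root)
    if not descriptions:
        return ""
    if strategy == "override":
        # 0.3 behaviour: the more specific description wins.
        return descriptions[0]
    if strategy == "append":
        # Suggested behaviour: specific description first, then the general one.
        return " ".join(descriptions)
    raise ValueError(f"unknown strategy: {strategy}")


root = {
    "$defs": {
        "Jurisdiction": {
            "description": "A Jurisdiction MUST have a name. A jurisdiction "
                           "SHOULD have a 2-letter country code (ISO 3166-1) "
                           "or a subdivision code (ISO 3166-2)."
        }
    }
}
jurisdiction_field = {
    "title": "Jurisdiction",
    "description": "The jurisdiction in which this entity was registered or "
                   "created (for legal and registered entities, and "
                   "arrangements). Or the state's jurisdiction (for states "
                   "and state bodies).",
    "$ref": "#/$defs/Jurisdiction",
}

overridden = describe(jurisdiction_field, root, strategy="override")
appended = describe(jurisdiction_field, root, strategy="append")
```

With "append", the result is the combined text quoted above; with "override", only the entity-record description is returned.
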
kd-ods commented 5 months ago

@rhiaro - assigning to you for your reckons. Is my understanding above correct? Is there a fundamental 'compiler' or constructor for a complex schema like BODS within which we can instantiate all decisions about multiple-value-handling behaviour... ?

kd-ods commented 4 months ago

There is an instance of this issue wrt docson. See '3. Overlapping schemas problems' here: #682