frictionlessdata / datapackage

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.
https://datapackage.org
The Unlicense

feat(schemas): Change `anyOf` to `oneOf` #828

Closed spencerwilson closed 1 year ago

spencerwilson commented 1 year ago

Refining this type from `anyOf` to `oneOf` gives a much better codegen experience. Using typify with `anyOf` yields:

pub struct TableSchemaField {
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_0: Option<StringField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_1: Option<NumberField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_2: Option<IntegerField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_3: Option<DateField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_4: Option<TimeField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_5: Option<DateTimeField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_6: Option<YearField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_7: Option<YearMonthField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_8: Option<BooleanField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_9: Option<ObjectField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_10: Option<GeoPointField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_11: Option<GeoJsonField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_12: Option<ArrayField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_13: Option<DurationField>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_14: Option<AnyField>,
}

whereas with oneOf it yields:

pub enum TableSchemaField {
    StringField {
        #[serde(default, skip_serializing_if = "Option::is_none")]
        constraints: Option<Constraints>,
        #[doc = "A text description. Markdown is encouraged."]
        #[serde(default, skip_serializing_if = "Option::is_none")]
        description: Option<String>,
        #[doc = "An example value for the field."]
        #[serde(default, skip_serializing_if = "Option::is_none")]
        example: Option<String>,
        #[doc = "The format keyword options for `string` are `default`, `email`, `uri`, `binary`, and `uuid`."]
        #[serde(default = "defaults::table_schema_field_string_field_format")]
        format: StringFieldFormat,
        #[doc = "A name for this field."]
        name: String,
        #[doc = "The RDF type for this field."]
        #[serde(rename = "rdfType", default, skip_serializing_if = "Option::is_none")]
        rdf_type: Option<String>,
        #[doc = "A human-readable title."]
        #[serde(default, skip_serializing_if = "Option::is_none")]
        title: Option<String>,
        #[doc = "The type keyword, which `MUST` be a value of `string`."]
        #[serde(rename = "type", default, skip_serializing_if = "Option::is_none")]
        type_: Option<StringFieldType>,
    },
    NumberField {
        #[doc = "a boolean field with a default of `true`. If `true` the physical contents of this field must follow the formatting constraints already set out. If `false` the contents of this field may contain leading and/or trailing non-numeric characters (which implementors MUST therefore strip). The purpose of `bareNumber` is to allow publishers to publish numeric data that contains trailing characters such as percentages e.g. `95%` or leading characters such as currencies e.g. `€95` or `EUR 95`. Note that it is entirely up to implementors what, if anything, they do with stripped text."]
        #[serde(rename = "bareNumber", default = "defaults::default_bool::<true>")]
        bare_number: bool,
        #[serde(default, skip_serializing_if = "Option::is_none")]
        constraints: Option<Constraints>,
        #[doc = "A string whose value is used to represent a decimal point within the number. The default value is `.`."]
        #[serde(
            rename = "decimalChar",
            default,
            skip_serializing_if = "Option::is_none"
        )]
        decimal_char: Option<String>,
        #[doc = "A text description. Markdown is encouraged."]
        #[serde(default, skip_serializing_if = "Option::is_none")]
        description: Option<String>,
        #[doc = "An example value for the field."]
        #[serde(default, skip_serializing_if = "Option::is_none")]
        example: Option<String>,
        #[doc = "There are no format keyword options for `number`: only `default` is allowed."]
        #[serde(default = "defaults::table_schema_field_number_field_format")]
        format: NumberFieldFormat,
        #[doc = "A string whose value is used to group digits within the number. The default value is `null`. A common value is `,` e.g. '100,000'."]
        #[serde(rename = "groupChar", default, skip_serializing_if = "Option::is_none")]
        group_char: Option<String>,
        #[doc = "A name for this field."]
        name: String,
        #[doc = "The RDF type for this field."]
        #[serde(rename = "rdfType", default, skip_serializing_if = "Option::is_none")]
        rdf_type: Option<String>,
        #[doc = "A human-readable title."]
        #[serde(default, skip_serializing_if = "Option::is_none")]
        title: Option<String>,
        #[doc = "The type keyword, which `MUST` be a value of `number`."]
        #[serde(rename = "type", default, skip_serializing_if = "Option::is_none")]
        type_: Option<NumberFieldType>,
    },
    // etc.

In the first case, the generated type allows a TableSchemaField to hold several field definitions at once (or none), each under an opaque field name such as subtype_0. In the second case, the generator can rely on the different kinds of definitions being mutually exclusive and emit an enum type that is much easier to program with.
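To make the ergonomic difference concrete, here is a minimal sketch of consuming each shape. It is not typify output: the types are hand-written, std-only stand-ins reduced to two variants, and `AnyOfField`, `OneOfField`, `describe_any_of`, and `describe_one_of` are hypothetical names chosen for illustration.

```rust
// Hypothetical, simplified stand-ins for the generated types. The real
// typify output carries serde attributes and many more fields and variants.

#[derive(Debug, Clone, PartialEq)]
pub struct StringField {
    pub name: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct NumberField {
    pub name: String,
    pub bare_number: bool,
}

/// `anyOf`-style shape: every slot is optional, so the type itself does not
/// rule out zero definitions or several definitions at once.
#[derive(Debug, Default)]
pub struct AnyOfField {
    pub subtype_0: Option<StringField>,
    pub subtype_1: Option<NumberField>,
}

/// `oneOf`-style shape: a value is exactly one variant by construction.
#[derive(Debug)]
pub enum OneOfField {
    StringField { name: String },
    NumberField { name: String, bare_number: bool },
}

/// With the struct, the caller has to probe each slot and handle the
/// states that the type permits but the schema author never intended.
pub fn describe_any_of(field: &AnyOfField) -> Result<String, String> {
    match (&field.subtype_0, &field.subtype_1) {
        (Some(s), None) => Ok(format!("string field `{}`", s.name)),
        (None, Some(n)) => Ok(format!(
            "number field `{}` (bare_number: {})",
            n.name, n.bare_number
        )),
        (None, None) => Err("no field definition present".to_string()),
        (Some(_), Some(_)) => Err("more than one field definition present".to_string()),
    }
}

/// With the enum, a single exhaustive `match` covers every case and there
/// is no invalid state left to report.
pub fn describe_one_of(field: &OneOfField) -> String {
    match field {
        OneOfField::StringField { name } => format!("string field `{}`", name),
        OneOfField::NumberField { name, bare_number } => {
            format!("number field `{}` (bare_number: {})", name, bare_number)
        }
    }
}
```

Note how `describe_any_of` needs a `Result` only because the struct can represent combinations that make no sense for a single field, while `describe_one_of` is infallible. With the full set of fifteen subtypes, the struct version would also have to check every slot by hand, whereas the compiler checks the enum `match` for exhaustiveness.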