datacontract / datacontract-cli

CLI to manage your datacontract.yaml files
https://cli.datacontract.com

Decimal precision not supported - databricks #304

Open kiranskmr opened 2 days ago

kiranskmr commented 2 days ago

I tested validation against a Databricks decimal(16,4) column. Declaring the field type as decimal(16,4) in the YAML definition was rejected:

data.models.customers.fields.yyy.xxx must be one of ['number', 'decimal', 'numeric', 'int', 'integer', 'long', 'bigint', 'float', 'double', 'string', 'text', 'varchar', 'boolean', 'timestamp', 'timestamp_tz', 'timestamp_ntz', 'date', 'array', 'object', 'record', 'struct', 'bytes', 'null']

With the type set to plain decimal, running the tests threw a type mismatch error:

failed │ Check that field xxx has type DECIMAL │ Type Mismatch, Expected Type: DECIMAL; Actual Type: decimal(16,4)

jochenchrist commented 2 days ago

The field type only accepts logical types. You can set the precision and scale separately.

Try this:

fields:
  yyy:
    type: decimal
    precision: 16
    scale: 4
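
For anyone converting existing column definitions, here is a minimal sketch of splitting a physical type string such as decimal(16,4) into the separate type, precision, and scale keys shown above. The helper `split_decimal_type` is hypothetical and not part of datacontract-cli; it also assumes an omitted scale means 0, as in Databricks SQL.

```python
import re

# Hypothetical helper, not part of datacontract-cli.
# Matches "decimal(p)" or "decimal(p,s)" (also "numeric"), case-insensitively.
_DECIMAL_RE = re.compile(
    r"^\s*(decimal|numeric)\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)\s*$",
    re.IGNORECASE,
)


def split_decimal_type(physical_type: str) -> dict:
    """Split a physical decimal type into logical type, precision, and scale."""
    match = _DECIMAL_RE.match(physical_type)
    if match is None:
        raise ValueError(f"not a parameterized decimal type: {physical_type!r}")
    name, precision, scale = match.groups()
    return {
        "type": name.lower(),
        "precision": int(precision),
        # Assumption: an omitted scale is treated as 0.
        "scale": int(scale) if scale is not None else 0,
    }


print(split_decimal_type("decimal(16,4)"))
# prints {'type': 'decimal', 'precision': 16, 'scale': 4}
```

The resulting keys can be copied under the field in datacontract.yaml.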

Also, you can define the physical type as a config option:

fields:
  yyy:
    type: decimal
    precision: 16
    scale: 4
    config:
      databricksType: decimal(16,4)

kiranskmr commented 1 day ago

Thanks for the update. Doesn't it need the physical type to run tests?

jochenchrist commented 1 day ago

The CLI maps logical types to physical types based on the server type. You can use the config to override these mapped types (e.g., if you use VARCHAR instead of STRING).
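
To illustrate the idea, here is a rough sketch of how such a lookup with a config override could work. It is not the CLI's actual implementation: the `DEFAULT_TYPES` table, the `physical_type` function, and the rule that precision and scale are appended to DECIMAL are all assumptions made for the example. Only the `databricksType` config key comes from this thread.

```python
# Illustrative defaults only; the real CLI's mapping table may differ.
DEFAULT_TYPES = {
    "databricks": {
        "string": "STRING",
        "text": "STRING",
        "varchar": "STRING",
        "decimal": "DECIMAL",
        "numeric": "DECIMAL",
        "int": "INT",
        "integer": "INT",
        "long": "BIGINT",
        "bigint": "BIGINT",
        "boolean": "BOOLEAN",
        "date": "DATE",
    }
}


def physical_type(field: dict, server_type: str) -> str:
    """Resolve the physical type expected for a field on a given server type."""
    # An explicit config entry such as databricksType wins over the default.
    override = field.get("config", {}).get(f"{server_type}Type")
    if override:
        return override
    base = DEFAULT_TYPES[server_type][field["type"].lower()]
    # Assumption: precision and scale are appended when they are declared.
    if base == "DECIMAL" and "precision" in field:
        return f"DECIMAL({field['precision']},{field.get('scale', 0)})"
    return base


mapped = {"type": "decimal", "precision": 16, "scale": 4}
overridden = {"type": "string", "config": {"databricksType": "VARCHAR"}}

print(physical_type(mapped, "databricks"))      # prints DECIMAL(16,4)
print(physical_type(overridden, "databricks"))  # prints VARCHAR
```

The override is checked first so that a field-level config always takes precedence over whatever the server type would imply by default.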