MeltanoLabs / target-snowflake

Singer Target for the Snowflake cloud Data Warehouse
https://hub.meltano.com/loaders/target-snowflake--meltanolabs/
Other
10 stars 24 forks source link

jsonschema validator fails on datetimeoffset / date-time #163

Open s-jorgenfroland opened 8 months ago

s-jorgenfroland commented 8 months ago

We are using Meltano tap-mssql to extract data from a database and target-snowflake to ingest data to Snowflake. After a deploy yesterday, we started facing an issue we have a hard time understanding. Therefore we want to reach out here to see if there is a bug or if anyone have suggestions to solving this. We are using our own version of the plugin, built on target-snowflake with a few changes, but no changes related to validation of schema.

The issue:

tap-mssql generates this schema:

{"type": "SCHEMA", "stream": "dwh-tablename", "schema": {"properties": {"CONTACTID": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "PRODUCTCODE": {"inclusion": "available", "type": ["null", "string"]}, "LASTCHANGEDDATETIMEOFFSET": {"inclusion": "available", "format": "date-time", "type": ["null", "string"], "additionalProperties": {"sql_data_type": "datetimeoffset"}}, "CLIENTID": {"inclusion": "available", "type": ["null", "string"]}, "SUBLEDGERENTRYID": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "CURRENCYAMOUNT": {"inclusion": "available", "multipleOf": 0.01, "type": ["null", "number"]}, "POSTINGDATE": {"inclusion": "available", "format": "date-time", "type": ["null", "string"], "additionalProperties": {"sql_data_type": "datetimeoffset"}}, "CUSTOMERACCOUNTNO": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "SUPPLIERACCOUNTNO": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "VATAMOUNT": {"inclusion": "available", "multipleOf": 0.01, "type": ["null", "number"]}, "VOUCHERID": {"inclusion": "available", "type": ["null", "string"]}, "DEPARTMENTID": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "VOUCHERNO": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "VATID": {"inclusion": "available", "minimum": -2147483648, "maximum": 2147483647, "type": ["null", "integer"]}, "AMOUNT": {"inclusion": "available", "multipleOf": 0.01, "type": ["null", "number"]}, "ACCOUNTNO": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "REVERSEDVOUCHERNO": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "ROWLASTUPDATE": {"inclusion": "available", "format": "date-time", "type": ["null", "string"], "additionalProperties": {"sql_data_type": "datetimeoffset"}}, "QUANTITY": {"inclusion": "available", "multipleOf": 0.01, "type": ["null", "number"]}, "PRODUCTID": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "VATCODE": {"inclusion": "available", "type": ["null", "string"]}, "VOUCHERDESCRIPTION": {"inclusion": "available", "type": ["null", "string"]}, "VATRATE": {"inclusion": "available", "multipleOf": 0.01, "type": ["null", "number"]}, "PROJECTID": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "PROJECTCODE": {"inclusion": "available", "type": ["null", "string"]}, "DEPARTMENTCODE": {"inclusion": "available", "type": ["null", "string"]}, "DESCRIPTION": {"inclusion": "available", "type": ["null", "string"]}, "CURRENCYCODE": {"inclusion": "available", "type": ["null", "string"]}, "VOUCHERDATE": {"inclusion": "available", "format": "date-time", "type": ["null", "string"], "additionalProperties": {"sql_data_type": "datetimeoffset"}}, "EMPLOYEEACCOUNTNO": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "VOUCHERTYPE": {"inclusion": "available", "type": ["null", "string"]}, "ID": {"inclusion": "available", "minimum": -9223372036854775808, "maximum": 9223372036854775807, "type": ["null", "integer"]}, "CREATEDDATETIMEOFFSET": {"inclusion": "available", "format": "date-time", "type": ["null", "string"], "additionalProperties": {"sql_data_type": "datetimeoffset"}}}, "type": "object"}, "key_properties": ["ID"], "bookmark_properties": ["RowLastUpdate"]}

One of the columns that fails is this:

"POSTINGDATE": {"inclusion": "available", "format": "date-time", "type": ["null", "string"], "additionalProperties": {"sql_data_type": "datetimeoffset"}}

The issue occurs during execution of target-snowflake and this is the error-message:

File "/install/opt/prefect/meltano/pogo/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 317, in _validate_and_parse self._validator.validate(record) File "/install/opt/prefect/meltano/pogo/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/jsonschema/validators.py", line 438, in validate raise error jsonschema.exceptions.ValidationError: '2024-03-01T00:00:00' is not a 'date-time' Failed validating 'format' in schema['properties']['POSTINGDATE']: { 'additionalProperties': {'sql_data_type': 'datetimeoffset'}, format': 'date-time', inclusion': 'available', type': ['null', 'string'] } On instance['POSTINGDATE']: 2024-03-01T00:00:00'

This is how the value of POSTINGDATE looks like in the record (other columns are removed):

{"type": "RECORD", "stream": "dwh-tablename", "record": {...., "POSTINGDATE": "2022-01-31T00:00:00", ....}

Anyone have an idea to why we are getting this error?

s-jorgenfroland commented 8 months ago

I tested upgrading singer-sdk to version 0.35.0 in target-snowflake and now it works.

edgarrmondragon commented 8 months ago

@s-jorgenfroland thanks for confirming. Do give version 0.7.0 version (just release) a try, since it references the latest singer-sdk (0.36.1).