MeltanoLabs / target-snowflake

Singer Target for the Snowflake cloud Data Warehouse
https://hub.meltano.com/loaders/target-snowflake--meltanolabs/
Other
10 stars 24 forks source link

If column is set to TIMESTAMP_NTZ in snowflake it cannot be converted to Date #130

Open visch opened 11 months ago

visch commented 11 months ago

Error:

NotImplementedError: Default TypeEngine.as_generic() heuristic method was unsuccessful for target_snowflake.snowflake_types.TIMESTAMP_NTZ. A custom as_generic() method must be implemented for this type class.

To reproduce

  1. Take any table that's already in Snowflake smoke-test works, run meltano run tap-smoke-test target-snowflake
    - name: tap-smoke-test
    variant: meltano
    pip_url: git+https://github.com/meltano/tap-smoke-test.git
    config:
      streams:
      - stream_name: animals
        input_filename: https://gitlab.com/meltano/tap-smoke-test/-/raw/main/demo-data/animals-data.jsonl
  2. Change one of the metadata fields that has the column type of TIMESTAMP_NTZ to DATE see below for an example
    UPDATE MELTANO_DATABASE.ZENOTI.ANIMALS  SET _SDC_EXTRACTED_AT_TEMP = TO_DATE(SDC_EXTRACTED_AT);
    ALTER TABLE MELTANO_DATABASE.ZENOTI.ANIMALS  DROP COLUMN _SDC_EXTRACTED_AT;
    ALTER TABLE MELTANO_DATABASE.ZENOTI.ANIMALS  RENAME COLUMN _SDC_EXTRACTED_AT_TEMP to _SDC_EXTRACTED_AT;
  3. Run it again meltano run tap-smoke-test target-snowflake You'll get the same error
pnadolny13 commented 11 months ago

I wonder if this is related to https://github.com/MeltanoLabs/tap-snowflake/pull/34. I believe the context is that the SDK treats all dates as datetimes.

visch commented 11 months ago

I'm not so sure. Converting between the two in snowflake (ignore the target) takes some migration as well like

UPDATE MELTANO_DATABASE.ZENOTI.ANIMALS  SET _SDC_EXTRACTED_AT_TEMP = TO_DATE(SDC_EXTRACTED_AT);
ALTER TABLE MELTANO_DATABASE.ZENOTI.ANIMALS  DROP COLUMN _SDC_EXTRACTED_AT;
ALTER TABLE MELTANO_DATABASE.ZENOTI.ANIMALS  RENAME COLUMN _SDC_EXTRACTED_AT_TEMP to _SDC_EXTRACTED_AT;

There's really 2 questions I think with this issue

  1. Should we migrate an existing TIMESTAMP_NTZ column to a Date Column if the schemas are different?
  2. If yes - We need to detect the difference and make an auto migration plan (lossless I think is rename the old column, migrate the data from the old column into the new column, use the new column going forward)
  3. If no (seems incorrect) - We need to convert the data from a timestamp to a date field

Similar to https://github.com/MeltanoLabs/target-snowflake/issues/120