outrauk / dataiku-plugin-snowflake-hdfs

DSS plugin for fast loading between Snowflake and HDFS
https://github.com/outrauk/dataiku-plugin-snowflake-hdfs
MIT License

fix(AOBS-491): Use native sfType instead of originalType, and fix schema-less tables #17

Closed mklaber closed 4 years ago

mklaber commented 4 years ago

This PR fixes the issue with underscores in the originalType parameter while staying true to the intent of originalType.

It also applies flake8 formatting rules.

Fixes #19

Closes #18

mklaber commented 4 years ago

@czen88 I'm not sure why, but GitHub is preventing me from replying to a couple of your comments. Regarding this one:

If you aim to produce Dataiku datatypes from Snowflake datatypes - the code is incomplete. You need to implement mapping for other datatypes as well. Otherwise results of this method are not reliable. REPLACE(data_type, '_', '') does not handle all datatypes imho.

Can you provide an example? I'm not sure what you mean—there is no filter on data_type, so it will inherently map all datatypes.

czen88 commented 4 years ago

If you run this SQL:

SELECT DISTINCT data_type
FROM information_schema.COLUMNS
WHERE table_name = 'PROPERTY_COMBINED'
AND table_schema = 'PUBLIC'

you will get these datatypes: [image]

This is different from what you see in Dataiku for the same table: https://dataiku.outra.co.uk/projects/PROPERTY_CLEAN/datasets/property_combined_sf/settings/#schema ['NUMBER', 'VARCHAR', 'BOOLEAN', 'DOUBLE', 'TIMESTAMPLTZ']
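
To make the comparison concrete, here is a minimal Python sketch of the check described above: strip underscores from each information_schema data_type value and see which results do not appear in the type list quoted above. The `information_schema_types` values are an assumption for illustration only (the real query output is not available here as text), and `strip_underscores` and `unmatched_types` are hypothetical helper names, not part of the plugin.

```python
def strip_underscores(data_type: str) -> str:
    """Mimic removing underscores from a Snowflake data_type value."""
    return data_type.replace("_", "")


def unmatched_types(source_types, expected_types):
    """Return normalized source types that are absent from expected_types."""
    expected = set(expected_types)
    normalized = {strip_underscores(t) for t in source_types}
    return sorted(normalized - expected)


# Type list quoted in the comment above.
dataiku_types = ["NUMBER", "VARCHAR", "BOOLEAN", "DOUBLE", "TIMESTAMPLTZ"]

# ASSUMPTION: illustrative information_schema output, not the real result.
information_schema_types = ["NUMBER", "TEXT", "BOOLEAN", "FLOAT", "TIMESTAMP_LTZ"]

# Any name printed here would need an explicit mapping rather than
# relying on underscore removal alone.
missing = unmatched_types(information_schema_types, dataiku_types)
print(missing)
```

With these assumed inputs, underscore removal reconciles TIMESTAMP_LTZ, but any type whose name differs outright between the two lists is left unmatched.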