Snowflake-Labs / Excelerator

This is an Excel Addin for Windows that reads and writes data to Snowflake
Apache License 2.0
78 stars 22 forks source link

Invalid UTF8 Detected in string when using data sourced from SQL Server 2008R2 #7

Closed rcaddell closed 3 years ago

rcaddell commented 3 years ago

Data enters Snowflake via Fivetran using the SQL Server connector. Source SQL Server Collation is SQL_Latin1_General_CP1_CI_AS. Querying using the connector is fine, but when I upload, I receive an Invalid UTF8 error; I believe due to the source systems collation.

I have a named stage with a defined file format for Excelerator. I've set the following parameters in my file format with no success:

replace_invalid_characters = true validate_utf8 = false

I created and populated a table with all the values enclosed in double quotes and set that parameter on my file format but receive the same error.

All works fine when I use data funneled from another cloud source such as Salesforce.

ssegal100 commented 3 years ago

Hi @rcaddell. To make sure I understand. After loading the data from Fivetran into Snowflake, you then queried Snowflake with Excelerator. Did you try and upload that same file that you just downloaded using Excelerator, or was it a different file?

rcaddell commented 3 years ago

Thanks for the quick reply @ssegal100. I have the data loaded to Snowflake via Fivetran. I'm then creating another final table which is fed from those source tables( i.e. create table dev_randy.excelerator_test as select * from fivetran.sqlserver.account limit 10). I'm then querying that new excelerator_test table with Excelerator. I'll return the ten rows no problem. I renamed a few accounts and when I upload, it gives me the attached error.

Following the same process but with Salesforce data coming to Snowflake via Fivetran works no problem, which leads me to believe this may be related to my ultimate source system being so old. Thats when I added a named stage and defined a file format specifically but have always returned the UTF error.

excelerator

ssegal100 commented 3 years ago

@rcaddell What time zone are you in? Can you hop on a zoom call later today? If you remove the one value it's failing on, does it then work? Can you try and use the 'Copy' command to load the file? It should give you the same error, but it's good to validate.

rcaddell commented 3 years ago

@ssegal100 I am EST. I can jump on a call whenever is good for you short or 3PM-330PMEST.

If I remove some of the longer columns columns that are erring it does succeed. I am noticing that those columns do have values in there such as "c/o", which seemingly are not presented correctly on upload.

I placed the file in the stage and ran a copy, still erred, but not as verbose:

excelerator_copy

rcaddell commented 3 years ago

Looks like I am not restricted to SQL sources for this. Tested the same record coming from a Cloud system and its something in the construct of some of our data.

ssegal100 commented 3 years ago

OK, thanks for the update.

rcaddell commented 3 years ago

closing as I have some more testing to do