aws / amazon-redshift-odbc-driver

Apache License 2.0

Unicode column values are corrupted #24

Open alainbryden opened 3 months ago

alainbryden commented 3 months ago

Version

Problem description

After upgrading the driver from version 1.5.9 to 2.1.3, I saw that Unicode strings were no longer coming through correctly. The old driver installer had a configuration item to enable or disable Unicode support:

image

But this new driver has no such option.

I ran the following test before and after the upgrade:

SELECT * FROM OPENQUERY([REDSHIFT], 'SELECT ''ぬるを'' AS UnicodeTest_openquery')
EXEC('SELECT ''ぬるを'' AS UnicodeTest_exec') AT [REDSHIFT]

On version 1.5.9, this returns the expected result:

るを_unicodetest_openquery
ぬるを
るを_unicodetest_exec
ぬるを

image

On 2.1.3, the column names survive, but the values are mangled:

るを_unicodetest_openquery
ぬるを
るを_unicodetest_exec
ぬるを

image
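For anyone trying to classify the corruption they see, here is a minimal Python sketch of two common ways non-ASCII text gets mangled in transit. This is an assumption for illustration only: nothing in this thread confirms which mechanism, if either, applies to the 2.1.3 driver. The helper names `mangle_as_single_byte`, `repair_single_byte`, and `mangle_as_question_marks` are hypothetical and not part of any driver API.

```python
# Hypothetical diagnostic helpers; not part of the Redshift ODBC driver.
# They simulate two common corruption paths so that mangled output can be
# compared against a known pattern. Which path (if any) the driver takes
# is NOT confirmed in this thread.

def mangle_as_single_byte(text: str, wrong_codec: str = "latin-1") -> str:
    """Encode as UTF-8, then decode the bytes with a single-byte codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair_single_byte(mangled: str, wrong_codec: str = "latin-1") -> str:
    """Reverse mangle_as_single_byte when no bytes were lost."""
    return mangled.encode(wrong_codec).decode("utf-8")


def mangle_as_question_marks(text: str) -> str:
    """Simulate a lossy conversion to a code page lacking the characters."""
    return text.encode("ascii", errors="replace").decode("ascii")


original = "ぬるを"

# Path 1: each kana is 3 UTF-8 bytes, so 3 characters become 9.
mojibake = mangle_as_single_byte(original)
print(len(original), len(mojibake))

# This kind of damage is reversible because no bytes were dropped.
print(repair_single_byte(mojibake) == original)

# Path 2: lossy replacement; the original cannot be recovered.
print(mangle_as_question_marks(original))
```

If the mangled values are three times as long as the originals, the first pattern is a plausible suspect. If they come back as question marks, a lossy conversion to a non-Unicode code page is more likely.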

We upgraded hoping to take advantage of ZSTD compression, but we use global datasets, so this regression is holding us back.

vahid110 commented 3 months ago

Hi @alainbryden, thank you for taking the time to report this issue. I appreciate your effort and will make sure it reaches the appropriate team for investigation as soon as possible.

vahid110 commented 1 month ago

@alainbryden After discussing this with my team, it turns out we have to extend the driver to support this feature. I will keep this ticket open for further updates. Thanks.