K2InformaticsGmbH / oranif

Oracle OCI driver using dirty NIF
Apache License 2.0

Strange result from data_get for unicode extended characters. #122

Closed (acautin closed this issue 5 years ago)

acautin commented 5 years ago

I am reading a table that contains umlauts and Chinese characters, and I am getting strange results back from the driver.

Original text: 汉字
Expected UTF-8: <<230,177,137,229,173,151>>
Actual bytes read: <<195,166,194,177,194,137,195,165,194,173,194,151>>

Original text: äëïöü ñç
Expected UTF-8: <<195,164,195,171,195,175,195,182,195,188,32,195,177,195,167>>
Actual bytes read: <<195,131,194,164,195,131,194,171,195,131,194,175,195,131,194,182,195,131,194,188,32,195,131,194,177,195,131,194,167>>

My question is whether my understanding of the encoding is correct or whether I am missing a step. The returned bytes seem to be encoded twice, but I doubt the driver would do this, so maybe some step is missing.
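The byte sequences above are consistent with double encoding: the correct UTF-8 bytes are treated as single-byte characters and then encoded to UTF-8 a second time. A minimal Python sketch reproduces the pattern, assuming the single-byte interpretation is Latin-1 (ISO-8859-1). The thread does not state which charset was actually involved, and the helper names `double_encode` and `undo_double_encode` are hypothetical, not part of oranif or dderl.

```python
def double_encode(text: str) -> bytes:
    """Encode text as UTF-8, misread those bytes as Latin-1, then encode again."""
    utf8 = text.encode("utf-8")
    # Assumption: each UTF-8 byte is misinterpreted as one Latin-1 character.
    return utf8.decode("latin-1").encode("utf-8")


def undo_double_encode(data: bytes) -> str:
    """Reverse the double encoding under the same Latin-1 assumption."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


# Byte values copied from the report for the Chinese sample.
expected = bytes([230, 177, 137, 229, 173, 151])
actual = bytes([195, 166, 194, 177, 194, 137, 195, 165, 194, 173, 194, 151])

assert "汉字".encode("utf-8") == expected
assert double_encode("汉字") == actual
assert undo_double_encode(actual) == "汉字"
```

If the assertions hold for a given sample, the stored or returned bytes were most likely encoded twice somewhere between the client and the database. The sketch does not show which component did it.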

acautin commented 5 years ago

False alarm: the inserted data was invalid because there is a bug in dderl when getting the default charset for the connection.

https://github.com/K2InformaticsGmbH/dderl/issues/603

Thanks @c-bik and @KarlKeiser for the support.