ClickHouse / clickhouse-odbc

ODBC driver for ClickHouse
https://clickhouse.tech
Apache License 2.0

Unicode conversion rework #278

Closed traceon closed 4 years ago

traceon commented 4 years ago
Enmk commented 4 years ago

Well, a few questions:

driver/utils/conversion_std.h (MSVC only)

Why not conversion_msvc.h ?

conversions are performed via pivot encoding

What is pivot encoding here? Do I get it right, that you usually convert via some intermediate encoding? If yes, why?

traceon commented 4 years ago

Why not conversion_msvc.h ?

Because it can still be used with any other compiler, via a single macro switch.

What is pivot encoding here? Do I get it right, that you usually convert via some intermediate encoding? If yes, why?

An ICU converter for encoding X only converts from X to ICU's pivot and from the pivot back to X, so converting between two encodings always takes two steps through the pivot. ICU uses a hardcoded pivot, which is UTF-16 stored in UChar. Changing it to UTF-8 may speed things up in our case, but will require a custom-built ICU. This is planned for the next change. Everything inside the driver is represented in UTF-8.
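
To illustrate what "via pivot" means, here is a minimal hand-rolled sketch, not ICU code and not the driver's code. It converts ISO-8859-1 text to UTF-8 in two steps, with a UTF-16 buffer in between playing the role of the pivot. The function names (`latin1_to_pivot`, `pivot_to_utf8`, `convert_via_pivot`) are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Step 1 (hypothetical helper): source encoding -> UTF-16 pivot.
// Every ISO-8859-1 byte maps to the code point with the same value.
std::u16string latin1_to_pivot(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (unsigned char c : in)
        out.push_back(static_cast<char16_t>(c));
    return out;
}

// Step 2 (hypothetical helper): UTF-16 pivot -> target encoding (UTF-8).
std::string pivot_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Full conversion: source -> pivot -> target, never source -> target directly.
std::string convert_via_pivot(const std::string& latin1) {
    return pivot_to_utf8(latin1_to_pivot(latin1));
}
```

With these helpers, `convert_via_pivot("\xE9")` (é in ISO-8859-1) yields the two UTF-8 bytes `C3 A9`. The intermediate `std::u16string` is the cost being discussed: when one side of the conversion is already UTF-8, as it is inside the driver, a UTF-8 pivot would make that step a no-op.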