prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Presto does not support Chinese by parquet #23969

Open i95271116 opened 2 weeks ago

i95271116 commented 2 weeks ago

For example: Select * from table where col = '中文'. For now, if the field contains Chinese, I must write the SQL like this: Select * from table where to_utf8(col) = to_utf8('中文')

My version is 0.234.2-add98eb.

agrawalreetika commented 2 weeks ago

Hi @i95271116, does the same problem occur with the latest Presto release as well?

i95271116 commented 2 weeks ago

I tested both presto-server-0.216 and presto-server-0.285.1, but the problem still exists.

i95271116 commented 2 weeks ago

In the WHERE condition, for Chinese characters, the column needs to be wrapped in trim(), for example: where trim(column_name) = '切断压力'

i95271116 commented 2 weeks ago

Alternatively, the query works normally when written as "SELECT * from table where alias='切断压力' or rand() = 1 limit 10".
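
For reference, the query forms mentioned so far in this thread can be collected in one place. This is only a sketch: `my_table` and `col` are placeholder names (assumptions, not the real table or column), and the behavior noted in the comments is as reported above, not independently verified.

```sql
-- Plain equality on a Chinese string literal (the form reported as problematic).
SELECT * FROM my_table WHERE col = '中文';

-- Reported workaround 1: compare the UTF-8 bytes of both sides.
SELECT * FROM my_table WHERE to_utf8(col) = to_utf8('中文');

-- Reported workaround 2: wrap the column in trim().
SELECT * FROM my_table WHERE trim(col) = '切断压力';

-- Reported workaround 3: add an extra OR branch with rand() to the predicate.
SELECT * FROM my_table WHERE col = '切断压力' OR rand() = 1 LIMIT 10;
```

Running all four against the same table, and noting which ones return rows, would make the report easier to reproduce.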

hantangwangd commented 2 weeks ago

Hi @i95271116. Can you elaborate on the issue? For example, which connector do you use, and what are the steps to reproduce the problem?

I cannot reproduce it on the Iceberg connector with the Parquet format.