apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Russian (Cyrillic) text is improperly displayed when using array_agg in the SQL query (Postgres) #22904

Open m-ocean-it opened 1 year ago

m-ocean-it commented 1 year ago

How to reproduce the bug

  1. Store Russian text in a Postgres database.
  2. Make sure you can fetch that textual data into Superset and properly display it via SQL lab or chart explorer.
  3. Modify the query: add a GROUP BY clause and use the array_agg function to build an array of the Russian strings.
  4. Problem: the strings become "unicode gibberish". Example: ["\u041b\u043e\u043d\u0433\u0441\u043b\u0438\u0432\u044b", "\u0421\u0432\u0438\u0442\u0448\u043e\u0442\u044b"]. They get displayed like this in the SQL Lab, in the Chart Explorer and on the dashboard.
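
The escaped output in step 4 has the same shape as JSON produced with ASCII-only escaping. The thread does not confirm the root cause, so the following is only a minimal sketch of that assumption using Python's standard `json` module, not Superset's actual code path. The sample values are the two strings from step 4, decoded.

```python
import json

# The two strings from step 4, decoded from their \uXXXX escapes.
values = ["Лонгсливы", "Свитшоты"]

# Default behavior: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(values)

# ensure_ascii=False keeps the Cyrillic characters readable.
readable = json.dumps(values, ensure_ascii=False)

print(escaped)
print(readable)

# Both forms decode back to the same list, so the data itself is intact;
# only the displayed representation differs.
assert json.loads(escaped) == json.loads(readable) == values
```

If this assumption holds, the stored data is not corrupted. The array is being shown in its escaped JSON form instead of being decoded for display.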

rusackas commented 1 year ago

This sounds different, yet potentially related to https://github.com/apache/superset/issues/19982

rusackas commented 1 year ago

@jinghua-qa / @sadpandajoe do we have any tests using Russian, Chinese, etc.? Not sure who's best suited to look into this with Postgres and other DBs to sort out the risk factors.

m-ocean-it commented 1 year ago

This sounds different, yet potentially related to #19982

It actually seems to be identical

m-ocean-it commented 1 year ago

19982:

when you click on the data it shows a popup window that displays the characters correctly, but when you download it to csv it shows up incorrectly.

Exactly the same for me.

G0Dzilla1984 commented 11 months ago

Hi, folks. We've got the same problem with array data from ClickHouse. In ClickHouse the data is stored as UTF-8 encoded values, and the "clickhouse_connect" driver returns correct data. The issue appears only in the Superset interface. Can someone point me to where I might find the bug?

Thanks.

sonfire186 commented 10 months ago

UP!

Irrichie commented 7 months ago

I've solved the same problem by casting the array to text (PostgreSQL): select array_agg(smth)::text
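
One possible reason the cast helps, offered as an assumption and not something confirmed in this thread: with `::text` the driver hands back a plain string instead of a list, so any step that JSON-encodes list values is skipped. The sketch below uses a hypothetical `render_cell` helper to illustrate that idea; it is not Superset's actual rendering code.

```python
import json


def render_cell(value):
    """Hypothetical display helper (an assumption, not Superset's code).

    Lists and dicts are JSON-encoded with the default ASCII-only escaping;
    everything else is passed through as a string.
    """
    if isinstance(value, (list, dict)):
        return json.dumps(value)
    return str(value)


# A Postgres driver typically returns array_agg(...) as a Python list.
as_array = ["Лонгсливы", "Свитшоты"]

# array_agg(...)::text arrives as a single string in Postgres array syntax.
as_text = "{Лонгсливы,Свитшоты}"

print(render_cell(as_array))  # escaped: non-ASCII becomes \uXXXX sequences
print(render_cell(as_text))   # unchanged: the Cyrillic text stays readable
```

The trade-off is that the column is no longer an array on the Superset side, so anything that relies on array semantics would have to parse the string.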

pixelky commented 7 months ago

+1

rusackas commented 1 month ago

While this is getting a lot of support (noting also that you can 👍 the original description rather than adding messages), it doesn't seem to have anyone investigating it. It's been open for over a year and a half, and at some point we'll have to close it in the name of steering toward an actionable backlog. Any takers? Meanwhile, hopefully @dosu-bot can shed some light on the subject.