aisingapore / TagUI

Free RPA tool by AI Singapore
Apache License 2.0

Handling Chinese characters in TagUI script when using py_step - try this print with UTF-8 encoding #1409

Closed lj1029 closed 3 weeks ago

lj1029 commented 1 month ago

My script works fine with English text and numbers, but it gets stuck at the py_step when I use Chinese characters. The script is as follows:

a = "啊"
b = 2
c = {"a":a, "b":b}
json_c = JSON.stringify(c)
echo 1
py_step("json_c = '" + json_c + "'")
echo 2
py begin
import json
c = json.loads(json_c)
d = c["a"]
e = c["b"] * 10
f = {"d":d, "e":e}
json_f = json.dumps(f, ensure_ascii=False)
print(json_f)
py finish
echo `py_result`

Could someone please help me understand why this is happening and how I can resolve it? Your assistance is greatly appreciated.

kensoh commented 4 weeks ago

Hi @lj1029, you can change the print statement to the line below so that the output is encoded as UTF-8. Otherwise, printing the Chinese characters throws this Python error: UnicodeEncodeError: 'ascii' codec can't encode character u'\u554a' in position 16: ordinal not in range(128)

print(json_f.encode('utf-8'))
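
A minimal sketch of why this kind of error appears, assuming a Python 3 interpreter and a `payload` dictionary that mirrors the one in the script above (the variable names here are illustrative and not part of TagUI). It shows that the JSON text cannot be encoded as ASCII, that explicit UTF-8 encoding succeeds, and that leaving `ensure_ascii` at its default is another way to keep the printed text ASCII-only.

```python
import json

# Illustrative data, shaped like the dictionary built in the script above.
payload = {"d": "啊", "e": 20}

# ensure_ascii=False keeps the Chinese character as-is in the JSON text.
raw_json = json.dumps(payload, ensure_ascii=False)

# An ASCII-only codec cannot represent that character, so encoding fails
# with the same class of error as the one quoted above.
try:
    raw_json.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly as UTF-8 succeeds for any Python string.
utf8_bytes = raw_json.encode("utf-8")

# Alternative: the default ensure_ascii=True escapes non-ASCII characters
# as \uXXXX, so the output is pure ASCII and still parses back unchanged.
escaped_json = json.dumps(payload)
round_tripped = json.loads(escaped_json)
```

Whether the escaped form is convenient depends on what consumes `py_result`; a JSON parser on the receiving side restores the original characters.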

yytuwefds commented 3 weeks ago

Hi, I ran into the same problem, and adding encode('utf-8') doesn't seem to work.

This picture shows the output of the code before it got stuck:

Screenshot 2024-10-27 161110

The original code is as follows:

Screenshot 2024-10-27 161238

In addition, I have configured the environment so that the console can output Chinese as UTF-8:

Screenshot 2024-10-27 161610

kensoh commented 3 weeks ago

Oh @yytuwefds, it works for me on macOS. Without encode() I get the error; with encode() it's OK. Can you share the log file at tagui/src/tagui_py? There is a log file there with the real error from Python.

yytuwefds commented 3 weeks ago

Screenshot 2024-10-28 110051

Thanks for the response. I didn't know where to find the logs before. Here is the log file. It seems that Python is using GBK to read the Chinese characters? I'm using V6.110.0 from the release named TagUI_Windows.zip. Is it possible to change the config?

yytuwefds commented 3 weeks ago

Solved. Python on my Windows machine uses GBK by default. Modify line 83 in tagui_py.py to tagui_input = open('tagui_py/tagui_py.in','r', encoding='utf-8'). With this change, encode() is also no longer needed to prevent the problem.
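
A small sketch of what the explicit `encoding='utf-8'` argument changes, assuming the input file holds UTF-8 bytes (the thread does not confirm how TagUI writes it). The temporary file is a hypothetical stand-in for `tagui_py/tagui_py.in`, and `line` mimics the statement sent by `py_step` in the original script.

```python
import os
import tempfile

# Mimics the statement that py_step sends in the original script.
line = "json_c = '{\"a\":\"啊\",\"b\":2}'"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for tagui_py/tagui_py.in, written as UTF-8.
    path = os.path.join(tmp, "tagui_py.in")
    with open(path, "w", encoding="utf-8") as f:
        f.write(line)

    # Reading with an explicit UTF-8 encoding recovers the text exactly.
    with open(path, "r", encoding="utf-8") as f:
        read_as_utf8 = f.read()

    # Reading the same bytes as GBK (the default on a Chinese-locale
    # Windows setup) either fails or yields different, garbled text.
    try:
        with open(path, "r", encoding="gbk") as f:
            read_as_gbk = f.read()
    except UnicodeDecodeError:
        read_as_gbk = None
```

Without the `encoding` argument, `open()` falls back to the platform's preferred encoding, which is why the same script can behave differently on macOS and Windows.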

lj1029 commented 3 weeks ago

Hi @kensoh, @yytuwefds, thank you for your responses. I tried modifying line 83 in tagui_py.py; it works and no longer gets stuck at py_step. But when printing the result, I have to use encode('utf8').decode('gbk') to avoid garbled text.

py begin
print('你好')  # garbled
print('你好'.encode('utf8').decode('gbk'))
py finish
echo `py_result`
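
A rough sketch of why the encode('utf8').decode('gbk') round trip can help, assuming Python writes its output as GBK and the reader on the other side interprets the bytes as UTF-8 (the thread does not confirm either side, so treat both as assumptions). The variable names are illustrative only.

```python
text = "你好"

# The bytes that a UTF-8 reader expects to receive.
utf8_bytes = text.encode("utf8")

# Reinterpreting those bytes as GBK gives a different, garbled-looking string.
mojibake = utf8_bytes.decode("gbk")

# If the output stream then encodes that string as GBK (assumed here),
# the bytes that actually leave Python are the original UTF-8 bytes.
bytes_on_stdout = mojibake.encode("gbk")
```

This is not a general solution: for some strings the UTF-8 bytes do not form valid GBK sequences, and the decode('gbk') step would then raise UnicodeDecodeError.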
