Open maku2903 opened 2 years ago
I'm trying to prove my point. Conclusion: DPG receives typed characters as cp1250 byte values. For example, the letter "Ą" is exactly 0xA5 in cp1250 according to https://en.wikipedia.org/wiki/Windows-1250, and the screenshot of the example confirms that.
EDIT: trying to find a temporary solution (convert: hex -> bytes -> decode in the system code page)
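The byte-level claim can be checked without DPG at all. This is only a small sketch using Python's built-in cp1250 codec; the variable names are made up for illustration, and it assumes the input box hands back the raw byte value as a code point, as the hex output in the screenshot suggests.

```python
# "Ą" (U+0104) encoded in Windows-1250 is the single byte 0xA5.
encoded = "Ą".encode("cp1250")

# Reading that byte value directly as a code point gives U+00A5,
# which is the wrong character seen in the input box.
misread = chr(encoded[0])

# Decoding the same byte in the right code page recovers the letter.
recovered = bytes([ord(misread)]).decode("cp1250")

# Print code points only, so the output does not depend on the console encoding.
print(encoded.hex(), hex(ord(misread)), hex(ord(recovered)))
```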
import locale

import dearpygui.dearpygui as dpg

path = 'font_path.ttf'

with dpg.font_registry():
    with dpg.font(path, 20, default_font=True):
        dpg.add_font_range(0x0100, 0x017f)
        # dpg.add_char_remap(0x00a5, 0x0104)

PREFERENCED_ENCODING = locale.getpreferredencoding()


def repair_encoding(s: str) -> str:
    if PREFERENCED_ENCODING == 'utf-8' or len(s) == 0:
        return s
    else:
        return ''.join(bytes.fromhex(hex(ord(x))[2:]).decode(PREFERENCED_ENCODING) for x in s)


def to_hex(s: str):
    return ' '.join(hex(ord(char)) for char in s)


def repair_callback(sender, a, u):
    text_id = u[0]
    hex_text_id = u[1]
    text_rep_id = u[2]
    hex_text_rep_id = u[3]
    val = dpg.get_value(item=sender)
    dpg.configure_item(item=text_id, default_value=val)
    dpg.configure_item(item=hex_text_id, default_value=to_hex(val))
    repair_val = repair_encoding(val)
    dpg.configure_item(item=text_rep_id, default_value=repair_val)
    dpg.configure_item(item=hex_text_rep_id, default_value=to_hex(repair_val))


with dpg.window(label="Main window", width=500, height=400) as main_window_id:
    dpg.set_primary_window(main_window_id, True)
    dpg.add_text(default_value=f'Code page: {PREFERENCED_ENCODING}')
    text_id = dpg.generate_uuid()
    hex_text_id = dpg.generate_uuid()
    text_rep_id = dpg.generate_uuid()
    hex_text_rep_id = dpg.generate_uuid()
    chars = 'ĄąĘꏟŻż'
    dpg.add_input_text(
        callback=lambda s, a, u: [repair_callback(s, a, u)],
        user_data=(text_id, hex_text_id, text_rep_id, hex_text_rep_id),
        label='Input this: ' + chars,
    )
    dpg.add_separator()
    dpg.add_text(default_value='Should be:')
    dpg.add_text(default_value=chars)
    dpg.add_text(default_value=to_hex(chars))
    dpg.add_separator()
    dpg.add_text(default_value='Is:')
    dpg.add_text(id=text_id, default_value='', label='Val from input text')
    dpg.add_text(id=hex_text_id, default_value='', label='Hex val from input text')
    dpg.add_separator()
    dpg.add_text(default_value='Repaired:')
    dpg.add_text(id=text_rep_id, default_value='', label='Val from input text')
    dpg.add_text(id=hex_text_rep_id, default_value='', label='Hex val from input text')
    dpg.add_separator()

dpg.start_dearpygui()
The same problem occurs with Russian. When will it be fixed?
@thainik Hi, I'm not a font expert, but for Russian I believe you need to remap the characters, see this. This issue is the exact reason we added add_char_remap(...). We would love for one of the Russian users to provide a complete remapping script so we can add it in, but we have not seen one yet!
Let me know how it goes.
@hoffstadt All of this works fine, and everything renders according to the code: which glyphs you remap, and so on. The problem is that when you type into add_input_text from the keyboard and your system keyboard language is outside Basic Latin (U+0020 - U+007E), DearPyGui assumes everything is in Latin-1 Supplement (U+00A0 - U+00FF) and ignores the keyboard language mapping, which may currently point at the Cyrillic Unicode block, for example. So get_value returns a string in which all chars are in the range U+0020 - U+00FF, and every time we have to translate the Latin-1 Supplement chars to other Unicode chars according to the language currently in use. This situation is really sad. When you paste text into add_input_text, or call set_value from your code, all code points are exactly what they should be.
Your product is really great, and I want to believe that it will become perfect in time.
Hello. Do you have an example of how you solved this problem? I've run into it too. Your product is good.
The simplest way: chr(ord(input_text_char) + offset_according_to_unicode)
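For Russian, the offset idea could look roughly like the sketch below. It assumes the typed characters arrive as cp1251 byte values, which this thread suggests but does not confirm. The helper name `shift_cp1251` and the constants are hypothetical, not part of DPG. In cp1251 the bytes 0xC0..0xFF map to U+0410..U+044F in order, so one fixed offset covers them; Ё and ё sit outside that range and need their own entries.

```python
# Hypothetical helper, not part of Dear PyGui.
# cp1251 bytes 0xC0..0xFF correspond to U+0410..U+044F in the same order.
CYRILLIC_OFFSET = 0x0410 - 0xC0

# The two Russian letters that fall outside the contiguous range.
EXCEPTIONS = {0xA8: "\u0401", 0xB8: "\u0451"}  # Ё, ё


def shift_cp1251(text: str) -> str:
    """Shift code points that are really cp1251 byte values into Cyrillic."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0xC0 <= code <= 0xFF:
            out.append(chr(code + CYRILLIC_OFFSET))
        else:
            # ASCII, already-correct characters and anything unmapped pass through.
            out.append(EXCEPTIONS.get(code, ch))
    return "".join(out)
```

A fixed offset only works for code pages whose letters form one contiguous run. For cp1250 (Polish) the letters are scattered, so decoding each byte in the code page, as in the repair_encoding approach, is the safer route.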
@maku2903 Thank you for your solution. I've updated it a bit so that it also works when set_value is called on the input_text with an already repaired string.
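import locale

PREFERENCED_ENCODING = locale.getpreferredencoding()


def repair_encoding(s: str) -> str:
    # Nothing to repair on UTF-8 systems (the name may be reported as 'UTF-8') or for empty input.
    if PREFERENCED_ENCODING.lower() in ('utf-8', 'utf8') or len(s) == 0:
        return s
    # Code points <= 255 are treated as raw bytes in the system code page;
    # anything above is an already decoded character and is kept as is.
    return ''.join(
        bytes([ord(x)]).decode(PREFERENCED_ENCODING) if ord(x) <= 255 else x
        for x in s
    )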
A usage example would be something like this:
input_id = dpg.add_input_text()
# input some text
input_text = repair_encoding(dpg.get_value(input_id))
# decide to edit text
dpg.set_value(input_id, input_text)  # displays correctly
# edit or do nothing with input
edited_text = repair_encoding(dpg.get_value(input_id))
Now it no longer breaks here, even though already decoded chars are present.
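To try the same idea on a machine that is not using cp1250, the code page can be passed in explicitly instead of being read from the locale. The following is only a sketch: `repair_typed_text` is a hypothetical name, and it assumes an ASCII-compatible single-byte code page. It also leaves a character alone when its byte is undefined in the code page (0x81 in cp1250, for instance) instead of raising.

```python
def repair_typed_text(text: str, encoding: str) -> str:
    """Reinterpret code points 0x80..0xFF as single bytes of `encoding`."""
    repaired = []
    for ch in text:
        code = ord(ch)
        if code < 0x80 or code > 0xFF:
            # ASCII and already decoded characters pass through unchanged.
            repaired.append(ch)
            continue
        try:
            repaired.append(bytes([code]).decode(encoding))
        except UnicodeDecodeError:
            # The byte has no meaning in this code page; keep the character.
            repaired.append(ch)
    return "".join(repaired)
```

For example, `repair_typed_text(dpg.get_value(input_id), "cp1250")` would be the call on the affected Windows setup. With a multi-byte encoding such as utf-8, a lone byte in 0x80..0xFF never decodes, so the text comes back unchanged.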
Version of Dear PyGui
Version: 0.8.39
Operating System:
Affected: Windows 10
Not affected: Fedora 34
Not tested: Mac
My Issue/Question
I'm using a custom char range for "Latin Extended-A" [0x0100, 0x017f], taken from https://en.wikipedia.org/wiki/List_of_Unicode_characters#Latin-1_Supplement. Typed characters get messed up. The issue affects only Windows 10. It doesn't affect Fedora 34 on the same machine with the same keyboard.
I don't know how to verify my hypothesis: DPG doesn't take the system default encoding into consideration.
I'm using Windows with cp1250 encoding (PowerShell command: [System.Text.Encoding]::Default)
![obraz](https://user-images.githubusercontent.com/72229704/125990423-c85ea698-d5fb-404b-908f-7ddfa5ce4b03.png)
Fedora encoding:
Question to the author: does DPG take the default system encoding into consideration? Not every system uses 'utf-8'. Maybe that's the problem?
When copy-pasting chars from, e.g., notepad.exe, everything is OK. add_char_remap(wrong_char, correct_char) fixes the problem, but every wrong_char will then be wrong when it is actually needed elsewhere...
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Input of correct chars on any system.
Screenshots/Video
Standalone, minimal, complete and verifiable example