Open ralesk opened 5 years ago
No, this is not a keyboard input issue. Not related to #1041.
"When I normally write in all the rest of the programs, if I hit ' and e, I see é.
However, in Telegram app it appears the e without the tilde."
That issue is about keyboard input, and in particular compose key (the X11 way of having multiple keystrokes result in a single letter) and/or a dead key (another way of having you press a sequence of keys to end up with a single letter) not being honoured by the input widget in Telegram and/or Telegram's Qt.
This issue is about character sequences (as opposed to keypress sequences), where you have literal characters in the paste buffer and Telegram or Telegram's Qt stripping so-called combining characters, which do not appear in the other issue whatsoever.
So?
Some combining diacritics get stripped. What, why, how. (Probably a Qt issue?)
o̿wo̿ gets stripped and rendered as o w o (note the space)
uvͮu doesn't get stripped and is rendered as is
Anyway, let's look at the entire combining range for shits and giggles:
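use strict;
use warnings;
# Emit UTF-8 on STDOUT (note the leading colon in the layer name).
binmode STDOUT, ":encoding(UTF-8)";
# Print every combining diacritic U+0300..U+036F between "a" and "x", four per row.
for my $cp (0x0300 .. 0x036f) {
    printf "U+%04x a%sx ", $cp, chr($cp);
    print "\n" if $cp % 4 == 3;
}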
This renders perfectly (as far as the fonts allow) in Discord:
And there are multiple things that happen in Telegram:
Note how these are still good (except for the a + double grave) in the input box before sending... and they're mangled after sending (including when trying to edit again):
Here's it with fixed width so it's easier to spot (with fewer spaces):
I wonder what is so special about code points U+030A, U+0333 and U+033F that Telegram or Qt mangles them. I wonder if there are any more Unicode characters out there that get this treatment.
P.S. Considering Konsole (a Qt/KDE terminal app) doesn't mess it up, and neither do Clementine or Gwenview, maybe it's not a Qt issue after all...
They're not getting stripped, just rendered as a whitespace. You can successfully copy the incorrectly rendered text and paste it in another application, retaining all the "stripped" diacritics.
I have just copied that message into this GitHub entry box and the accent is not present, whereas copying it from Discord (where it doesn't get mangled) works. So no, it's not a display issue; the character is getting replaced by whitespace.
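One way to settle whether the marks are really gone or only drawn wrong is to dump the code points of whatever ends up in the clipboard. A minimal Python sketch, assuming you paste the two strings in by hand; the helper names `describe` and `has_combining` are made up for illustration and the `pasted` value is only a stand-in for what comes back out of Telegram:

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

def has_combining(text):
    """True if any code point has a non-zero canonical combining class."""
    return any(unicodedata.combining(ch) for ch in text)

sent = "Te\u033fst"   # e followed by COMBINING DOUBLE OVERLINE
pasted = "Te st"      # stand-in for the text copied back out (assumption)

for line in describe(sent):
    print(line)
print(has_combining(sent), has_combining(pasted))
```

If the second value printed is `False` while the first is `True`, the mark was replaced in the text itself and not just in the rendering.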
Of course since @Aokromes has mistakenly closed it and still hasn't reopened it, it has even less of a chance ever getting noticed, not that anything ever gets noticed here anyway.
"I wonder what is so special about code points U+030A, U+0333 and U+033F that Telegram or Qt mangles them."
These code points are present in the IsReplacedBySpace method: https://github.com/desktop-app/lib_ui/blob/d4c99701b5210a2db83b1c0f13da1a62f48dfb80/ui/text/text.cpp#L3444-L3457
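To make the effect concrete, here is a rough Python model of what a replace-by-space check does to text. It is not the lib_ui code: the set below holds only the three code points named in this thread (the list at the link is longer and is not reproduced here), and `replace_by_space` is a hypothetical name.

```python
# Only the three code points called out in this thread; the real list
# in lib_ui has more entries, which are deliberately left out here.
REPLACED = {0x030A, 0x0333, 0x033F}

def replace_by_space(text):
    """Swap every listed code point for a plain space, one for one."""
    return "".join(" " if ord(ch) in REPLACED else ch for ch in text)

print(repr(replace_by_space("Te\u030ast")))  # 'Te st'
print(replace_by_space("uv\u036eu") == "uv\u036eu")  # True: U+036E is not in the set
```

This would explain why the mark turns into a gap after sending while other combining characters in the same range come through untouched.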
I found this ticket by great accident
Thank you! Feels good to be proven right.
Hey there!
This issue was inactive for a long time and will be automatically closed in 30 days if there isn't any further activity. We therefore assume that the user has lost interest or resolved the problem on their own.
Don't worry though; if this is an error, let us know with a comment and we'll be happy to reopen the issue.
Thanks!
The issue is still present.
Still having this issue
The issue is still there.
Nothing changed.
My favourite bit about this — besides that automatic closing of issues shouldn't be a thing — is that git blame just says "initial commit" and nobody knows why on Earth those codepoints are even in that list of bad codepoints. That function makes so little sense...
@ralesk some of those functions are there to ensure the custom widgets won't render incorrectly due to some nasty character; others replace characters the same way the server does, so that tdesktop has valid offsets without re-downloading the sent message. It's unlikely those replacements will ever be revisited, given that everyone is afraid to touch that part of the tdesktop code (the chance of big regressions is too high). You can treat this issue as an architectural one that will likely persist for the whole lifetime of tdesktop.
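The offsets point can be shown with a toy example. Assuming formatting entities are stored as (offset, length) pairs into the message text, a one-for-one swap to a space keeps every offset valid, while deleting the character shifts everything after it. The function names and the entity layout below are hypothetical, not tdesktop's actual structures.

```python
MARK = "\u033f"  # COMBINING DOUBLE OVERLINE, one of the affected code points

def swap_for_space(text):
    """Replace the mark with a space; the text keeps its length."""
    return text.replace(MARK, " ")

def strip_mark(text):
    """Delete the mark; everything after it moves left by one."""
    return text.replace(MARK, "")

def slice_entity(text, entity):
    """Return the substring covered by a hypothetical (offset, length) entity."""
    offset, length = entity
    return text[offset:offset + length]

text = "o" + MARK + " bold"
entity = (3, 4)  # covers "bold" in the original text

print(slice_entity(text, entity))                  # bold
print(slice_entity(swap_for_space(text), entity))  # bold
print(slice_entity(strip_mark(text), entity))      # old
```

So replacing is the cheaper way to stay in sync with a server that does the same thing, at the cost of the visible gap reported here.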
+
@ilya-fedin I don't think #8140 is related; diacritics aren't getting stripped there, just badly displayed by Qt (and/or the font).
I remember checking the codepoints between the characters, and it was using the ones that are in the lib_ui blacklist.
It looks like the server is stripping them: when I send them from one Android phone, they do not show up in the message received on another Android phone.
'Te̊st' 'Te̳st' 'Te̿st'
Desktop still breaks diacritics if they are made with combining characters. In the Android version they look fine.
Telegram desktop strips (some?) combining diacritics entirely, making it hard to send, for example, complex IPA across.
This is just marginally related to #2651, which was about the text rendering. It may also result in issues when communicating file names from Macs which use decomposed characters at least in the case of accented Latin letters.
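On the decomposed-characters point: composing the text (NFC) before sending would only help where a precomposed character exists, and only on the assumption that the precomposed form itself is left alone. A quick Python check with the standard `unicodedata` module, using sample strings picked here for illustration:

```python
import unicodedata

def nfc(text):
    """Compose base letters and combining marks where Unicode defines a single code point."""
    return unicodedata.normalize("NFC", text)

def codepoints(text):
    return [f"U+{ord(ch):04X}" for ch in text]

ring = "a\u030a"       # a + COMBINING RING ABOVE
overline = "o\u033f"   # o + COMBINING DOUBLE OVERLINE

print(codepoints(nfc(ring)))      # ['U+00E5']
print(codepoints(nfc(overline)))  # ['U+006F', 'U+033F']
```

The ring composes into a single å, but there is no precomposed o with a double overline, so that sequence stays decomposed whatever normalization is applied.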
Steps to reproduce
Expected behaviour
The message should not be altered and the result should be o̿.
Actual behaviour
The message is altered and the result is o instead.
Configuration
Operating system: Linux, Fedora 29, MATE desktop
Version of Telegram Desktop: 1.7.14