Closed pali closed 5 months ago
Have you found a real source of the problem? I faced something like this with older Qt versions but never with 4.8.7 if I remember correctly.
See reported issue https://github.com/psi-im/iris/issues/42 with my investigation
Well this fix for me looks like a workaround. Maybe well-implemented workaround but still workaround.
I bet here is problem: http://code.qt.io/cgit/qt/qt.git/tree/src/xml/dom/qdom.cpp#n4217
whole while loop is iterating over UTF-16 codepoints. But Unicode characters above U+FFFF are represented by two surrogate pairs in UTF-16 and above code try to encode them separately.
Such thing can work only for UTF-16, not for UTF-8 (what is I believe used in iris). I can image that first check for canEncode in UTF-8 can return true (as expect UTF-16 surrogate pairs), but second call return false and so we have that low surrogate encoded in XML &#x????; representation...
It's still interesting if it's reproducible with Qt-4.8.7. I understand the link above points to recent sources but anyway.
This is just prevention to not send any "malformed" or "incorrect" data via network to XMPP server.
I do not have 4.8.7 (just 4.8.2), but you can try it yourself with character 😀 (U+1F600 GRINNING FACE). On 4.8.2 it is definitely problem.
That XML bug is reported in Qt bugtracker: https://bugreports.qt.io/browse/QTBUG-25291
Volker Götz in first comment said exactly what I understand from that Qt source code.
So... do you think that it is useful to have this check for valid XML in sanitize function?
I'm not sure. I'd avoid any overhead where possible. And if it's not that CPU consuming it still takes time to support this code.
Probably it's better to keep the code unchanged to understand where real problems happen.
@pali thank you very much for the investigation :) your second fix worked for years in Psi without any issue. Btw, for some reason with Qt6 it seems to be not needed. Maybe they changed their defaults.
In the docs I see
By default, UTF-8 is used for reading and writing
but in same doc something about locale-aware. But we anyway disabled the setCodec() trick for Qt6+
Make sure that we send only valid XML sequence to XMPP server. Otherwise server can disconnect us.