Open ChristianS99 opened 2 months ago
This is definitely a Qt issue, and there isn't much we can do about it. The easy fix is to avoid non-ASCII characters in your file name or to make sure your LANG is set.
Turns out, this is a Qt issue, but the problem is in fact a bit more complex.
"Passw?rter.kdbx" means a Latin1 string is being interpreted as UTF-8 here. However, simply running
LANG= keepassxc-cli ls Passwörter.kdbx
isn't a good problem demonstration. keepassxc-cli will use a default locale, but the parameters are in whatever your terminal's own input encoding is, so this could be anything.
The following Python snippet is a more stable test case:
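import subprocess
# Pass the file name as raw ISO-8859-1 bytes so the test does not depend on the terminal's encoding.
subprocess.Popen([b'keepassxc-cli', b'ls', 'Passwörter.kdbx'.encode('iso-8859-1')],
                 env={'LANG': ''}).wait()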
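To make the byte-level difference behind that test case visible, here is a minimal sketch in plain Python. It does not call keepassxc-cli at all; it only shows which bytes the file name turns into under each encoding and what a UTF-8 decoder does with the Latin1 form. The variable names are purely illustrative.

```python
name = "Passwörter.kdbx"

latin1_bytes = name.encode("iso-8859-1")  # 'ö' becomes the single byte 0xF6
utf8_bytes = name.encode("utf-8")         # 'ö' becomes the two bytes 0xC3 0xB6

# 0xF6 can never occur in valid UTF-8, so a UTF-8 decoder has to
# substitute it; Python uses U+FFFD (the replacement character) here.
misread = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)
print(utf8_bytes)
print(misread)
```

How the substituted character is finally displayed (as "?" or as a replacement glyph) depends on the program and the terminal, so the sketch makes no claim about that part.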
This is where we parse the command line arguments: https://github.com/keepassxreboot/keepassxc/blob/develop/src/cli/keepassxc-cli.cpp#L194
I believe that
for (int i = 0; i < argc; ++i) {
    arguments << QString(argv[i]);
}
is indeed wrong. This should at least be QString::fromLocal8Bit(argv[i]), but according to the docs, this is equivalent to QString::fromUtf8() on Linux, which is obviously wrong when your system locale isn't Unicode-based.
I think defaulting to UTF-8 is a sane assumption for most Linux systems, but if your input encoding is something else, this will obviously fail. To fix this, we'd need to parse LANG or LC_ALL ourselves, but even with those variables set, we could only guess what the actual input encoding of the command line parameters is.
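As a rough illustration of what "parsing LANG or LC_ALL ourselves" could look like, here is a sketch in Python rather than in the project's C++. The function names `codeset_from_locale` and `guess_argv_encoding` are hypothetical and not part of KeePassXC or Qt. It assumes the usual POSIX locale name layout `language[_territory][.codeset][@modifier]` and the precedence LC_ALL, then LC_CTYPE, then LANG, and it falls back to UTF-8 when no codeset is given. As noted above, the result is still only a guess at the encoding the arguments were typed in.

```python
import os


def codeset_from_locale(value):
    """Return the codeset part of a POSIX locale name, or None if absent."""
    value = value.split("@", 1)[0]          # drop an optional @modifier
    if "." not in value:
        return None                         # e.g. "C", "POSIX", "de_DE"
    return value.split(".", 1)[1] or None


def guess_argv_encoding(environ=None, default="UTF-8"):
    """Guess the argument encoding from the locale environment variables.

    Hypothetical helper: the first non-empty variable among LC_ALL,
    LC_CTYPE and LANG decides; an empty value counts as unset.
    """
    environ = os.environ if environ is None else environ
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var, "")
        if value:
            return codeset_from_locale(value) or default
    return default


print(guess_argv_encoding({"LANG": "de_DE.UTF-8"}))
print(guess_argv_encoding({"LC_ALL": "de_DE.ISO-8859-1", "LANG": "en_US.UTF-8"}))
print(guess_argv_encoding({"LANG": ""}))
```

Falling back to UTF-8 for an empty or codeset-less locale is a design choice of this sketch, matching the "sane assumption for most Linux systems" argument rather than any documented Qt behaviour.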
Turns out, this is a Qt issue, but the problem is in fact a bit more complex.
"Passw?rter.kdbx" means a Latin1 string is being interpreted as UTF-8 here. However, simply running
Yeah, agreed. It looks like the output is Latin1 where the terminal expects UTF-8.
LANG= keepassxc-cli ls Passwörter.kdbx
isn't a good problem demonstration. keepassxc-cli will use a default locale, but the parameters are in whatever your terminal's own input encoding is, so this could be anything.
The terminal's input encoding is UTF-8 even with LANG="", because setting the variable in front of a command only changes it for that command, not for the terminal.
I believe that
for (int i = 0; i < argc; ++i) {
    arguments << QString(argv[i]);
}
is indeed wrong. This should at least be QString::fromLocal8Bit(argv[i]), but according to the docs, this is equivalent to QString::fromUtf8() on Linux, which is obviously wrong when your system locale isn't Unicode-based.
Mind that these are the Qt 6 docs. Qt 5 is different: there, fromLocal8Bit and fromUtf8 actually do different things.
I think defaulting to UTF-8 is a sane assumption for most Linux systems, but if your input encoding is something else, this will obviously fail. To fix this, we'd need to parse LANG or LC_ALL ourselves, but even with those variables set, we could only guess what the actual input encoding of the command line parameters is.
#include <QString>
#include <QTextStream>
#include <cstdio>
#include <iostream>

int main(int argc, char *argv[])
{
    if (argc > 1) {
        QTextStream out(stdout);
        // Line 1: the raw argument bytes, passed through unchanged
        std::cout << argv[1] << std::endl;
        // Line 2: hex dump of the raw argument bytes
        for (int p = 0; argv[1][p] != 0; ++p) {
            printf("%x ", (unsigned char)argv[1][p]);
        }
        printf("\n");
        // Lines 3 to 5: the same bytes through three QString conversions
        QString s1 = QString(argv[1]);
        out << s1 << Qt::endl;
        QString s2 = QString::fromLocal8Bit(argv[1]);
        out << s2 << Qt::endl;
        QString s3 = QString::fromUtf8(argv[1]);
        out << s3 << Qt::endl;
    }
    return 0;
}
This is a small test program to try a few things. Running it with LANG= ./qttest aäböc
gives this output:
aäböc
61 c3 a4 62 c3 b6 63
a?b?c
a??b??c
a?b?c
Line 1: the terminal is consistent between the encoding of the input given and the output expected.
Line 2: the encoding actually is UTF-8.
Line 3: QString obviously converts the byte sequence somehow.
Lines 4 and 5: fromLocal8Bit and fromUtf8 behave differently (on Qt 5).
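The "?" patterns in lines 3 to 5 can be reproduced outside Qt with a short Python sketch. This is only a model and rests on two assumptions that are not confirmed from the Qt source here: that the output stream replaces every character it cannot encode with "?" (modelled as ASCII encoding with replacement), and that fromLocal8Bit under an empty LANG decodes one character per byte (modelled with Latin1 as a stand-in). The helper `shown` is hypothetical.

```python
# The bytes from the hex dump in the second line of the output.
raw = bytes.fromhex("61 c3 a4 62 c3 b6 63")


def shown(text):
    """Model an output stream that can only emit ASCII and writes '?'
    for every character it cannot encode (an assumption, see above)."""
    return text.encode("ascii", errors="replace").decode("ascii")


as_utf8 = raw.decode("utf-8")      # two-byte sequences collapse into one character each
as_latin1 = raw.decode("latin-1")  # every byte becomes its own character

print(len(as_utf8), shown(as_utf8))
print(len(as_latin1), shown(as_latin1))
```

Under this model a correctly decoded UTF-8 string yields one "?" per umlaut, while a byte-per-character decoding yields two, which is the same difference seen between the fromUtf8 and fromLocal8Bit lines.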
Yeah, agreed. It looks like the output is Latin1 where the terminal expects UTF-8.
This is not just the terminal output, but first and foremost the file name. File names are practically always UTF-8 on Linux, so using a Latin1 string is wrong in any case.
When I QDebug my QLocale, it always says "Latin1", even when it's actually UTF-8. I also couldn't find any difference in behaviour between QString(argv[i]) and QString::fromLocal8Bit(argv[i]). However, looking at the Qt source code for QCommandLineParser::process(&QCoreApplication), I figure that QString::fromLocal8Bit(argv[i]) is indeed the correct way.
Christian, try running the executable like this:
LANG=de_DE.UTF-8 executable args
Tell me if it works. I had the same problem in Qt and fixed it by using LANG=C in another situation. Good luck, mate.
LANG=de_DE.UTF-8 executable args
This works, and it is my default. I just stumbled over the problem by accident and thought I could report it.
Overview
When LANG is set to "", the file "Passwörter.kdbx" cannot be opened.
Steps to Reproduce
Expected Behavior
file can be opened
Context
Not 100% sure, but in my experience, LANG should only affect the output language used and should not affect the encoding used for interpreting the given arguments, e.g.:
The GUI is affected in the same way.