Closed GoogleCodeExporter closed 9 years ago
Mmmm, I fear this is going to be a nasty one.
Could you please paste the output of the following commands?
$ python -c "import sys; print(sys.getfilesystemencoding())"
$ echo $LC_ALL
$ echo $LANG
Also, it would be interesting to see how ps represents those commands, so could
you paste the relevant part(s) of the ps output as well?
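For convenience, the values requested above could also be gathered in one go. This is only a minimal sketch: `locale_report` is a hypothetical helper name, and including LC_CTYPE is an assumption on top of the variables asked for.

```python
import os
import sys


def locale_report(environ=os.environ):
    """Collect the encoding-related settings that matter for this report.

    `locale_report` is a hypothetical helper, not part of psutil.
    """
    report = {'filesystemencoding': sys.getfilesystemencoding()}
    # LC_ALL and LANG are the variables asked for above; LC_CTYPE is an
    # extra added here on the assumption that it may matter as well.
    for name in ('LC_ALL', 'LC_CTYPE', 'LANG'):
        report[name] = environ.get(name)  # None means the variable is unset
    return report


for key, value in locale_report().items():
    print(key, '=', value)
```

An unset variable shows up as None, which is different from a variable set to an empty string.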
Original comment by g.rodola
on 12 Feb 2014 at 4:50
I finally got around to reproducing the error.
It does not require a non-English Linux setup, although it is unlikely to occur
with an English login.
I did not have an LC_ALL env var, but I had an LC_CTYPE var and LANG.
To reproduce:
unset LC_CTYPE and LANG (they both need to be unset)
run the attached æøåÅ.sh
run the attached psutil_test.py (while the above is running)
Original comment by l...@hupfeldtit.dk
on 14 Feb 2014 at 11:24
Attachments:
OK, I can reproduce the problem. If the correct encoding is set for the shell,
Python 2.x returns a byte string (because the file is opened in binary mode)
while 3.x reports the right cmdline (because text mode is the default):
giampaolo@UX32VD:~/svn/psutil$ python2.7 -c "import psutil;
print(psutil.Process().cmdline())" æøåÅ.sh
['python2.7', '-c', 'import psutil; print(psutil.Process().cmdline())',
'\xc3\xa6\xc3\xb8\xc3\xa5\xc3\x85.sh']
giampaolo@UX32VD:~/svn/psutil$ python3.4 -c "import psutil;
print(psutil.Process().cmdline())" æøåÅ.sh
['python3.4', '-c', 'import psutil; print(psutil.Process().cmdline())',
'æøåÅ.sh']
If the correct encoding is not set, we get the same byte string on Python 2.x
and a UnicodeEncodeError on Python 3.x.
I'm not sure what's best to do here.
I think we should always open the file in text mode on both Python versions so
that we return the right value.
On the other hand I'm not sure what's best to do in case of encoding errors.
Python provides different options for dealing with them:
http://docs.python.org/3.4/library/functions.html#open
We may choose to use errors='ignore' or errors='replace' although I don't like
imposing such a decision on the users.
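To make the trade-off concrete, here is a small sketch of how the standard error handlers treat the bytes from the Python 2.7 output above when the codec is ASCII. The `decode_with` helper is hypothetical and only for illustration; it is not psutil code.

```python
# Bytes from the Python 2.7 output above: "æøåÅ.sh" encoded as UTF-8.
raw = b'\xc3\xa6\xc3\xb8\xc3\xa5\xc3\x85.sh'


def decode_with(errors, encoding='ascii'):
    """Decode `raw` with the given error handler.

    Returns the decoded text, or the exception name if decoding fails.
    """
    try:
        return raw.decode(encoding, errors=errors)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


results = {errors: decode_with(errors)
           for errors in ('strict', 'ignore', 'replace')}
utf8_text = raw.decode('utf-8')

# ascii() keeps the printout safe even on a terminal with an ASCII locale.
for name, value in results.items():
    print(name, ascii(value))
print('utf-8', ascii(utf8_text))
```

With 'ignore' the non-ASCII part of the name silently disappears, while 'replace' keeps one U+FFFD placeholder per undecodable byte. Only decoding with the right codec recovers the actual name.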
Note: besides cmdline(), the problem also affects the process name() and exe()
methods.
I'll also have to check what happens on systems other than Linux.
Original comment by g.rodola
on 15 Feb 2014 at 8:13
FWIW "ps" replaces the invalid characters with "?", which mirrors Python's
errors="replace" behavior.
Original comment by g.rodola
on 15 Feb 2014 at 8:22
I think the problem is that if the user locale is not set up correctly, the file
is not opened with UTF-8 encoding, even though the proc filesystem is (always?)
UTF-8 encoded on newer Linuxes.
As shown below, 'ps' does not work, but 'cat' does, and if "encoding='UTF-8'" is
specified in Python, then Python works as well. I don't think it is correct to
depend on the user locale. How would a process created by a user with a
different locale be interpreted?
------
.. 15686]$ unset LC_CTYPE
.. 15686]$ unset LANG
.. 15686]$ ps auxww | grep 15686
xxx 15686 0.0 0.0 113116 1428 pts/7 S+ 12:30 0:00 /bin/bash
./????????.sh
.. 15686]$ cat cmdline
/bin/bash./æøåÅ.sh
.. 15686]$ python3
Python 3.3.2 (default, Nov 7 2013, 10:01:05)
[GCC 4.8.1 20130814 (Red Hat 4.8.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> with open('cmdline') as ll:
...     print(ll.read())
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib64/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12: ordinal not in range(128)
>>> with open('cmdline', encoding='UTF-8') as ll:
...     print(ll.read())
...
/bin/bash./æøåÅ.sh
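A locale-independent way of reading the file might look roughly like the sketch below. It is not psutil's code: `parse_cmdline` and `read_cmdline` are hypothetical names, and both the UTF-8 default and errors='replace' are assumptions a caller may want to change.

```python
def parse_cmdline(raw, encoding='utf-8', errors='replace'):
    """Split the raw bytes of /proc/<pid>/cmdline into text arguments.

    Hypothetical helper; the encoding and error handler are assumptions.
    """
    # Arguments are separated by NUL bytes, usually with a trailing NUL,
    # which is why cat prints them run together.
    parts = raw.split(b'\x00')
    if parts and parts[-1] == b'':
        parts.pop()
    return [part.decode(encoding, errors) for part in parts]


def read_cmdline(pid='self'):
    """Read a command line on Linux without consulting the user's locale."""
    # Binary mode means no implicit, locale-dependent decoding takes place.
    with open('/proc/{}/cmdline'.format(pid), 'rb') as f:
        return parse_cmdline(f.read())


# Same content as the cat output above, with the NUL separators made explicit.
sample = b'/bin/bash\x00./\xc3\xa6\xc3\xb8\xc3\xa5\xc3\x85.sh\x00'
args = parse_cmdline(sample)
print([ascii(arg) for arg in args])
```

Reading in binary mode and decoding explicitly keeps the result the same whether or not LANG and LC_CTYPE are set.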
Original comment by l...@hupfeldtit.dk
on 16 Feb 2014 at 12:08
Thanks for sharing this info.
It seems sys.getdefaultencoding() always returns 'utf-8' no matter what the
current locale is, therefore that looks like the way to go on Python 3.
Fixed in revision 42c5b20d7f5b.
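For reference, the difference between the two encodings can be checked with a sketch like the one below. The helper names are hypothetical and this is not the code from the revision above, just an illustration of passing sys.getdefaultencoding() to open() on Python 3.

```python
import locale
import sys


def describe_encodings():
    """Report the encodings Python 3 may use when opening text files."""
    return {
        # Always 'utf-8' on Python 3, regardless of the environment.
        'default': sys.getdefaultencoding(),
        # What open() falls back to when no encoding is given; this one
        # follows LANG / LC_CTYPE / LC_ALL.
        'locale': locale.getpreferredencoding(False),
    }


def read_text(path):
    """Read a file as text with an explicit, locale-independent encoding."""
    with open(path, encoding=sys.getdefaultencoding()) as f:
        return f.read()


info = describe_encodings()
print(info)
```

With LANG and LC_CTYPE unset, the 'locale' entry is typically an ASCII variant on older Python 3 releases, while 'default' stays 'utf-8'.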
Original comment by g.rodola
on 16 Feb 2014 at 1:37
Thank you for providing psutil. It makes system management with Python so much
easier.
Original comment by l...@hupfeldtit.dk
on 16 Feb 2014 at 1:45
Glad to hear psutil is useful to you.
Cheers.
Original comment by g.rodola
on 16 Feb 2014 at 4:41
Original comment by g.rodola
on 9 Mar 2014 at 10:26
Closing out as fixed, as version 2.0.0 is finally out.
Original comment by g.rodola
on 10 Mar 2014 at 11:36
Original issue reported on code.google.com by
l...@hupfeldtit.dk
on 12 Feb 2014 at 1:00