Closed 25a81009-9caa-4147-a13b-b4843223656b closed 14 years ago
In the Python 2.x branch, os.path.normpath will sometimes return a str even if given a unicode. This is not an issue in the Python 3.0 branch.
This happens specifically when it throws away all string data and constructs its own:
>>> os.path.normpath(u'')
'.'
>>> os.path.normpath(u'.')
'.'
>>> os.path.normpath(u'/')
'/'
This is a problem if working with code which expects all strings to be unicode strings (sometimes, functions raise exceptions if given a str, when expecting a unicode).
I have attached patches (with test cases) for posixpath and ntpath which correctly preserve the unicode-ness of the input string, such that the new behaviour is:
>>> os.path.normpath(u'')
u'.'
>>> os.path.normpath(u'.')
u'.'
>>> os.path.normpath(u'/')
u'/'
I tried it on os2emxpath and plat-riscos/riscospath (the other two OS-specific path modules), and it already worked fine for them. Therefore, this patch fixes all necessary OS-specific versions of os.path.
Thanks for the patch, I tried it on Linux and it seems to solve the problem.
A few comments about it: 1) I'd change all the self.assertEqual(type(posixpath.normpath(u"")), unicode) to self.assertTrue(isinstance(posixpath.normpath(u""), unicode)); 2) a test for normpath(u'.') should probably be added; 3) in ntpath.py the 'slash' is actually a backslash, so the name of the var should be changed;
Thanks Ezio.
I've updated the patch to incorporate your suggestions.
Note that I too have only tested it on Linux, but I tested both posixpath and ntpath (and there is no OS-specific code, except for the filenames themselves).
I'm not sure if using assertTrue(isinstance ...) is better than assertEqual(type ...), because the type equality checking produces this error: AssertionError: \<type 'str'> != \<type 'unicode'> while isinstance produces this unhelpful error: AssertionError: False is not True
But oh well, I made the change anyway as most test cases use isinstance.
assertTrue() also accepts a 'msg' argument where to explain what went wrong in case of failure [1].
[1]: http://docs.python.org/library/unittest.html#unittest.TestCase.assertTrue
Also, assertTrue has an alias failUnless which I personally find more descriptive (I don't know if either form is preferred for inclusion in Python though).
failUnless is deprecated in Python3.1 [1]. The assert methods are preferred over the fail ones that are now deprecated.
[1]: http://docs.python.org/3.1/library/unittest.html#unittest.TestCase.failUnless
Fixed in r77442 (trunk) and r77443 (release26-maint), thanks!
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = 'https://github.com/ezio-melotti' closed_at =
created_at =
labels = ['type-bug', 'library', 'expert-unicode']
title = "os.path.normpath doesn't preserve unicode"
updated_at =
user = 'https://bugs.python.org/mgiuca'
```
bugs.python.org fields:
```python
activity =
actor = 'ezio.melotti'
assignee = 'ezio.melotti'
closed = True
closed_date =
closer = 'ezio.melotti'
components = ['Library (Lib)', 'Unicode']
creation =
creator = 'mgiuca'
dependencies = []
files = ['13757', '15362']
hgrepos = []
issue_num = 5827
keywords = ['patch']
message_count = 7.0
messages = ['86395', '95223', '95459', '95490', '95493', '95495', '97621']
nosy_count = 5.0
nosy_names = ['loewis', 'kcwu', 'ezio.melotti', 'mgiuca', 'sandberg']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue5827'
versions = ['Python 2.6', 'Python 2.7']
```