Marc-Andre, please review this & decide what should happen next.
This version of the patch is clearly bogus. In UTF-16 encodings, the byte 0x0A (\n) can occur whenever the low or high byte of a Unicode character is 0x0A, not only in a line feed. I don't know whether Unicode is designed to avoid all such code positions, but I can hardly believe it.
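To make the concern above concrete, here is a small check written for present-day Python 3 (not the 2.0-era code under review). The two sample characters are chosen purely for illustration: U+010A has 0x0A as its low byte and U+0A05 has 0x0A as its high byte.

```python
# Characters whose UTF-16 code unit contains the byte 0x0A without being a line feed.
samples = ["\u010a", "\u0a05"]  # illustrative picks, not taken from the patch

hits = {}
for ch in samples:
    for codec in ("utf-16-le", "utf-16-be"):
        raw = ch.encode(codec)
        # A byte-level search for b"\n" reports a match in every case here.
        hits[(ch, codec)] = b"\n" in raw

# A byte-oriented split therefore cuts this one-line string in the wrong place.
line = "a\u010ab".encode("utf-16-le")
pieces = line.split(b"\n")
```

Under these assumptions, `pieces` has two elements even though the text contains no line feed, which is the failure mode a byte-level readline() would hit.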
A correct readline() method would have to read 2 bytes at a time and check for u"\u000A". (I don't care about all the other Unicode line-breaking characters; those are presumably for a different application level.)
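A minimal sketch of the two-bytes-at-a-time idea, in present-day Python 3. The function name `readline_utf16` and its `byteorder` parameter are hypothetical, the input is assumed to be BOM-less UTF-16, and the stream is assumed to return full 2-byte reads until EOF (true for `io.BytesIO`, not guaranteed for raw sockets or pipes). This is not the code from the patch.

```python
import io

def readline_utf16(stream, byteorder="little"):
    """Read one line from a binary stream of BOM-less UTF-16 text (sketch)."""
    newline = (0x000A).to_bytes(2, byteorder)  # the code unit for U+000A
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    units = []
    while True:
        unit = stream.read(2)        # one 16-bit code unit at a time
        if len(unit) < 2:            # EOF; a dangling odd byte is dropped here
            break
        units.append(unit)
        if unit == newline:          # compare whole code units, never single bytes
            break
    return b"".join(units).decode(codec)

stream = io.BytesIO("a\u010ab\nc\u0a05\n".encode("utf-16-le"))
first = readline_utf16(stream)
second = readline_utf16(stream)
```

Comparing whole code units is safe with respect to surrogate pairs, because 0x000A lies outside the surrogate range. Reading two bytes per call is slow, which is one reason a buffered approach is usually preferable.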
I'm not sure whether this is the right fix: Unicode defines many more line break characters than just LF and the patch will only work correctly on Unix (also note that UTF-16 can be BE and LE -- your fix assumes LE).
A true fix would have to also touch the .read() method and implement a true read-ahead buffer strategy to get this done right.
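One way such a read-ahead strategy could look in present-day Python 3 is sketched below. The class name `BufferedUTF16Reader` and its parameters are hypothetical; it relies on the standard `codecs.getincrementaldecoder()` to handle code units that straddle chunk boundaries, and it only treats LF as a line ending. It is an illustration of the idea, not the change that was eventually made to the codecs module.

```python
import codecs
import io

class BufferedUTF16Reader:
    """Sketch: read() and readline() share one buffer of decoded text."""

    def __init__(self, stream, encoding="utf-16-le", chunk_size=4096):
        self.stream = stream
        self.decoder = codecs.getincrementaldecoder(encoding)()
        self.chunk_size = chunk_size
        self.buffer = ""      # text decoded ahead of what the caller has consumed
        self.eof = False

    def _fill(self):
        chunk = self.stream.read(self.chunk_size)
        if not chunk:
            self.eof = True
            # Flush the decoder; raises if the input ends mid code unit.
            self.buffer += self.decoder.decode(b"", final=True)
        else:
            self.buffer += self.decoder.decode(chunk)

    def read(self, size=-1):
        while not self.eof and (size < 0 or len(self.buffer) < size):
            self._fill()
        if size < 0:
            size = len(self.buffer)
        result, self.buffer = self.buffer[:size], self.buffer[size:]
        return result

    def readline(self):
        # Search decoded characters, so a stray 0x0A byte cannot end a line.
        while "\n" not in self.buffer and not self.eof:
            self._fill()
        end = self.buffer.find("\n")
        end = len(self.buffer) if end < 0 else end + 1
        line, self.buffer = self.buffer[:end], self.buffer[end:]
        return line

data = "a\u010ab\nc\u0a05\nrest".encode("utf-16-le")
reader = BufferedUTF16Reader(io.BytesIO(data), chunk_size=3)  # odd size on purpose
first_line = reader.readline()
```

Because both methods consume from the same decoded buffer, mixing `read()` and `readline()` calls does not lose data that was read ahead. Passing `encoding="utf-16-be"` covers the big-endian case.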
Postponed until after the Python 2.0b2 release.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = 'https://github.com/malemburg'
closed_at =
created_at =
labels = ['library']
title = 'Fixes of ReadStream.readline() in UTF-16 and -LE codecs'
updated_at =
user = 'https://bugs.python.org/anonymous'
```
bugs.python.org fields:
```python
activity =
actor = 'gvanrossum'
assignee = 'lemburg'
closed = True
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'anonymous'
dependencies = []
files = ['2792']
hgrepos = []
issue_num = 401477
keywords = ['patch']
message_count = 5.0
messages = ['34282', '34283', '34284', '34285', '34286']
nosy_count = 3.0
nosy_names = ['lemburg', 'gvanrossum', 'fdrake']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue401477'
versions = []
```