Open tiran opened 9 years ago
Coverity has found a potential buffer overflow in the unicodedata module. The function call _getcode() which calls _cmpname(). _cmpname() copies data into fixed size buffer of length NAME_MAXLEN. Neither lookup() nor _getcode() limit name_length to NAME_MAXLEN. Therefore the buffer could theoretical overflow.
In practice the buffer overflow can't be abused because Tools/unicode/makeunicodedata.py already limits max name length. I still like to fix the bug because it is a low hanging fruit. In most versions of Python the code already checks that name_length fits in INT_MAX.
CID 1295028 (#1 of 1): Out-of-bounds access (OVERRUN) overrun-call: Overrunning callee's array of size 256 by passing argument (int)name_length (which evaluates to 2147483647) in call to _getcode
For now the error message virtually always contains the name (unless the length of its UTF-8 representation > INT_MAX). With unicode_name_maxlen.patch it doesn't contains the name of length few hundreds or tens characters.
Proposed patch makes the error message always contain the name, but truncated to NAME_MAXLEN bytes.
>>> name = ''.join(map(chr, range(0x2c80, 0x2ce4)))
>>> unicodedata.lookup(name)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: "undefined character name 'ⲀⲁⲂⲃⲄⲅⲆⲇⲈⲉⲊⲋⲌⲍⲎⲏⲐⲑⲒⲓⲔⲕⲖⲗⲘⲙⲚⲛⲜⲝⲞⲟⲠⲡⲢⲣⲤⲥⲦⲧⲨⲩⲪⲫⲬⲭⲮⲯⲰⲱⲲⲳⲴⲵⲶⲷⲸⲹⲺⲻⲼⲽⲾⲿⳀⳁⳂⳃⳄⳅⳆⳇⳈⳉⳊⳋⳌⳍⳎⳏⳐⳑⳒⳓⳔ�...'"
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['extension-modules', 'type-bug']
title = 'unicodedata_UCD_lookup() has theoretical buffer overflow'
updated_at =
user = 'https://github.com/tiran'
```
bugs.python.org fields:
```python
activity =
actor = 'serhiy.storchaka'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Extension Modules']
creation =
creator = 'christian.heimes'
dependencies = []
files = ['39109', '41365']
hgrepos = []
issue_num = 23997
keywords = ['patch']
message_count = 2.0
messages = ['241461', '256744']
nosy_count = 7.0
nosy_names = ['lemburg', 'pitrou', 'vstinner', 'christian.heimes', 'benjamin.peterson', 'ezio.melotti', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue23997'
versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']
```