Open mdickinson opened 11 years ago
[Broken out of the discussion in bpo-15144]
Some of the newly-optimized code in Objects/unicodeobject.c contains strict aliasing violations; under the C standards, this is undefined behaviour (C99 6.5p7).
An example occurs in ascii_decode:
unsigned long value = *(const unsigned long *) _p;
Here the pointer dereference violates the strict aliasing rule.
I think these portions of Objects/unicodeobject.c should be rewritten to avoid the undefined behaviour.
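One common way to avoid the undefined behaviour is to copy the bytes into an `unsigned long` with `memcpy` instead of dereferencing a cast pointer; compilers typically turn such a fixed-size `memcpy` into a single load. The following is only a minimal sketch of that idea, not the code in Objects/unicodeobject.c: the names `load_ulong`, `HIGH_BITS_MASK` and `is_ascii` are hypothetical, and the mask assumes 8-bit bytes and an `unsigned long` without padding bits.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read sizeof(unsigned long) bytes starting at p.
   memcpy is allowed to copy the bytes of any object, so this does not
   violate the strict aliasing rule, and it has no alignment requirement. */
static unsigned long
load_ulong(const char *p)
{
    unsigned long value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical mask with the top bit of every byte set (0x8080...80).
   Assumes CHAR_BIT == 8 and no padding bits in unsigned long. */
#define HIGH_BITS_MASK (ULONG_MAX / 0xFF * 0x80)

/* Return 1 if all n bytes at p are ASCII (below 0x80), else 0. */
static int
is_ascii(const char *p, size_t n)
{
    size_t i = 0;

    /* Check one unsigned long worth of bytes per iteration. */
    for (; i + sizeof(unsigned long) <= n; i += sizeof(unsigned long)) {
        if (load_ulong(p + i) & HIGH_BITS_MASK)
            return 0;
    }
    /* Check the remaining tail one byte at a time. */
    for (; i < n; i++) {
        if ((unsigned char)p[i] & 0x80)
            return 0;
    }
    return 1;
}
```

With this sketch, `is_ascii("hello", 5)` returns 1, while any input containing a byte with the top bit set returns 0.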
This is not a purely theoretical problem: compilers are known to make optimizations based on the assumption that strict aliasing is not violated. For example, early versions of David Gay's dtoa.c gave incorrect results because of strict aliasing violations; see [1].
[2] is a Stack Overflow question explaining strict aliasing.
[1] http://patrakov.blogspot.co.uk/2009/03/dont-use-old-dtoac.html
[2] http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['interpreter-core', 'type-bug']
title = 'Strict aliasing violations in Objects/unicodeobject.c'
updated_at =
user = 'https://github.com/mdickinson'
```
bugs.python.org fields:
```python
activity =
actor = 'BreamoreBoy'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation =
creator = 'mark.dickinson'
dependencies = []
files = []
hgrepos = []
issue_num = 15992
keywords = []
message_count = 1.0
messages = ['170841']
nosy_count = 5.0
nosy_names = ['jcea', 'mark.dickinson', 'christian.heimes', 'serhiy.storchaka', 'pconnell']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue15992'
versions = ['Python 3.4', 'Python 3.5']
```