Closed larryhastings closed 14 years ago
This patch consists of three changes to CPython:
In detail:
PyStringObject.ob_sval was changed from a char[] array to a char *. This is not in and of itself particularly desirable. It was necessary in order to implement the other two changes.
"lazy concatenations" change string concatenation ("a" + "b") so that, instead of directly calculating the resulting string, it returns a placeholder object representing the result. As a result, string concatenation in CPython is now more than 150% faster on average (as reported by pystone 2.0), and is approximately as fast as the standard string concatenation idiom ("".join([a + b + c])).
"lazy slices" changes string slicing ("abc"[1], "a".strip()) so that, instead of directly calculating the resulting string, it returns a placeholder object representing the result. As a result, string slicing in CPython is now more than 60% faster on average (as reported by pystone 2.0).
When considering this patch, please keep in mind that the "lazy" changes are distinct, and could be incorporated independently. In particular I'm guessing that "lazy concatenations" have a lot higher chance of being accepted than "lazy slices".
These changes were implemented almost entirely in Include/stringobject.h and Objects/stringobject.c.
With this patch applied, trunk builds and passes all expected tests on Win32 and Linux.
For a more thorough discussion of this patch, please see the attached text file(s).
I really like the idea of the lazy cats, and can believe that it's a really good optimisation, but before I review this code properly I'd like to see: a. convincing that it doesn't break strict aliasing (a casual reading suggests it does) b. lazy slices removed into their own patch (or just removed) - I don't want to recommend a patch containing them c. adherence to coding standard d. a little more explanation of how the cat objects work: it's important because they're a future minefield of bugs.
Howdy! Much has transpired since I posted this patch.
Is there life after Guido patch-closing? I'd be happy to spend the time answering your questions if my patch had some sort of future. (Though you'll have to tell me what you mean by "break strict aliasing".)
Hi Larry, It doesn't sound too promising - I'm new and have no powers of resurrection :(
By strict aliasing, I just meant it's illegal to access members of one type if the object is of a different (incompatible) type (actually I was wrong, this isn't the strict aliasing rule - it's a more fundamental one). In your case, it means it's illegal to pass a concat object where a string object is expected, even if the function accesses members that are common to them both. If this is happening, the answer is to make a union with the string object and cat object as members, and to use this union type instead but it's not pretty.
I suggest this patch is closed anyway. If you still believe in your code and think that lazy string cats have support, I suggest making a new patch with just those in (fixed up to be correct C, and PEP-7 compliant).
I'm raising the level to draw more attention to this featue. It should either be considered for inclusion or closed.
ISTM that 2.x won't receive this kind on enhancement anymore.
Collin, I'm adding you to the nosy list because you may be interested in having this (either on unladen or CPython). If so, also take a look at bpo-1569040.
Either this bug or bpo-1569040 should be closed as duplicate of the other (it's really the same approach by the same author at two different times :-)).
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at =
created_at =
labels = ['interpreter-core', 'type-feature']
title = 'The "lazy strings" patch'
updated_at =
user = 'https://github.com/larryhastings'
```
bugs.python.org fields:
```python
activity =
actor = 'eric.araujo'
assignee = 'none'
closed = True
closed_date =
closer = 'eric.araujo'
components = ['Interpreter Core']
creation =
creator = 'larry'
dependencies = []
files = ['7603', '7604']
hgrepos = []
issue_num = 1590352
keywords = ['patch']
message_count = 7.0
messages = ['51321', '51322', '51323', '51324', '58745', '84583', '84588']
nosy_count = 6.0
nosy_names = ['collinwinter', 'pitrou', 'larry', 'christian.heimes', 'ajaksu2', 'paulhankin']
pr_nums = []
priority = 'high'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = '1569040'
type = 'enhancement'
url = 'https://bugs.python.org/issue1590352'
versions = ['Python 2.7']
```