Replace only the line break with an empty string instead of substituting a backreference to the trailing whitespace (I don't understand why it was done that way in the first place)
Use non-capturing groups for another 10-20% boost
In [22]: _RE_FOLDING_WHITE_SPACES = re.compile(r"(\n\r?|\r\n?)(\s*)")
In [23]: def unfold(value):
   ....:     """
   ....:     Unfolding is accomplished by simply removing any CRLF
   ....:     that is immediately followed by WSP. Each header field should be
   ....:     treated in its unfolded form for further syntactic and semantic
   ....:     evaluation.
   ....:     """
   ....:     return re.sub(_RE_FOLDING_WHITE_SPACES, r'\2', value)
   ....:
In [24]: %timeit unfold(x)
100000 loops, best of 3: 9.59 µs per loop
In [26]: _RE_FOLDING_WHITE_SPACES = re.compile(r"(?:\n\r?|\r\n?)")
In [27]: def unfold(value):
   ....:     """
   ....:     Unfolding is accomplished by simply removing any CRLF
   ....:     that is immediately followed by WSP. Each header field should be
   ....:     treated in its unfolded form for further syntactic and semantic
   ....:     evaluation.
   ....:     """
   ....:     return _RE_FOLDING_WHITE_SPACES.sub('', value)
   ....:
In [29]: %timeit unfold(x)
1000000 loops, best of 3: 598 ns per loop
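As a sanity check on the change above, here is a minimal sketch that runs both versions side by side on a few folded header values. The names `unfold_old`, `unfold_new`, `_OLD`, `_NEW` and the sample strings are hypothetical and only serve the illustration; they are not taken from the project or its tests.

```python
import re

# Pattern from the original version: the backreference r"\2" puts the
# trailing whitespace (group 2) back after the line break is dropped.
_OLD = re.compile(r"(\n\r?|\r\n?)(\s*)")
# Pattern from the new version: match only the line break and delete it.
_NEW = re.compile(r"(?:\n\r?|\r\n?)")


def unfold_old(value):
    return re.sub(_OLD, r"\2", value)


def unfold_new(value):
    return _NEW.sub("", value)


# Hypothetical sample values, each with at most one line break per fold.
SAMPLES = [
    "Subject: a folded\r\n header value",
    "Subject: a folded\n\theader value",
    "Subject: a folded\r header value",
    "Subject: no folding here",
]

for sample in SAMPLES:
    assert unfold_old(sample) == unfold_new(sample)
```

One caveat worth checking against real inputs: `\s` also matches line breaks, so the two versions can differ when line breaks are consecutive. For `"a\n\nb"` the old version keeps the second `\n`, while the new one removes both.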
15x faster unfold function:
Call the compiled pattern's sub method instead of re.sub to skip _compile. See: https://stackoverflow.com/a/47477439/3315725
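To isolate the effect of skipping `re.sub`, the sketch below times the same compiled pattern called both ways. The input string and the helper names `via_module` and `via_method` are assumptions for illustration, since the value of `x` used in the session above is not shown. No timings are quoted here because they depend on the machine and Python version.

```python
import re
import timeit

_PATTERN = re.compile(r"(?:\n\r?|\r\n?)")
# Hypothetical input; the original benchmark value is not known.
value = "Subject: a folded\r\n header value"


def via_module(v):
    # re.sub() goes through the re module's internal compile/cache step on
    # every call, even when handed an already compiled pattern.
    return re.sub(_PATTERN, "", v)


def via_method(v):
    # Calling the method on the compiled pattern skips that step.
    return _PATTERN.sub("", v)


# Both call styles must produce the same result.
assert via_module(value) == via_method(value)

module_time = timeit.timeit(lambda: via_module(value), number=100_000)
method_time = timeit.timeit(lambda: via_method(value), number=100_000)
print(f"re.sub(pattern, ...): {module_time:.3f} s")
print(f"pattern.sub(...):     {method_time:.3f} s")
```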