Adapted tinystomp parser in stompest

nikipore commented 8 years ago

"[...] Alternatively it may be turned into a pull request for Stompest. [...]"

I actually "pulled" your parser implementation myself, with a resulting speedup by a factor of ~50 for my test case. Please have a look at my implementation. It's not refactored yet, but it passes all the corner cases of my StompParser tests.

nikipore commented 8 years ago

I'll have a second look, because when adapting your algorithm I noticed that it is N^2 in frame size/packet size: the data.find(eof) step for a (large) frame is repeated from the start of the buffer every time a new (small) packet arrives. This can easily be fixed by keeping track of the read position, which turns the quadratic behavior into a linear one.
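
A minimal sketch of the read-position idea, for illustration only. The class `FrameBuffer` and its methods are hypothetical names, not the actual code of stompest or tinystomp, and the sketch ignores content-length bodies that may themselves contain NUL bytes. It only shows how remembering the already-searched prefix means each byte is searched once.

```python
class FrameBuffer(object):
    """Toy receive buffer (hypothetical; not stompest's or tinystomp's code)."""

    EOF = b'\x00'  # a STOMP frame is terminated by a NUL byte

    def __init__(self):
        self._data = bytearray()
        self._scanned = 0  # number of leading bytes already searched for EOF

    def feed(self, packet):
        self._data += packet

    def next_frame(self):
        # Resume the search where the previous one gave up instead of
        # rescanning the whole buffer on every new packet.
        end = self._data.find(self.EOF, self._scanned)
        if end < 0:
            self._scanned = len(self._data)
            return None
        frame = bytes(self._data[:end])
        del self._data[:end + 1]  # drop the frame and its terminator
        self._scanned = 0
        return frame


buf = FrameBuffer()
frames = []
for packet in (b'SEND\n\nhel', b'lo', b'\x00CONN', b'ECT\n\n\x00'):
    buf.feed(packet)
    frame = buf.next_frame()
    while frame is not None:
        frames.append(frame)
        frame = buf.next_frame()

print(frames)
```

Because the terminator is a single byte, it cannot straddle two packets, so resuming exactly at the old end of the buffer is safe. A multi-byte delimiter would need the search to back up by the delimiter length minus one.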

dw commented 8 years ago

Sounds good to me :) Really glad to see you're taking this up, I was too lazy to ever get around to it. Note that most of the speedups are not due to algorithmic improvements (actually, your original version was much better for that), but merely due to shunting most of the work off into Python's C code. It is often the case that a slow crappy algorithm that exploits native string functions will beat a fast pure-Python algorithm doing the same thing.
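
To illustrate the point about native string functions, here is a rough sketch that times the built-in `find` against an equivalent loop written in Python. The helper `find_pure_python` and the buffer size are made up for this illustration, and the absolute numbers will vary by machine and interpreter.

```python
import timeit


def find_pure_python(data, needle):
    """Byte-by-byte search for a one-byte needle in interpreted Python."""
    for index in range(len(data)):
        # Slicing (not indexing) yields bytes on both Python 2 and 3.
        if data[index:index + 1] == needle:
            return index
    return -1


data = b'x' * 100000 + b'\x00'
native_result = data.find(b'\x00')
python_result = find_pure_python(data, b'\x00')

t_native = timeit.timeit(lambda: data.find(b'\x00'), number=20)
t_python = timeit.timeit(lambda: find_pure_python(data, b'\x00'), number=20)
print('native find:      %.6f s' % t_native)
print('pure-Python find: %.6f s' % t_python)
```

Both calls do the same linear scan; the difference is only whether the loop runs in C or in the interpreter.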

nikipore commented 8 years ago

Done and pushed to PyPI as stompest 2.2.3. One change produced another tremendous speedup, namely using a bytearray instead of bytes as the FIFO buffer, which can be right-extended and left-truncated very cheaply.
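
A short sketch of what using a bytearray as a FIFO buffer can look like. This is an illustration under simple assumptions (NUL-terminated frames, no content-length handling), not the code that went into stompest 2.2.3. How cheap the left-truncation is depends on the interpreter version.

```python
buf = bytearray()

# Right-extend: appends in place instead of building a new bytes object.
buf += b'CONNECT\n\n\x00SE'

end = buf.find(b'\x00')
frame = bytes(buf[:end])  # copy out the complete frame

# Left-truncate: removes the consumed frame and its NUL terminator in place.
del buf[:end + 1]

# The partial next frame stays in the buffer and is completed later.
buf += b'ND\n\n\x00'

print(frame)
print(bytes(buf))
```

With an immutable bytes buffer the same steps would read `data = data + packet` and `data = data[end + 1:]`, each of which allocates a new object and copies the remaining bytes.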

dw commented 8 years ago

Awesome! Will see about importing this back into the project I was using it for. All the best :)

nikipore commented 8 years ago

After some serious tweaking I arrived at stompest 2.2.6, and I am still behind tinystomp by a factor of ~1.7-2.5 (not by an order of magnitude any more, but still quite a bit).

1000 loops, best of 3: 1.48 msec per loop # tinystomp
100 loops, best of 3: 2.55 msec per loop # stompest (Python 3.4, STOMP 1.0)
100 loops, best of 3: 2.63 msec per loop # stompest (Python 3.4, STOMP 1.1)
100 loops, best of 3: 2.82 msec per loop # stompest (Python 3.4, STOMP 1.2)
100 loops, best of 3: 2.87 msec per loop # stompest (Python 2.7, STOMP 1.0)
100 loops, best of 3: 3.08 msec per loop # stompest (Python 2.7, STOMP 1.1)
100 loops, best of 3: 3.3 msec per loop # stompest (Python 2.7, STOMP 1.2)

where I've pushed the following frame into the parser 50 times:

frameBytes = b'CONNECT\npasscode:123\nlogin:123\naccept-version:1.0,1.1,1.2\nhost:localhost\n\n\x00\n\n\n\n'
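
For anyone who wants to repeat this kind of measurement, a rough harness might look as follows. `ToyParser` is a hypothetical stand-in that only splits on NUL bytes; it is neither stompest's nor tinystomp's parser, so the parser under test would have to be swapped in, and the timings it prints are not comparable to the numbers above.

```python
import timeit

frameBytes = b'CONNECT\npasscode:123\nlogin:123\naccept-version:1.0,1.1,1.2\nhost:localhost\n\n\x00\n\n\n\n'


class ToyParser(object):
    """Hypothetical stand-in parser: splits the byte stream on NUL only."""

    def __init__(self):
        self._buffer = bytearray()
        self.frames = []

    def add(self, data):
        self._buffer += data
        end = self._buffer.find(b'\x00')
        while end >= 0:
            # Note: the line feeds after each NUL are not stripped here,
            # so they end up at the start of the following frame.
            self.frames.append(bytes(self._buffer[:end]))
            del self._buffer[:end + 1]
            end = self._buffer.find(b'\x00')


def run(count=50):
    """Push the same frame into a fresh parser `count` times."""
    parser = ToyParser()
    for _ in range(count):
        parser.add(frameBytes)
    return parser


# Best of 3 repeats with 100 loops each, reported per loop.
best = min(timeit.repeat(run, repeat=3, number=100)) / 100
print('100 loops, best of 3: %.3g msec per loop' % (best * 1e3))
```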

One can see that the newer (more feature-rich) protocol versions are slower. I believe that I've now tweaked pretty much everything which doesn't

break STOMP protocol compliance (which tinystomp deliberately does by not implementing protocol features such as heart-beating, UTF-8 headers, raw headers, optional carriage returns, escaping of the header delimiters, ...),
break Python 2/3 interoperability, or
obfuscate the code too much.

I'd expect stompest to scale better with large bodies, and tinystomp with many headers. In conclusion, there is still a raison d'être for tinystomp if all you need is speed, ASCII headers, basic STOMP, and no Python 3 support.
