python-hyper / hyper

HTTP/2 for Python.
http://hyper.rtfd.org/en/latest/
MIT License

Performance Tuning #56

Open Lukasa opened 10 years ago

Lukasa commented 10 years ago

hyper is not optimised for performance right now. While the HTTP/2 spec is changing I want to focus on correctness and the ability to easily change behaviour.

However, when it does get nailed down we'll want to take some steps to improve performance. Roberto Peon has provided an awesome list of things to work on, which I've reproduced here. People should take things off this list and break them out into new issues as they go.

Other intelligent performance optimisations should be added here.

Lukasa commented 10 years ago

The first two points should provide a huge performance boost if handled appropriately.

schlamar commented 10 years ago

I just wanted to come here and propose point number 1 after reading your notebook and then found out that you are already thinking about it :) I can confirm that reading a block into a buffer is a significant improvement on a proprietary TCP protocol with sized framing.

Right now the buffer is a simple string which gets expanded after socket.recv. But there might be more efficient alternatives. Maybe socket.recv_into can be used with a StringIO or something like that. Do you have some ideas for an efficient buffer implementation? Anything that helps with point two would be great. :)

dimaqq commented 10 years ago

On a related note, please be more thorough than time.time(), for example:

import resource
import sys
import time

if sys.platform.lower().startswith("linux"):
    def _rusage():
        # 1 is RUSAGE_THREAD on Linux: counters for the calling thread only
        tmp = resource.getrusage(1)
        return dict(time=time.time(),    # wall
                    utime=tmp.ru_utime,  # user
                    stime=tmp.ru_stime,  # system
                    switch=tmp.ru_nivcsw * 1.,   # cpu contention
                    read=tmp.ru_inblock * 512.,  # disk io
                    write=tmp.ru_oublock * 512.,
                    fault=tmp.ru_majflt * 1.)    # memory contention

Lukasa commented 10 years ago

@dimaqq Good advice, though I was running the notebook on OS X, which would have limited the utility of that function. Might be worth me going back and adding it just for those who run on other platforms though, I'll have a think.
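For platforms without per-thread counters, a minimal sketch of the same idea could fall back to process-wide figures. The names snapshot and measure are hypothetical and not part of hyper or the notebook. It assumes a Unix-like system where the resource module is available, and it uses RUSAGE_SELF (whole process) rather than the thread-level counter in the snippet above.

```python
import resource
import time


def snapshot():
    """Return wall-clock, user and system time for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)  # process-wide counters
    return {
        "wall": time.perf_counter(),  # monotonic wall-clock reading
        "utime": usage.ru_utime,      # CPU seconds spent in user mode
        "stime": usage.ru_stime,      # CPU seconds spent in kernel mode
    }


def measure(func, *args, **kwargs):
    """Call func and return (result, per-counter deltas in seconds)."""
    before = snapshot()
    result = func(*args, **kwargs)
    after = snapshot()
    return result, {key: after[key] - before[key] for key in before}
```

Comparing the wall delta against utime plus stime gives a rough idea of how much of a run was spent waiting rather than computing.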

@schlamar I'm not yet sure, I don't know enough about efficient buffers in Python. Not sure that StringIO will work though, because I think the buffer you pass to the recv_into call needs to be a memoryview. You could have a bytearray and a memoryview to it, and use that. This would lead to, as an initial step, a very simple userspace-buffered socket.

More ideally we'd like to be able to use a fixed-size buffer as a ring buffer. I've never seen this done in Python, let alone in pure Python, and I don't know that it's possible. I'll need to think about how I'd pull it off.
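The bytearray-plus-memoryview idea could be sketched roughly as follows. This is an illustration only, not hyper's actual implementation: the class name BufferedSocket, the method recv_exact, and the default buffer size are all assumptions made for the example. It is a single linear buffer that gets compacted when it fills up, not a ring buffer.

```python
import socket


class BufferedSocket:
    """Hypothetical userspace-buffered socket wrapper (illustrative only)."""

    def __init__(self, sock, buffer_size=65536):
        self._sock = sock
        self._backing = bytearray(buffer_size)   # storage is allocated once
        self._view = memoryview(self._backing)   # slicing a view does not copy
        self._start = 0  # index of the first unread byte
        self._end = 0    # index one past the last buffered byte

    def _fill(self):
        """Read more data from the socket into the free tail of the buffer."""
        if self._start == self._end:
            # Everything was consumed, so reuse the buffer from the beginning.
            self._start = self._end = 0
        elif self._end == len(self._backing):
            # The tail is full: move the unread bytes to the front.
            unread = self._end - self._start
            self._backing[:unread] = self._backing[self._start:self._end]
            self._start, self._end = 0, unread
        # recv_into writes straight into our buffer and returns the byte count.
        count = self._sock.recv_into(self._view[self._end:])
        if count == 0:
            raise ConnectionError("socket closed before enough data arrived")
        self._end += count

    def recv_exact(self, amt):
        """Return exactly amt bytes, reading from the socket only when needed."""
        if amt > len(self._backing):
            raise ValueError("requested more bytes than the buffer can hold")
        while self._end - self._start < amt:
            self._fill()
        data = bytes(self._view[self._start:self._start + amt])
        self._start += amt
        return data
```

With something like this, reading a frame header and then its body would often cost one recv_into call instead of two recv calls, since the first read pulls in whatever the kernel already has, up to the free space in the buffer.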

sigmavirus24 commented 10 years ago

@Lukasa do you really want a ring buffer? If I remember correctly, won't it write over old data in the buffer if it isn't read quickly enough?

Lukasa commented 10 years ago

The general plan is no. =) socket.recv_into only reads in as much data as there is space in the buffer. It should in principle be possible to have that number be equivalent to the amount of space left in the actual ring buffer.

Of course, that's almost certainly impossible to do without C extensions, which I can't do. So it'll be easier to do as a single linear buffer.

sigmavirus24 commented 10 years ago

For what it's worth, I can think of a way to enforce that with a BytesIO buffer (thanks requests-toolbelt) if you're supposed to be reading bytes. (The same logic would work for a StringIO buffer too)

Lukasa commented 10 years ago

Sadly, I don't think you can use BytesIO with a memoryview, which means I can't use socket.recv_into, which means I can't avoid the overhead of an extra memory copy.

dimaqq commented 10 years ago

Guys, I think this socket reading conversation is moot. HTTP/2 is (supposedly) meant to be used over TLS...

Lukasa commented 10 years ago

TLS still uses sockets. =)

Lukasa commented 10 years ago

It's also totally allowed to use HTTP/2 in plaintext. =)

Lukasa commented 10 years ago

The buffered socket idea has been implemented: see this diff.

I've also added some more optimisation ideas, contained in issues #60 and #61.

sigmavirus24 commented 10 years ago

Do we have any benchmarks showing the performance benefits? (Call me a pain if you will ;P)

Lukasa commented 10 years ago

Not yet, but I plan to re-run my big HTTP/1.1 - HTTP/2 comparison at some point to see how the numbers change.

Lukasa commented 10 years ago

In the meantime, pypy's test run speed got a big boost when I fixed the tests up, so anecdotally it 'feels' faster.

sigmavirus24 commented 10 years ago

I didn't doubt it was faster. I was just trying to pre-empt some people from finding this and complaining about lack of benchmarks.

Lukasa commented 10 years ago

Agreed, I'll try to get some numbers. =)

Lukasa commented 10 years ago

Support for nghttp2's HPACK implementation is now present.