python / asyncio

asyncio historical repository
https://docs.python.org/3/library/asyncio.html
1.04k stars 177 forks source link

stdin.drain is very slow on some platforms #384

Closed arthurdarcet closed 8 years ago

arthurdarcet commented 8 years ago

I'm on Python 3.5.2, with the following code:

#!/usr/bin/env python3

import asyncio
import time
import logging

logging.basicConfig(level='DEBUG')

async def drain_time(length):
    proc = await asyncio.create_subprocess_exec('/bin/cat',
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )

    data = b'0' * length
    proc.stdin.write(data)
    t0 = time.time()
    await proc.stdin.drain()
    r = time.time() - t0
    proc.stdin.close()
    await proc.wait()
    return r

async def main():
    for i in range(5, 17):
        for j in range(1, 2):
            size = j * 10**i
            print('{:.0e}'.format(size), await drain_time(size))

asyncio.get_event_loop().run_until_complete(main())

on an Ubuntu server (this seems ok)

DEBUG:asyncio:Using selector: EpollSelector
1e+05 7.3909759521484375e-06
2e+05 6.4373016357421875e-06
3e+05 0.000118255615234375
1e+06 0.0002932548522949219
2e+06 0.0005526542663574219
3e+06 0.0019674301147460938
1e+07 0.023283720016479492
2e+07 0.004916667938232422
3e+07 0.16347408294677734
1e+08 0.024138450622558594
2e+08 0.04891180992126465
3e+08 0.07430100440979004

on an Archlinux: (more or less ok)

DEBUG:asyncio:Using selector: EpollSelector
1e+05 5.245208740234375e-06
2e+05 4.291534423828125e-06
3e+05 8.797645568847656e-05
1e+06 0.0005102157592773438
2e+06 0.0015668869018554688
3e+06 0.0022423267364501953
1e+07 0.01090097427368164
2e+07 0.1207735538482666
3e+07 0.2826368808746338
1e+08 0.8428807258605957
2e+08 0.029124975204467773
3e+08 3.3120124340057373
1e+09 2.904399871826172
2e+09 13.772144079208374
3e+09 16.17424488067627

on osx: (not ok)

DEBUG:asyncio:Using selector: KqueueSelector
1e+05 0.0005750656127929688
2e+05 0.0010769367218017578
3e+05 0.005205869674682617
1e+06 0.007627964019775391
2e+06 0.022066831588745117
3e+06 0.05894899368286133
1e+07 1.633073091506958
2e+07 6.287441968917847
3e+07 15.285433053970337
1e+08 178.50560998916626

Any idea what could be causing this?

Martiusweb commented 8 years ago

Can you use cProfile around the run_until_complete() so we can get more details?

arthurdarcet commented 8 years ago

does that work ? http://arthur.darcet.fr/a.cprof

methane commented 8 years ago

https://github.com/python/asyncio/blob/master/asyncio/unix_events.py#L523

        data = b''.join(self._buffer)
        assert data, 'Data should not be empty'

        self._buffer.clear()
...
            elif n > 0:
                data = data[n:]

            self._buffer.append(data)  # Try again later.

buffer implementation of UnixWritePipeTransport is not good at large data. macOS has very little pipe buffer compared to linux. It cause more performance penalty.

methane commented 8 years ago

Using bytearray as FIFO buffer will fix it. But #339 will fix it too.

arthurdarcet commented 8 years ago

thanks, the PR works: this is what i get with #339 on OSX:

1e+05 0.0004410743713378906
2e+05 0.0007159709930419922
3e+05 0.0011639595031738281
1e+06 0.003136873245239258
2e+06 0.005909919738769531
3e+06 0.009123086929321289
1e+07 0.03362894058227539
2e+07 0.06357192993164062
3e+07 0.09158802032470703
1e+08 0.27935194969177246
2e+08 0.6114847660064697
3e+08 0.8901691436767578
1e+09 2.8768999576568604