Closed Tinche closed 2 years ago
@vstinner Do you have any ideas?
I don't know how to do that.
cc @methane
@Tinche Do you really mean `timeit`, and not `bench_func()`?

`bench_func()` receives a function, so it may be possible to detect whether the function is a coroutine or not. On the other hand, `timeit` receives an expression, not a function, so it is difficult to detect that a coroutine is passed in.
Would you give us some examples?
I use `timeit` all the time in the terminal and I've never used `bench_func()`, so probably `timeit`. Maybe it could be a flag or a different command?
Here's an example. I have a project, https://github.com/Tinche/incant/, that does function composition (mostly for dependency injection), and I want to measure how efficient it is. It supports functions and coroutines. Functions I can benchmark easily; coroutines I need to benchmark using `asyncio.run`, and that has a ton of noise since it does a lot of unrelated work.

Note that these coroutines I'm benchmarking are technically async, but they either do not await anything or they await `sleep(0)`.
I usually prepare the function being tested in a file and then do something like:
```
pyperf timeit -g -s "from asyncio import run; from test import main" "run(main())"
```
so since I need to have a separate file anyway, `bench_func()` could work too. The CLI interface is sooo nice though ;)
That said, maybe there's a way to run a coroutine without involving an event loop? Just iterate over it until it's done or something like that? I'm not proficient in that part of Python.
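For what it's worth, a coroutine that never actually suspends *can* be driven by hand without an event loop, by repeatedly calling `send(None)` until it raises `StopIteration`. A minimal sketch (this breaks as soon as the coroutine awaits real I/O, since there is no loop to resume it; `run_coroutine_without_loop` is just an illustrative name):

```python
def run_coroutine_without_loop(coro):
    """Drive a coroutine to completion by hand, with no event loop.

    Only works for coroutines that never wait on anything external
    (e.g. they await nothing, or only things that yield control once).
    """
    try:
        while True:
            coro.send(None)  # advance the coroutine one step
    except StopIteration as exc:
        return exc.value  # the coroutine's return value

async def main():
    return 42

print(run_coroutine_without_loop(main()))  # → 42
```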
Would you try this?
```
pyperf timeit -g -s "import asyncio; loop=asyncio.get_event_loop(); from test import main" \
    "loop.run_until_complete(main())"
```

or

```
pyperf timeit -g -s "import asyncio, test" \
    "asyncio.get_event_loop().run_until_complete(test.main())"
```
With this, one loop is used repeatedly instead of creating and destroying a loop for each `main()` execution. Does this reduce your "noise"?
It does work and helps a little. If it's too hard to do otherwise in pyperf I will accept this as the answer ;)
What does "little" mean? Does it reduce your noise only a little? If so, this feature request will have only a little benefit.

If you just meant "I don't want to write this timeit", I'm sorry, but it is very difficult. Again, `timeit` receives statements, not a function, so `timeit` cannot distinguish async code automatically.
I will consider adding `bench_async_func()`, or making `bench_func()` support async functions. And I will consider adding an `--async` option to `timeit` later.
Well, it reduces the running time by a lot, so it reduces noise by a lot.
I have a generated coroutine that I'm benchmarking. This coroutine awaits several other coroutines inside.
- `asyncio.run`: Mean +- std dev: 529 us +- 42 us
- `loop.run_until_complete`: Mean +- std dev: 185 us +- 12 us

So the difference was noise introduced by `asyncio.run`. Hence, a big improvement. Dunno how much more it can be improved by logic inside pyperf.
Offtopic: heh, for comparison's sake, if I change the test so they are all ordinary functions, not async def functions, it takes 1 microsecond. I wasn't aware asyncio/the event loop adds so much overhead.
> So the difference was noise introduced by `asyncio.run`
Each call to `asyncio.run()` creates a fresh event loop, and then closes it. Moreover, it also shuts down asynchronous generators and the default asyncio executor (thread pool).
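To make that per-call overhead concrete, here is a rough sketch of what `asyncio.run()` does on every invocation, simplified from CPython 3.9+ (the real implementation also handles cancellation, debug mode, and nested-loop checks):

```python
import asyncio

def run_once(coro):
    """Roughly what asyncio.run() does on *every* call (simplified)."""
    loop = asyncio.new_event_loop()          # create a fresh event loop
    try:
        asyncio.set_event_loop(loop)
        return loop.run_until_complete(coro)
    finally:
        # Extra teardown work that reusing a single loop avoids:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.run_until_complete(loop.shutdown_default_executor())
        asyncio.set_event_loop(None)
        loop.close()                          # destroy the loop

async def main():
    return "done"

print(run_once(main()))  # prints "done"
```

All of the loop setup and teardown happens inside the timed region when you benchmark `asyncio.run(main())`, which is where the noise comes from.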
> I usually prepare the function being tested in a file and then do something like:
Since you already write a script for your test, I don't think the timeit command is so important for you. If #124 is merged, you can just add a few lines to your test code:
```python
if __name__ == '__main__':
    import pyperf
    pyperf.Runner().bench_async_func('main', main)
```
@methane Thanks a lot! Trying from your branch, now the time is: `main: Mean +- std dev: 123 us +- 7 us`. Looks like we got rid of all the overhead.
Fixed by https://github.com/psf/pyperf/pull/124 thanks to @methane.
I closed the issue because it seems like the idea of adding an `--async` option to `pyperf timeit` was abandoned. But I'm open to this idea if someone wants to write a PR for that!
I was curious and compared `doc/examples/bench_async_func.py` between Python 3.6 and 3.10, since the pyperf implementation is different (Python 3.6 doesn't have `asyncio.run()`):
```
$ python3 -m pyperf compare_to py36.json py310.json
Mean +- std dev: [py36] 1.33 ms +- 0.02 ms -> [py310] 1.32 ms +- 0.02 ms: 1.01x faster
```
Using an asyncio sleep of 1 ms, there is no significant difference: for me, it confirms that the pyperf implementation is correct ;-) The accuracy is good. We don't measure the time spent to create and close the event loop.
(A) Benchmark of `asyncio.run()` with `bench_func()` on a coroutine `func()` which does nothing:

```python
import asyncio
import pyperf

async def func():
    pass

def bench():
    asyncio.run(func())

runner = pyperf.Runner()
runner.bench_func('bench', bench)
```
(B) Benchmark of `loop.run_until_complete()` with `bench_func()` on a coroutine `func()` which does nothing:

```python
import asyncio
import pyperf

async def func():
    pass

def bench(loop):
    loop.run_until_complete(func())

runner = pyperf.Runner()
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
runner.bench_func('bench', bench, loop)
```
(C) Benchmark of pyperf 2.3.1's new `bench_async_func()` method on a coroutine `func()` which does nothing:

```python
import asyncio
import pyperf

async def func():
    pass

runner = pyperf.Runner()
runner.bench_async_func('bench', func)
```
Results on Python 3.10:
```
asyncio_run_py310
=================
bench: Mean +- std dev: 139 us +- 5 us

run_until_complete-py310
========================
bench: Mean +- std dev: 16.7 us +- 0.4 us

bench_async_func-py310
======================
bench: Mean +- std dev: 128 ns +- 2 ns
```
+-----------+-------------------+--------------------------+-------------------------+
| Benchmark | asyncio_run_py310 | run_until_complete-py310 | bench_async_func-py310 |
+===========+===================+==========================+=========================+
| bench | 139 us | 16.7 us: 8.33x faster | 128 ns: 1087.31x faster |
+-----------+-------------------+--------------------------+-------------------------+
The std dev is way better using `bench_async_func()`!

- asyncio_run: +- 5 us (5000 ns)
- run_until_complete: +- 0.4 us (400 ns)
- bench_async_func: +- 2 ns (2 ns)

> I think essentially pyperf could detect a coroutine was passed in, spawn an event loop and just await it in a loop.
I don't think that detecting whether the argument looks like a coroutine is a good idea. It requires importing `asyncio`, which is a "heavy" module (high startup time). I strongly prefer having a separate API (method) for that.
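For illustration only, here is one shape such auto-detection could take: classify the function with `inspect.iscoroutinefunction()` (which does not need asyncio) and defer the asyncio import until an async benchmark is actually run. `make_bench_callable` is a hypothetical helper, not part of pyperf, and running the coroutine still requires asyncio eventually:

```python
import inspect

def make_bench_callable(func, *args):
    """Hypothetical sketch: return a plain callable that a synchronous
    benchmark runner can time, whether func is sync or async."""
    if inspect.iscoroutinefunction(func):
        import asyncio  # deferred: only paid when benchmarking async code
        loop = asyncio.new_event_loop()
        # Reuse one loop across calls, as recommended earlier in the thread.
        return lambda: loop.run_until_complete(func(*args))
    return lambda: func(*args)

async def async_task():
    return 1

def sync_task():
    return 2

print(make_bench_callable(async_task)())  # → 1
print(make_bench_callable(sync_task)())   # → 2
```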
This function is now part of the just released pyperf 2.3.1.
Hello!

I think pyperf is an amazing project and I use the `timeit` command to benchmark essentially all the libraries I work on (attrs, cattrs, incant...).

I wish I could use it to benchmark async functions though. Right now, I benchmark `asyncio.run(my_coro)`, but since `asyncio.run` is so costly there's a ton of noise in the signal.

I think essentially pyperf could detect a coroutine was passed in, spawn an event loop and just await it in a loop.