Open sirex opened 4 years ago
@sirex Sorry for responding so late, but this issue is now potentially fixable. Can you reproduce this issue with line profiler 3.2.1, and what OS are you on? I suspect this may be an issue with the C-extension part of line profiler on Windows, if that's what you're using, because I'm able to profile any non-Cython-compiled async function on Python 3.8.8 and Python 3.9.4 (and possibly lower versions, I have yet to try) on Ubuntu 20.04.
I have tested it and now it seems to be working:
import asyncio

@profile
async def test():
    for i in range(2):
        await asyncio.sleep(1)

asyncio.run(test())
$ lsb_release -drc
Description: Manjaro Linux
Release: 21.0.2
Codename: Ornara
$ kernprof --version
3.2.1
$ kernprof -lv test.py
Wrote profile results to test.py.lprof
Timer unit: 1e-06 s
Total time: 8.2e-05 s
File: test.py
Function: test at line 3
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           @profile
     4                                           async def test():
     5         3          9.0      3.0     11.0      for i in range(2):
     6         2         73.0     36.5     89.0          await asyncio.sleep(1)
It's again not working with 3.4.0, but works with 3.3.0.
I used the same code example from @sirex
% kernprof -V
3.4.0
% kernprof -l -v test.py
Wrote profile results to test.py.lprof
Timer unit: 1e-06 s
Total time: 0 s
File: test.py
Function: test at line 3
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           @profile
     4                                           async def test():
     5                                               for i in range(2):
     6                                                   await asyncio.sleep(1)
VS
% kernprof -V
3.3.0
% kernprof -v -l test.py
Wrote profile results to test.py.lprof
Timer unit: 1e-06 s
Total time: 7.7e-05 s
File: test.py
Function: test at line 3
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           @profile
     4                                           async def test():
     5         3          9.0      3.0     11.7      for i in range(2):
     6         2         68.0     34.0     88.3          await asyncio.sleep(1)
Doesn't seem to be Python version related (tested on 3.8.0, 3.9.0, 3.10.0). macOS 12.2.1
I suspect the issue came with https://github.com/pyutils/line_profiler/commit/4e2744e64afbaf4d882dabd7e9741790b0bc0f68#diff-2c5d18d29e15ee10fd416d3654749cb19a5e2516a5f391551d229cf80d7bac13R29. Specifically, the lines should probably be
import inspect

def is_coroutine(f):
    return inspect.iscoroutinefunction(f)
rather than
def is_coroutine(f):
    return False
I haven't tested it yet though, so that may not be the problem.
Edit: It seems that all coroutine handling was removed in this commit??? Not sure why @Erotemic removed it, I assume because the variable names referenced 3.5, even though the code was actually needed for coroutines in Python >= 3.5
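One plausible reading of why a hard-coded `is_coroutine` that returns False produces an empty report: calling an `async def` function only creates a coroutine object, so a plain synchronous wrapper would switch profiling on and off before the body ever runs. The sketch below illustrates that ordering using only the standard library. `sync_wrapper` and the `events` list are hypothetical stand-ins, not line_profiler code.

```python
import asyncio
import functools
import inspect

events = []  # hypothetical log standing in for the profiler's enable/disable calls


async def work():
    events.append("body")
    await asyncio.sleep(0)
    return 42


def sync_wrapper(func):
    """Hypothetical synchronous wrapper, as would be applied to a regular function."""
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        events.append("enable")
        try:
            # For an async def function this only builds the coroutine object;
            # the body has not started running yet.
            return func(*args, **kwds)
        finally:
            events.append("disable")
    return wrapper


def run_demo():
    events.clear()
    result = asyncio.run(sync_wrapper(work)())
    return result, list(events)


if __name__ == "__main__":
    print(inspect.iscoroutinefunction(work))  # True for async def functions
    print(run_demo())
```

In this sketch the body runs only after "disable" has been logged, which would match the empty reports above if the real wrapper behaves the same way.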
With Python 3.7 on Windows 10, the async timing also has no effect. Environment: Windows 10, Python 3.7.7:
code:
import asyncio

@profile
async def test():
    for i in range(2):
        await asyncio.sleep(1)

asyncio.run(test())
result:
Timer unit: 1e-06 s
Total time: 0 s
File: service.py
Function: test at line 4
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           @profile
     5                                           async def test():
     6                                               for i in range(2):
     7                                                   await asyncio.sleep(1)
This change solves the problem. Modify line_profiler.py:
from
def is_coroutine(f):
    return False
change to
import inspect

def is_coroutine(f):
    return inspect.iscoroutinefunction(f)
add this:
# requires `import functools` at module level
def wrap_coroutine(self, func):
    """
    Wrap a Python 3.5 coroutine to profile it.
    """
    @functools.wraps(func)
    async def wrapper(*args, **kwds):
        self.enable_by_count()
        try:
            result = await func(*args, **kwds)
        finally:
            self.disable_by_count()
        return result
    return wrapper
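To see the effect of the wrapper in isolation, here is a minimal sketch that reuses the same `wrap_coroutine` shape on a stand-in class. `CountingProfiler`, `job`, and `demo` are hypothetical names for illustration; only `enable_by_count`, `disable_by_count`, and `wrap_coroutine` come from the patch above, and the stand-in does no real profiling.

```python
import asyncio
import functools


class CountingProfiler:
    """Hypothetical stand-in for LineProfiler that only logs enable/disable."""

    def __init__(self):
        self.depth = 0
        self.log = []

    def enable_by_count(self):
        self.depth += 1
        self.log.append(("enable", self.depth))

    def disable_by_count(self):
        self.log.append(("disable", self.depth))
        self.depth -= 1

    def wrap_coroutine(self, func):
        @functools.wraps(func)
        async def wrapper(*args, **kwds):
            self.enable_by_count()
            try:
                # Awaiting inside the try keeps profiling enabled while the body runs.
                return await func(*args, **kwds)
            finally:
                self.disable_by_count()
        return wrapper


def demo():
    profiler = CountingProfiler()

    async def job():
        profiler.log.append(("body", profiler.depth))
        await asyncio.sleep(0)
        return "done"

    result = asyncio.run(profiler.wrap_coroutine(job)())
    return result, profiler.log, profiler.depth


if __name__ == "__main__":
    print(demo())
```

Here the body runs between the enable and disable calls, and the counter returns to zero even if the coroutine raises.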
Not sure how I missed this thread. @CaptainDP thanks for the PR. I'll get that merged.
I've released 3.5.1, which contains the patch from @CaptainDP. Can someone verify that this issue is resolved?
As far as I can tell, the patch in 3.5.1 seems to work well. The following code yields some plausible output:
import asyncio
from line_profiler import LineProfiler

profile = LineProfiler()

@profile
async def test():
    for _ in range(100):
        await asyncio.sleep(0.01)

asyncio.run(test())
profile.print_stats()
Timer unit: 1e-06 s
Total time: 0.006042 s
File: /Users/falko/Projects/rosys/./main.py
Function: test at line 44
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    44                                           @profile
    45                                           async def test():
    46       101        343.0      3.4      5.7      for _ in range(100):
    47       100       5699.0     57.0     94.3          await asyncio.sleep(0.01)
@falkoschindler Shouldn't Total time be equal to something around 1 second? I think something is still going wrong with the awaited functions.
@fonfonx, I've confirmed this is a problem. The following code runs async and sync variants of @falkoschindler's test:
class Timer:
    def __init__(self):
        import time
        self.counter = time.perf_counter

    def __enter__(self):
        self.start = self.counter()
        return self

    def __exit__(self, a, b, c):
        self.elapsed = self.counter() - self.start


def test_async_profile():
    import asyncio
    import time
    from line_profiler import LineProfiler

    n = 100
    m = 0.01

    async def async_function():
        for idx in range(n):
            await asyncio.sleep(m)

    with Timer() as t:
        asyncio.run(async_function())
    time1 = t.elapsed

    profile = LineProfiler()
    profiled_async_function = profile(async_function)
    with Timer() as t:
        asyncio.run(profiled_async_function())
    time2 = t.elapsed

    profile.print_stats()
    ideal_time = n * m
    max_time = max(time2, time1)
    min_time = min(time2, time1)
    ratio = max_time / min_time
    error = abs(max_time - ideal_time)
    assert ratio < 1.5, 'profiled function should run about as fast'
    assert error < (ideal_time * 0.5), 'should be somewhat close to the ideal time'

    lstats = profile.get_stats()
    unit = lstats.unit
    stats = lstats.timings

    profiled_items = sorted(stats.items())
    assert len(profiled_items) == 1
    for (fn, lineno, name), timings in profiled_items:
        total_time = 0.0
        for lineno, nhits, time_ in timings:
            total_time += time_

    print(f'async ideal_time={ideal_time}')
    print(f'async time1={time1}')
    print(f'async time2={time2}')
    print(f'async unit={unit}')
    print(f'async error={error}')
    print(f'async ratio={ratio}')
    print(f'async total_time={total_time}')

    # --- similar test with sync

    def sync_function():
        for idx in range(n):
            time.sleep(m)

    with Timer() as t:
        sync_function()
    time1 = t.elapsed

    profile = LineProfiler()
    profiled_sync_function = profile(sync_function)
    with Timer() as t:
        profiled_sync_function()
    time2 = t.elapsed

    profile.print_stats()
    ideal_time = n * m
    max_time = max(time2, time1)
    min_time = min(time2, time1)
    ratio = max_time / min_time
    error = abs(max_time - ideal_time)
    assert ratio < 1.5, 'profiled function should run about as fast'
    assert error < (ideal_time * 0.5), 'should be somewhat close to the ideal time'

    lstats = profile.get_stats()
    unit = lstats.unit
    stats = lstats.timings

    profiled_items = sorted(stats.items())
    assert len(profiled_items) == 1
    for (fn, lineno, name), timings in profiled_items:
        total_time = 0.0
        for lineno, nhits, time_ in timings:
            total_time += time_

    print(f'sync ideal_time={ideal_time}')
    print(f'sync time1={time1}')
    print(f'sync time2={time2}')
    print(f'sync unit={unit}')
    print(f'sync error={error}')
    print(f'sync ratio={ratio}')
    print(f'sync total_time={total_time}')
The output is correct for the sync variant, but incorrect for the async variant in terms of total time:
Total time: 0.002221 s
File: /home/joncrall/code/line_profiler/tests/test_async.py
Function: async_function at line 23
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    23                                           async def async_function():
    24       101        116.0      1.1      5.2      for idx in range(n):
    25       100       2105.0     21.1     94.8          await asyncio.sleep(m)
async ideal_time=1.0
async time1=1.0108539490029216
async time2=1.0161754119908437
async unit=1e-06
async error=0.016175411990843713
async ratio=1.0052643242806452
async total_time=2221.0
Timer unit: 1e-06 s
Total time: 1.00602 s
File: /home/joncrall/code/line_profiler/tests/test_async.py
Function: sync_function at line 66
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    66                                           def sync_function():
    67       101        175.0      1.7      0.0      for idx in range(n):
    68       100    1005843.0  10058.4    100.0          time.sleep(m)
sync ideal_time=1.0
sync time1=1.005714118015021
sync time2=1.006154617993161
sync unit=1e-06
sync error=0.006154617993161082
sync ratio=1.0004379972103896
sync total_time=1006018.0
I don't think line_profiler is measuring the await time correctly.
I think the profiler shouldn't include the time spent in an asyncio.sleep call, since control is handed over to asyncio, which might run other tasks in the meantime. The test() function is basically on hold and doesn't consume resources.
@falkoschindler, that was my initial thought, but when I modified the async function to explicitly time how long the await was taking:
async def async_function():
    for idx in range(n):
        with Timer() as t2:
            await asyncio.sleep(m)
        print(f'{t2.elapsed=}')
I got a bunch of this:
t2.elapsed=0.010150016983971
t2.elapsed=0.010166646970901638
t2.elapsed=0.010175412986427546
t2.elapsed=0.010167814965825528
Which indicates to me that each await is actually consuming about 0.01 s of time, and I think line_profiler should probably measure how long it needs to wait when an await is called.
... but then again maybe not. According to the docs await suspends execution of the coroutine. I don't often use the async functions in Python. For my use cases concurrent.futures is almost always better, so I'm not as familiar with them as I should be.
If there are any asyncio experts out there, discussion on what the correct handling of this case should be would be appreciated.
@Erotemic Sure, the clock keeps running and the elapsed time will be (at least) the requested sleep duration. But as you said: "await suspends execution of the coroutine". Therefore it seems reasonable to me that the execution time doesn't account for time spent in an async call.
I can buy that. For cases where other coroutines are taking resources it might not be fair to label that as execution time of the function that invoked the await. @fonfonx if you have a different opinion, I'd be interested to hear it.
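The distinction being discussed, elapsed time versus time the process is actually busy, can be made concrete with two standard-library clocks. This is only an illustration and not how line_profiler measures time: `time.perf_counter` tracks wall-clock time, while `time.process_time` counts CPU time of the current process. `sleeper` and `measure` are hypothetical helper names.

```python
import asyncio
import time


async def sleeper(delay):
    # The coroutine is suspended here; the event loop is idle or runs other tasks.
    await asyncio.sleep(delay)


def measure(delay=0.2):
    """Return (wall_seconds, cpu_seconds) for one awaited sleep."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    asyncio.run(sleeper(delay))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure()
    print(f"wall-clock: {wall:.4f} s, CPU: {cpu:.4f} s")
```

The wall-clock figure covers the whole sleep, while the CPU figure stays small because the process does almost no work while the coroutine is suspended.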
I am not an expert in asyncio, but I would have expected to measure the total time spent in the asyncio.sleep instruction even if await suspends execution of the coroutine, because when I use line_profiler to find out which instructions slow down my code I would expect to see that the sleep instruction indeed took time, while the current result does not show that.
For example when the code makes calls to async services that take time I think the line_profiler should display that and the fact that those services are called in an async or sync manner should not affect the output of the profiler.
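Until the profiler reports this, one workaround is to time the awaited call by hand. The sketch below uses a hypothetical `wall_clock` decorator (not part of line_profiler) that records elapsed wall-clock seconds per call, including time spent suspended in awaits. `call_service` stands in for a slow async service.

```python
import asyncio
import functools
import time


def wall_clock(func):
    """Hypothetical decorator: record elapsed wall-clock seconds of each call."""
    @functools.wraps(func)
    async def wrapper(*args, **kwds):
        start = time.perf_counter()
        try:
            return await func(*args, **kwds)
        finally:
            # Recorded even if the coroutine raises or is cancelled.
            wrapper.elapsed.append(time.perf_counter() - start)
    wrapper.elapsed = []
    return wrapper


@wall_clock
async def call_service():
    await asyncio.sleep(0.1)  # stands in for a slow async service call
    return "ok"


if __name__ == "__main__":
    print(asyncio.run(call_service()))
    print(call_service.elapsed)
```

Note that this measures the whole call, so any other tasks that run while `call_service` is suspended are included in its figure.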
For instance using the line_profiler on this little function will mislead me by telling me that the for loop takes most of the time of my code while it is the sleep instruction...
import asyncio
import time
from line_profiler import LineProfiler

profiler = LineProfiler()


@profiler
async def test():
    start_time = time.perf_counter()
    total = 0
    for i in range(1000):
        total += i
    await asyncio.sleep(5)
    final_time = time.perf_counter()
    print(f"Elasped time: {final_time-start_time}")
    return total

if __name__ == "__main__":
    asyncio.run(test())
    profiler.print_stats()
Elasped time: 5.002110341000001
Timer unit: 1e-06 s
Total time: 0.000902 s
File: test.py
Function: test at line 8
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     8                                           @profiler
     9                                           async def test():
    10         1          4.0      4.0      0.4      start_time = time.perf_counter()
    11         1          1.0      1.0      0.1      total = 0
    12      1001        363.0      0.4     40.2      for i in range(1000):
    13      1000        401.0      0.4     44.5          total += i
    14         1         31.0     31.0      3.4      await asyncio.sleep(5)
    15         1          2.0      2.0      0.2      final_time = time.perf_counter()
    16         1         99.0     99.0     11.0      print(f"Elasped time: {final_time-start_time}")
    17         1          1.0      1.0      0.1      return total
@fonfonx But while your thread/process/CPU is busy running the for loop and can't do anything else, it is completely free to execute other tasks when sleeping with asyncio.sleep.
We might think about a slightly different setup. When profiling some function which is awaiting your test(), I'd expect the line profiler to show the amount of time await test() is actually busy running computations like the for loop, but excluding the time spent in asyncio.sleep.
Another point of view: What do you expect when running 10 sleeps of 1 second in parallel? Does that count as 10 seconds execution time? I don't think so.
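The parallel-sleep question can be checked directly. A minimal sketch, assuming ten sleeps of 0.1 s each run concurrently via `asyncio.gather` (shortened from 1 s to keep the run quick). `run_parallel` is a hypothetical helper name.

```python
import asyncio
import time


async def run_parallel(count=10, delay=0.1):
    """Run `count` sleeps concurrently and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(asyncio.sleep(delay) for _ in range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(run_parallel())
    print(f"10 concurrent sleeps of 0.1 s took {elapsed:.3f} s")
```

The elapsed time is close to one sleep rather than the sum of ten, so adding up per-await waiting times would overstate the total in the concurrent case.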
@falkoschindler I agree with you on your point about the parallel execution. However, in my tiny example, when sleeping with asyncio.sleep the CPU cannot execute lines that are after this instruction, and the await blocks the execution until sleeping is complete. I discovered the line_profiler tool because I was looking for a tool to profile my code line by line, in a sequential manner. I understand that we should not add up the execution times of parallel tasks, but for sequential tasks I think it is the more natural behavior. Besides, the wording Total time is misleading with the current behavior.
When profiling some function which is awaiting my test(), I would actually expect the profiler to tell me that it took 5 seconds to run the function, in order to let me investigate why it took so long.
import asyncio
from test import test
from line_profiler import LineProfiler

profiler = LineProfiler()


@profiler
async def main():
    result = await test()
    print(result)
    await asyncio.sleep(5)
    print("Hello World")

if __name__ == "__main__":
    asyncio.run(main())
    profiler.print_stats()
I would expect both the test call and the asyncio.sleep to be profiled in a similar manner since they both took around 5 seconds in my code, but this is not the case.
Total time: 0.000417 s
File: test2.py
Function: main at line 9
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     9                                           @profiler
    10                                           async def main():
    11         1        334.0    334.0     80.1      result = await test()
    12         1          7.0      7.0      1.7      print(result)
    13         1         31.0     31.0      7.4      await asyncio.sleep(5)
    14         1         45.0     45.0     10.8      print("Hello World")
I will bring my five cents... Leaving the asyncio.sleep() behind. As per my understanding, for any asyncio function only CPU-effective time is counted. So e.g. await aiohttp.session.get(...) also only counts time when it really "works/blocks" (receives, parses data). IO-bound waiting is ignored.
The background of this is clear: asyncio.loop is free and can handle other tasks within the code, but on the other hand it makes it impossible to use line_profiler for finding bottlenecks in such cases. Time calculation for such steps will be misleading for most users...
IO-bound methods are a good point to think about: When a get request takes 3 seconds, but most of the time the task is idle - what should line profiler output? Maybe it depends. Personally I use a profiler to find CPU-intensive code rather than IO bottlenecks. In some projects I have multiple tasks running, each with a loop that does some work and then sleeps for 0.1 s. When profiling I need to know who is consuming too much processing resources so that other tasks are not called in time. Knowing that each of them regularly sleeps for 100 ms does not help. But again: Maybe it depends and could be an optional profiler setting, if both behaviors are equally easy to implement.
I would suggest to close this card, since the original problem of empty profiling reports for async functions is solved. The details about how things should be counted seem to be non-trivial, but go beyond @sirex's original issue.
The results of asynchronous code profiling are absolutely incorrect. For example, I want to measure the runtime of an asynchronous function, but I get results that are useless.
For example:
import asyncio
import time

@profile
async def foo():
    ...

async def main():
    ts = time.monotonic()
    await foo()
    te = time.monotonic() - ts
    print('foo() elapsed time:', te)

asyncio.run(main())
And I get:
profiler:
Total time: 0.0046907 s
time:
foo() elapsed time: 0.7343...
The profiler's results are meaningless garbage in this case, unfortunately. The line profiler doesn't work with asynchronous code. It's better to show nothing than to show nonsense.
Old issue but hit it today. Can't this just be a configuration flag? Don't think it's very useful to debate which is the "right" way to do it. In some cases, you care about the overall time it took to execute a function (e.g. when you want a function to execute very fast), so time spent in awaits should be counted towards the total. In others you want to know how much resources a function is using, in which case awaits can be ignored (or counted in a smart way).
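Both numbers can be collected side by side without a profiler flag. The following is a rough sketch, not line_profiler's implementation: `ActiveTimer` is a hypothetical awaitable that drives a coroutine step by step and adds up only the time spent between suspensions, while the caller measures wall-clock time around the await. It assumes a standard asyncio event loop; `work` and `main` are illustrative names.

```python
import asyncio
import time


class ActiveTimer:
    """Hypothetical awaitable: time a coroutine only while it is actually running."""

    def __init__(self, coro):
        self._coro = coro
        self.active = 0.0  # seconds spent executing, excluding suspended time

    def __await__(self):
        inner = self._coro.__await__()
        step, arg = inner.send, None
        while True:
            start = time.perf_counter()
            try:
                yielded = step(arg)  # run until the next suspension point
            except StopIteration as stop:
                return stop.value  # the coroutine finished
            finally:
                self.active += time.perf_counter() - start
            try:
                arg = yield yielded  # hand the pending future to the event loop
                step = inner.send
            except BaseException as exc:
                # Forward cancellation and other thrown exceptions to the coroutine.
                step, arg = inner.throw, exc


async def work():
    total = sum(range(1000))
    await asyncio.sleep(0.1)
    return total


async def main():
    timer = ActiveTimer(work())
    start = time.perf_counter()
    result = await timer
    wall = time.perf_counter() - start
    return result, wall, timer.active


if __name__ == "__main__":
    result, wall, active = asyncio.run(main())
    print(f"result={result} wall={wall:.4f} s active={active:.6f} s")
```

The wall figure answers "how long did the caller wait", and the active figure answers "how much of that was this coroutine's own work", which are the two readings debated in this thread.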
When I run line profile on an async function, it does not record anything.
Here is what I get:
Here I moved all the code from the prepare_data function to _temp, just to make the profiler work. I'm using line-profiler 3.0.2 on Python 3.8.2.