Hi, I'm attempting to benchmark a project, and I'm getting a really weird result. There are call paths in the flame graph that cannot really be part of the program. Here's a minimal, self-contained example of my problem:
import cProfile
import subprocess
def a():
s = 0
for i in range(1000):
s += i
return s
def b():
s = 0
for i in range(1000):
s += i ** 2
return s
def dispatch(func):
return func()
def a_caller():
return dispatch(a)
def b_caller():
return dispatch(b)
def to_profile():
return a_caller() + b_caller()
stats_file = 'test.prof'
flame_file = 'test.svg'
cProfile.run('to_profile()', stats_file)
subprocess.check_call(f'flameprof {stats_file} > {flame_file}', shell=True)
In this program, the function a_caller() calls a(), and the function b_caller() calls b(). There is no path for a_caller() to call b(), although the function dispatch() will appear on both stacks.
This is what the profile looks like:
The profile reports that a_caller() spends most of its time inside of b(). Removing dispatch() and calling a() or b() directly provides the correct result.
This is a problem for me because any numpy operation I do goes through a single numpy function, <built-in method numpy.core._multiarray_umath.implement_array_function>. For example, in this part of the trace, I'm calling np.repeat(). As far as I know, that shouldn't invoke cumsum(), hstack(), or delete().
Any ideas on how to improve this, or things I might be doing wrong? Is this just an intrinsic limitation of cProfile?
Hi, I'm attempting to benchmark a project, and I'm getting a really weird result. There are call paths in the flame graph that cannot really be part of the program. Here's a minimal, self-contained example of my problem:
In this program, the function a_caller() calls a(), and the function b_caller() calls b(). There is no path for a_caller() to call b(), although the function dispatch() will appear on both stacks.
This is what the profile looks like:
The profile reports that a_caller() spends most of its time inside of b(). Removing dispatch() and calling a() or b() directly provides the correct result.
This is a problem for me because any numpy operation I do goes through a single numpy function,
<built-in method numpy.core._multiarray_umath.implement_array_function>
. For example, in this part of the trace, I'm calling np.repeat(). As far as I know, that shouldn't invoke cumsum(), hstack(), or delete().Any ideas on how to improve this, or things I might be doing wrong? Is this just an intrinsic limitation of cProfile?