Open ianozsvald opened 5 years ago
On the README there's an example snippet of code; the current behaviour is a little different to what's documented in the README:
Line #    Mem usage    Increment   Line Contents
================================================
     1   49.043 MiB   49.043 MiB   @profile
     2                             def my_func():
     3   56.453 MiB    7.410 MiB       a = [1] * (10 ** 6)
     4  209.078 MiB  152.625 MiB       b = [2] * (2 * 10 ** 7)
     5   56.676 MiB    0.000 MiB       del b
     6   56.676 MiB    0.000 MiB       return a
In the above, lines 3 & 4 are the same, but line 5 with del does not track the decrease in RAM used (in the README this decrease is tracked).
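The del behaviour can be cross-checked with the standard library's tracemalloc module, which tracks Python-level allocations rather than the process RSS that memory_profiler samples. A minimal sketch (this is not memory_profiler's own mechanism, just a way to confirm the memory really is released at the allocator level):

```python
import tracemalloc

# Track Python-level allocations; unlike an OS-level RSS sample,
# the effect of `del` is always visible here.
tracemalloc.start()

a = [1] * (10 ** 6)
b = [2] * (2 * 10 ** 7)
before_del, _ = tracemalloc.get_traced_memory()

del b  # ~160 MB of list slots are returned to the allocator immediately
after_del, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()
print(before_del > after_del)  # True: the drop is visible to tracemalloc
```

If tracemalloc sees the drop but the profiled Increment column doesn't, the discrepancy lies in when (or how) the RSS samples are taken, not in Python failing to free the list.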
I'll tentatively propose that there's a problem tracking pure-Python code vs numpy code. I've written a pair of functions which create a 10,000,000 element list or array and then create a second (the first is ones, the second adds 1 to each element). We'd expect both the Python version and the numpy version to allocate the same RAM for the first and second operations.
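That expectation can be quantified: on a 64-bit CPython build each list slot holds an 8-byte pointer, and a numpy float64 element is also 8 bytes, so both versions should grow by roughly the same amount. A quick back-of-the-envelope check:

```python
# Expected growth for 10,000,000 elements on a 64-bit build:
n = 10_000_000
list_mib = n * 8 / 2**20   # 8-byte pointers in the list's backing array
np_mib = n * 8 / 2**20     # 8-byte float64 elements in a numpy array
print(round(list_mib, 1))  # 76.3 (MiB), matching the ~76 MiB increments below
```

So a correct Increment column should show about +76 MiB on both allocation lines in both functions.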
From myfn_np received back (10000000,)
From myfn_py received back 10000000
Filename: bug_memory_profiler.py
Line #    Mem usage    Increment   Line Contents
================================================
    12  148.418 MiB  148.418 MiB   @profile
    13                             def myfn_py():
    14  224.457 MiB   76.039 MiB       ones = [1] * NBR_ITEMS
    15  301.574 MiB    0.773 MiB       twos = [n+1 for n in ones]
    16  301.574 MiB    0.000 MiB       return twos
Filename: bug_memory_profiler.py
Line #    Mem usage    Increment   Line Contents
================================================
    18   70.938 MiB   70.938 MiB   @profile
    19                             def myfn_np():
    20                                 # both lines should generate the same memory increase
    21                                 # as twos is a new object, the same size as ones
    22  147.168 MiB   76.230 MiB       ones = np.ones(NBR_ITEMS)
    23  224.570 MiB   77.402 MiB       twos = ones+1
    24  224.570 MiB    0.000 MiB       return twos
What I'm seeing is that the numpy version has correct tracking (that's the second example) and the Python version has the incorrect Increment column. The above was run twice; both times the behaviours were the same.
Maybe in Python 3 the AST tracking code (this is a guess) needs to be updated due to changes in how Python runs?
When I ran the above with a smaller length (1,000,000 elements, 10x smaller) both the Python and numpy versions tracked similar +8MB allocations on their respective lines.
@ianozsvald This is the first time I've used this package, and I've also noticed some strange behaviour.
With version 0.55.0 I wouldn't get negative increments (as shown in the examples on the PyPI package page), while with 0.54.0 the behaviour would match... maybe it's a bug introduced in 0.55.0?
Maybe one of the fixes for #195 is somehow related to this?
Hi @ianozsvald ! Thanks a lot for reporting this.
My guess is that in the Python version, Python is silently executing the line multiple times as if it were a loop, which has always resulted in problematic reporting. Pull requests are highly appreciated :-)
Hey @fabianp . A colleague of mine @dxe4 is trying to diagnose this at a hackathon today, I'm going to post the original Python file here for him (and others if you want to try!).
Note that this runs on Python 3.7; it ought to work with Python 2.7 if you add:
from __future__ import division # 1/2 == 0.5, as in Py3
from __future__ import print_function # force use of print("hello")
$ more julia1_memoryprofiler.py
"""Julia set generator without optional PIL-based image drawing"""
import time
# area of complex space to investigate
x1, x2, y1, y2 = -1.8, 1.8, -1.8, 1.8
c_real, c_imag = -0.62772, -.42193
@profile
def calculate_z_serial_purepython(maxiter, zs, cs):
"""Calculate output list using Julia update rule"""
output = [0] * len(zs)
for i in range(len(zs)):
n = 0
z = zs[i]
c = cs[i]
while n < maxiter and abs(z) < 2:
z = z * z + c
n += 1
output[i] = n
return output
@profile
def calc_pure_python(draw_output, desired_width, max_iterations):
"""Create a list of complex co-ordinates (zs) and complex parameters (cs), build Julia set and display"""
x_step = (x2 - x1) / desired_width
y_step = (y1 - y2) / desired_width
x = []
y = []
ycoord = y2
while ycoord > y1:
y.append(ycoord)
ycoord += y_step
xcoord = x1
while xcoord < x2:
x.append(xcoord)
xcoord += x_step
# set width and height to the generated pixel counts, rather than the
# pre-rounding desired width and height
# build a list of co-ordinates and the initial condition for each cell.
# Note that our initial condition is a constant and could easily be removed,
# we use it to simulate a real-world scenario with several inputs to our function
zs = []
cs = []
for ycoord in y:
for xcoord in x:
zs.append(complex(xcoord, ycoord))
cs.append(complex(c_real, c_imag))
print("Length of x:", len(x))
print("Total elements:", len(zs))
start_time = time.time()
output = calculate_z_serial_purepython(max_iterations, zs, cs)
end_time = time.time()
secs = end_time - start_time
print(calculate_z_serial_purepython.__name__ + " took", secs, "seconds")
assert sum(output) == 33219980 # this sum is expected for 1000^2 grid with 300 iterations
# Calculate the Julia set using a pure Python solution with
# reasonable defaults for a laptop
# set draw_output to True to use PIL to draw an image
#calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)
calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)
code:

import numpy as np

# NBR_ITEMS is defined elsewhere in the script; its value isn't shown in this thread.
# @profile is injected by memory_profiler when run via: python -m memory_profiler prof.py

@profile
def myfn_py():
    ones = [1] * NBR_ITEMS
    return ones

@profile
def myfn_np():
    ones = np.ones(NBR_ITEMS)
    twos = ones+1
    return twos
run 1:
myfn_np()
myfn_py()
output 1:
Line #    Mem usage    Increment   Line Contents
================================================
     7   54.656 MiB   54.656 MiB   @profile
     8                             def myfn_py():
     9   54.684 MiB    0.027 MiB       ones = [1] * NBR_ITEMS
    10   54.684 MiB    0.000 MiB       return ones

Filename: prof.py

Line #    Mem usage    Increment   Line Contents
================================================
    12   47.012 MiB   47.012 MiB   @profile
    13                             def myfn_np():
    14   50.836 MiB    3.824 MiB       ones = np.ones(NBR_ITEMS)
    15   54.656 MiB    3.820 MiB       twos = ones+1
    16   54.656 MiB    0.000 MiB       return twos
run 2:
myfn_np()
time.sleep(5)
myfn_py()
output 2:
Line #    Mem usage    Increment   Line Contents
================================================
     7   54.645 MiB   54.645 MiB   @profile
     8                             def myfn_py():
     9   58.488 MiB    3.844 MiB       ones = [1] * NBR_ITEMS
    10   58.488 MiB    0.000 MiB       return ones

Filename: prof.py

Line #    Mem usage    Increment   Line Contents
================================================
    12   46.996 MiB   46.996 MiB   @profile
    13                             def myfn_np():
    14   50.820 MiB    3.824 MiB       ones = np.ones(NBR_ITEMS)
    15   54.641 MiB    3.820 MiB       twos = ones+1
    16   54.641 MiB    0.000 MiB       return twos
run 3 (reversed order):
myfn_py()
myfn_np()
output 3:
Line #    Mem usage    Increment   Line Contents
================================================
     7   46.977 MiB   46.977 MiB   @profile
     8                             def myfn_py():
     9   50.828 MiB    3.852 MiB       ones = [1] * NBR_ITEMS
    10   50.828 MiB    0.000 MiB       return ones

Filename: prof.py

Line #    Mem usage    Increment   Line Contents
================================================
    12   50.828 MiB   50.828 MiB   @profile
    13                             def myfn_np():
    14   50.828 MiB    0.000 MiB       ones = np.ones(NBR_ITEMS)
    15   54.648 MiB    3.820 MiB       twos = ones+1
    16   54.648 MiB    0.000 MiB       return twos
print(sys.getsizeof(ones) / (1024 * 1024))

in myfn_py gives 3.81475830078125, so running myfn_np before myfn_py causes ones in myfn_py to report 0.027 as opposed to 3.8, but a sleep in between prevents this. Vice versa, calling myfn_py before myfn_np causes ones in myfn_np to report 0.
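That 3.81 MiB figure is consistent with NBR_ITEMS being roughly 500,000: on 64-bit CPython a list built with [1] * n allocates exactly n 8-byte slots plus a small fixed header. A quick check, where the 500,000 is purely an inference from the reported size (the actual NBR_ITEMS value isn't shown in the thread):

```python
import sys

# hypothetical value: inferred from the 3.81 MiB reported above, not stated in the thread
n = 500_000
ones = [1] * n
size_mib = sys.getsizeof(ones) / (1024 * 1024)
# a few dozen bytes of header (version-dependent) + 8 bytes per slot
print(size_mib)  # ~3.8148 MiB
```

Note this is only the list's array of pointers; it excludes the referenced objects (here a single shared small-int), which is why getsizeof matches the profiler's increment so closely in this case.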
I may be wrong, but I think what happens here is that even if some objects are garbage collected, the OS may not have freed the memory yet, so measuring memory from an OS perspective may produce the results above.
@dxe4 you may want to force gc.collect() in your examples. Also, Python's memory management has interesting allocation / free / reuse strategies which make predicting memory usage difficult. It involves blocks / pools / arenas. I think the arenas are what get allocated, but the memory for an arena can only be freed if all the pools inside it are unused. https://rushter.com/blog/python-memory-managment/
@ianozsvald in your second comment on this thread you mentioned the example in the README not correctly showing the effects of the del statement. I noticed this as well, running on two separate machines (Mac OS X and Linux). I tracked the introduction of the issue to between releases 0.54 and 0.55, and after looking at the commits in that period, I suspect it's due to PR #202. This also matches what @brupelo observed. However, when I try to profile some of my code under 0.54, I get a different set of strange results.
It looks like #280 tried to address many of these factors. You may want to rebuild and try again to see if that solves your problem.
Thanks Brian for looking into this!
On Mon, Sep 7, 2020 at 9:37 PM Brian Zhang notifications@github.com wrote:
> It looks like #280 tried to address many of these factors. You may want to rebuild and try again to see if that solves your problem.
Just reporting that I tried the README code on macOS Big Sur 11.2 with pip-installed memory_profiler 0.58.0 and didn't see the intended decrement either:
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
     1   37.434 MiB   37.434 MiB          1   @profile
     2                                        def my_func():
     3   45.066 MiB    7.633 MiB          1       a = [1] * (10 ** 6)
     4  197.660 MiB  152.594 MiB          1       b = [2] * (2 * 10 ** 7)
     5  197.660 MiB    0.000 MiB          1       del b
     6  197.660 MiB    0.000 MiB          1       return a
Hi @fabianp, it has been ages, I hope you're well. I'm seeing a bug related to #234.
I'm updating my High Performance Python book for the 2nd edition. Compared to the first edition's report of memory_profiler's behaviour, the Increment is definitely looking odd. I'm using a fresh install of memory_profiler, psutil and Python 3.7 in Anaconda on Linux.
I have two functions; I've profiled them each separately and once together. They have similar (but not identical) outputs, suggesting that the comment by @soundgnome in #234 about nested calls may not be so clear cut.
I'll lay out the 3 scenarios in case you have thoughts. This is a Julia set demo from my Profiling chapter. An outer function prepares x & y arrays of complex coordinates, which consumes RAM; an inner function then iterates over these, calculating the Julia set in a CPU-bound scenario.
Scenario 1 - just profiling the inner function calculate_z_serial_purepython:
In the earlier edition with Python 2.7, line 12 cost +7MB and line 13 cost +32MB (due, I expect, to range not being a generator in Python 2.7 and so making a full list, which we avoid now with Python 3.x). Line 17 costs +0.2MB; previously it cost 0MB (but this might have been masked by the process' overall growth due to the range statement?).
Scenario 2 - profiling the outer function calc_pure_python:
In the earlier edition with Python 2.7, line 46 above had cost +79MB; here the Increment is 0MB and the few following lines cost +0.2+0.5+0.5 == +1.2MB. The Mem usage at line 46 suggests a +77MB change. The generation of output in line 54 here costs +10MB; before it cost +32MB.
Scenario 3 - combined profiling of both functions:
In the above we see +136MB in line 54 (in Scenario 2 above it grew by +10MB); in the Python 2.7 version it grew by +32MB. The Mem usage change at line 54 is +10MB.
In all 3 scenarios we see evidence that the Increment column is not matching the Mem usage column, so the Increment column is looking pretty unhelpful.
The Mem usage column still looks correct and will have obvious value in a memory-diagnosis exercise, so I can still use this in the book, but I'm wondering if we have an explanation for the odd Increment behaviour?
Has Python 3.x changed its default memory allocation behaviour for lists? I'm pretty sure there's no clever unboxing or other clever techniques (unlike in PyPy) in CPython.
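On the list-allocation question, one difference worth ruling out: CPython 3.x hasn't changed the basic model (a list is still an array of pointers to boxed objects, with no unboxing), but the two construction styles used in the test functions do size their backing arrays differently. A quick sketch, assuming 64-bit CPython:

```python
import sys

n = 1_000_000
exact = [1] * n                # list-multiply allocates exactly n slots up front
grown = [i for i in range(n)]  # a comprehension grows the list and overallocates
print(sys.getsizeof(exact) <= sys.getsizeof(grown))  # True: slack capacity in `grown`
```

Either way the difference is bounded by the overallocation factor (roughly 12%), so it cannot explain a ~76 MiB allocation showing up as a sub-1 MiB Increment.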
Cheers (and I hope you're well!), Ian.