DCMLab / wavescapes

Code and documentation for wavescapes
GNU General Public License v3.0

Memory leak when drawing larger wavescapes #18

Open johentsch opened 2 years ago

johentsch commented 2 years ago

When pieces consist of more than 1000 pitch class vectors, drawing wavescapes becomes prohibitively slow and memory-heavy, to the point that the process is killed by the OS. As discussed, the most probable reason is that a large number (>500k) of matplotlib.patches.Polygon objects are created individually and drawn one after the other. This can be avoided by using a single matplotlib.collections.PolyCollection. A minimal example adapted from matplotlib.org goes like this:

from matplotlib.transforms import Affine2D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import numpy as np
# define the Polygon
diamond = np.array([
    (-0.1, 0),
    ( 0  , 0.2),
    ( 0.1, 0),
    ( 0  ,-0.2)
])
# define all centers
offsets = np.array([
    (0.2, 0),
    (0.1, 0.1),
    (0.3, 0.3)
])
clrs = [to_rgba(c) for c in ('salmon', 'magenta', 'skyblue')]
fig, ax = plt.subplots(1, 1)
ax_trans = ax.transData
dpi_trans = Affine2D().scale(fig.dpi*1)
print("Matrix for transforming offsets to figure's coordinates:\n", ax_trans.get_matrix())
print("Matrix for transforming coordinates by DPI:\n", dpi_trans.get_matrix())
diamond_collection = PolyCollection([diamond], offsets=offsets, offset_transform=ax_trans)
diamond_collection.set_transform(dpi_trans)
ax.add_collection(diamond_collection)
diamond_collection.set_color(clrs)

Output:

Matrix for transforming offsets to figure's coordinates:
 [[334.8    0.    54.  ]
 [  0.   217.44  36.  ]
 [  0.     0.     1.  ]]
Matrix for transforming coordinates by DPI:
 [[72.  0.  0.]
 [ 0. 72.  0.]
 [ 0.  0.  1.]]

[Figure: the three diamonds plotted by the code above]

This scales easily to 500k diamonds. The figure produced by the following code takes 8-10 seconds to create (compared to 8-11 minutes for a wavescape of that size):

from matplotlib.transforms import Affine2D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import numpy as np
diamond = np.array([
    (-0.1, 0),
    ( 0  , 0.2),
    ( 0.1, 0),
    ( 0  ,-0.2)
])
n_patches = 500000
offsets = np.random.rand(n_patches, 2)
clrs = np.random.rand(n_patches,3)
fig, ax = plt.subplots(1, 1, figsize=(20, 20))
ax_trans = ax.transData
dpi_trans = Affine2D().scale(fig.dpi*1)
diamond_collection = PolyCollection([diamond], offsets=offsets, offset_transform=ax_trans)
diamond_collection.set_transform(dpi_trans) 
ax.add_collection(diamond_collection)
diamond_collection.set_color(clrs)

[Figure: the 500,000 randomly placed diamonds plotted by the code above]
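As a rough sanity check on the ">500k" figure, here is a small sketch of the arithmetic. It assumes that a wavescape over n pitch class vectors contains one diamond per contiguous segment of the piece, i.e. n(n+1)/2 diamonds in a triangular layout. The helper name `diamond_count` is hypothetical and not part of the wavescapes library.

```python
def diamond_count(n_vectors: int) -> int:
    """Number of diamonds in a triangular wavescape.

    Assumption: one diamond per contiguous segment of the piece,
    which gives n * (n + 1) / 2 diamonds for n pitch class vectors.
    """
    return n_vectors * (n_vectors + 1) // 2


for n in (12, 500, 1000):
    print(n, diamond_count(n))
# 12 78
# 500 125250
# 1000 500500
```

Under that assumption, a piece with 1000 pitch class vectors already needs 500,500 patches, which is consistent with the number of Polygon objects mentioned above.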

johentsch commented 2 years ago

Here's some tentative code for producing a wavescape using a PolyCollection:

from matplotlib.transforms import Affine2D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
import numpy as np

def make_wavescape(offsets, colors, scalar1=1, scalar2=1, figsize=(20,20)):
    diamond = np.array([
        (-0.5*scalar1, 0),
        ( 0         , scalar1),
        ( 0.5*scalar1, 0),
        ( 0         ,-scalar1)
    ])
    fig, ax = plt.subplots(1, 1, figsize=figsize)
    ax_trans = ax.transData
    dpi_trans = fig.dpi_scale_trans
    scale_trans = Affine2D().scale(scalar2)
    diamond_collection = PolyCollection([diamond], offsets=offsets, offset_transform=ax_trans)
    diamond_collection.set_transform(dpi_trans + scale_trans) 
    ax.add_collection(diamond_collection)
    diamond_collection.set_color(colors)

def get_offsets(n):
    width = 1 / n
    half_width = width / 2
    return np.array([(x*half_width,  y*width) for y in range(n) for x in range(y+1, 2*n-y, 2)])

n = 12
offsets = get_offsets(n)
clrs = np.random.rand(n * (n + 1) // 2, 3)  # one random RGB color per diamond
make_wavescape(offsets, clrs)
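For large n, the list comprehension in get_offsets builds several hundred thousand Python tuples. The following is a sketch of an equivalent NumPy-only version, offered as a possible drop-in rather than a tested replacement. The name `get_offsets_vectorized` is hypothetical; the values and their order are meant to match get_offsets above.

```python
import numpy as np


def get_offsets_vectorized(n):
    """Diamond centers for a triangular layout with n diamonds in the bottom row.

    Hypothetical vectorized counterpart of get_offsets(n): same values,
    same row-by-row order.
    """
    width = 1 / n
    half_width = width / 2
    # y is the row index, j the column index of every cell in the upper
    # triangle of an n x n grid, enumerated row by row.
    y, j = np.triu_indices(n)
    # The loop version visits x = y+1, y+3, ... in row y. With k = j - y
    # counting positions within the row, x = y + 1 + 2*k = 2*j - y + 1.
    x = 2 * j - y + 1
    return np.column_stack((x * half_width, y * width))


print(get_offsets_vectorized(2))
```

The same `np.triu_indices(n)` ordering can be reused wherever per-diamond data has to be lined up with these offsets.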

[Figure: plot of randomly colored diamonds for n = 12 produced by the code above]

The problem is that so far I couldn't figure out how to scale the diamonds correctly as it seems to depend on the correct interplay between

The arguments scalar1 and scalar2 allow for playing around with these.
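One way to sidestep the transform interplay altogether might be to give every diamond its full set of vertices in data coordinates, so that no separate offset or DPI transform is needed. The sketch below is an untested suggestion, not the wavescapes implementation. It assumes the diamond proportions from make_wavescape with scalar1 set to one cell width (1/n), and the function names `diamond_vertices` and `plot_in_data_coords` are hypothetical.

```python
import numpy as np


def diamond_vertices(n):
    """Vertices of all n*(n+1)/2 diamonds in data coordinates, shape (N, 4, 2)."""
    width = 1 / n
    # Row index y and column index j of each diamond, enumerated row by row;
    # the centers are the same as those returned by get_offsets(n).
    y, j = np.triu_indices(n)
    centers = np.column_stack(((2 * j - y + 1) * width / 2, y * width))
    # Assumed shape: half a cell wide to each side, a full cell up and down,
    # so that neighbouring rows tile without gaps.
    diamond = width * np.array([(-0.5, 0), (0, 1), (0.5, 0), (0, -1)])
    # Broadcast: every center is shifted by each of the four corner vectors.
    return centers[:, None, :] + diamond[None, :, :]


def plot_in_data_coords(verts, colors, figsize=(20, 20)):
    # matplotlib is imported here so the geometry helper above only needs NumPy
    import matplotlib.pyplot as plt
    from matplotlib.collections import PolyCollection

    fig, ax = plt.subplots(1, 1, figsize=figsize)
    # Without offsets, the collection is drawn with the axes' data transform.
    collection = PolyCollection(verts, facecolors=colors, edgecolors="none")
    ax.add_collection(collection)
    ax.autoscale_view()  # fit the axis limits to the polygons
    return fig, ax
```

A call such as `plot_in_data_coords(diamond_vertices(n), clrs)` would then replace `make_wavescape(offsets, clrs)`. The trade-off is memory: the vertex array holds four points per diamond instead of one shared polygon plus offsets.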

As a proof of concept, I used the above code to plot an existing wavescape. The colors are actually correct; they just look exaggerated because of the scaling issue:

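import gzip

import numpy as np
from wavescapes.color import circular_hue

# get_offsets() and make_wavescape() are the functions defined above
file = 'l105_masques-0c+indulge.npy.gz'
with gzip.GzipFile(file, 'r') as f:
    mag_phase_mx = np.load(f, allow_pickle=True)
# convert the slice at index 3 of the last axis into colors
rgba_utm = circular_hue(mag_phase_mx[:,:,3], True, as_html=False)
# flatten the upper triangle row by row so that colors line up with the offsets
clrs = np.array([c for i, row in enumerate(rgba_utm) for c in row[i:]])
n = mag_phase_mx.shape[0]
offsets = get_offsets(n)
make_wavescape(offsets, clrs)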
[Image: output] generated with PolyCollection (only plot area, no markup)
[Image: l105_masques-c4-0c+indulge] original/correct
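The row-by-row flattening of the color matrix in the code above can also be written with NumPy indexing, which may shave some time off the 500k-color case. This is only a sketch: it assumes the color matrix is a square NumPy array of shape (n, n, channels), and the helper name `flatten_upper_triangle` is hypothetical.

```python
import numpy as np


def flatten_upper_triangle(color_matrix):
    """Return the upper-triangle entries of an (n, n, channels) array, row by row.

    Intended to give the same result as
    np.array([c for i, row in enumerate(color_matrix) for c in row[i:]]).
    """
    rows, cols = np.triu_indices(color_matrix.shape[0])
    return color_matrix[rows, cols]


# Stand-in data for illustration only (not a real wavescape color matrix)
demo = np.random.rand(4, 4, 4)
print(flatten_upper_triangle(demo).shape)  # (10, 4)
```

Because `np.triu_indices` enumerates the upper triangle in row-major order, the resulting colors come out in the same order as the loop version.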

The cell took 53 seconds to execute. To reproduce, here's the link to download l105_masques-0c+indulge.npy.gz. If you cannot figure out the scaling issue either, @cedricviaccoz, maybe that would be a question for StackOverflow...