Implements SlimCUDSSource for improved memory consumption (refactored solution)

stefanoborini commented 8 years ago

Ditches the old implementation based on CUDSSource. Fixes #35

kitchoi commented 8 years ago

Try this:

from simphony_mayavi.tests.testing_utils import DummyEngine
from simphony_mayavi.sources.api import SlimCUDSSource
from mayavi import mlab
dummy_engine = DummyEngine()
source = SlimCUDSSource(cuds=dummy_engine.get_dataset("particles"),
                        point_scalars="MASS", point_vectors="")

print(source.point_scalars_name)  # gives "MASS", which is good
print(source.point_vectors_name)  # gives "", which is good

# Now send the source to mayavi
mlab.pipeline.glyph(source, mode="arrow", scale_mode="vector")
mlab.pipeline.glyph(source, mode="sphere", scale_mode="scalar")

print(source.point_scalars_name)  # gives "MASS", which is good
print(source.point_vectors_name)  # gives "VELOCITY", which is not good

I tried the same with CUDSSource, which does not have this problem. Nevertheless, the above should be added to the tests for CUDSSource as well.

kitchoi commented 8 years ago

Performed some memory profiling. The saving in memory is less than expected. I have commented out _fill_datatype_enums in self.start for this memory profiling exercise.

from mayavi import mlab
from memory_profiler import profile

from simphony.core.cuba import CUBA
from simphony.cuds.lattice import make_cubic_lattice
from simphony_mayavi.sources.api import SlimCUDSSource, CUDSSource

def create_lattice(nx, ny, nz):
    '''Create a Lattice with nx * ny * nz nodes with 3 scalars and 3 vectors'''
    lattice = make_cubic_lattice("lattice", 1., (nx, ny, nz))

    node_list = []

    for node in lattice.iter_nodes():
        # dummy value
        value = float(node.index[0])

        # 3 scalars
        for cuba in (CUBA.TEMPERATURE, CUBA.MASS, CUBA.DENSITY):
            node.data[cuba] = value

        # 3 vectors
        for cuba in (CUBA.VELOCITY, CUBA.FORCE, CUBA.ACCELERATION):
            node.data[cuba] = (value, 0., 0.)

        node_list.append(node)

    lattice.update_nodes(node_list)

    return lattice

# Class to be tested
SourceClass = SlimCUDSSource

# Log file for memory profiling
filename = "{}_memory.log".format(SourceClass.__name__.lower())
MEMORY_LOG = open(filename, "w")

# sample cuds
cuds = create_lattice(30, 40, 35)

@profile(stream=MEMORY_LOG)
def main():
    # Define Source, choose dataset
    src = SourceClass(cuds=cuds,
                      point_scalars="TEMPERATURE", point_vectors="")

    # use glyph to show the particles
    mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")

if __name__ == "__main__":
    main()

For SlimCUDSSource

Line #    Mem usage    Increment   Line Contents
================================================
    40    284.7 MiB      0.0 MiB   @profile(stream=MEMORY_LOG)
    41                             def main():
    42                                 # Define Source, choose dataset
    43    284.7 MiB      0.0 MiB       src = SourceClass(cuds=cuds,
    44    289.4 MiB      4.7 MiB                         point_scalars="TEMPERATURE", point_vectors="")
    45                             
    46                                 # use glyph to show the particles
    47    505.7 MiB    216.3 MiB       mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")

For CUDSSource

Line #    Mem usage    Increment   Line Contents
================================================
    40    284.7 MiB      0.0 MiB   @profile(stream=MEMORY_LOG)
    41                             def main():
    42                                 # Define Source, choose dataset
    43    284.7 MiB      0.0 MiB       src = SourceClass(cuds=cuds,
    44    291.9 MiB      7.2 MiB                         point_scalars="TEMPERATURE", point_vectors="")
    45                             
    46                                 # use glyph to show the particles
    47    685.0 MiB    393.1 MiB       mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")

kitchoi commented 8 years ago

Memory saving is close to 50% for the above test case (216.3 MiB versus 393.1 MiB for the glyph call); however, we expected a lot more, as the sample cuds has 3 scalar and 3 vector data arrays and only one of them is selected.
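
As a quick check of that figure, here is a minimal sketch that recomputes the saving from the glyph-call increments in the two profiles above. The variable names are only for illustration, and it assumes the increment of the mlab.pipeline.glyph line is the right quantity to compare.

```python
# Increments (MiB) reported by memory_profiler for the
# mlab.pipeline.glyph(...) line in the two profiles above.
slim_increment_mib = 216.3   # SlimCUDSSource
full_increment_mib = 393.1   # CUDSSource

# Fraction of memory saved by SlimCUDSSource relative to CUDSSource.
saving = 1.0 - slim_increment_mib / full_increment_mib
print("memory saving: {:.1%}".format(saving))  # memory saving: 45.0%
```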

kitchoi commented 8 years ago

Here are the new memory profiles.

SlimCUDSSource:

Line #    Mem usage    Increment   Line Contents
================================================
    40    284.8 MiB      0.0 MiB   @profile(stream=MEMORY_LOG)
    41                             def main():
    42                                 # Define Source, choose dataset
    43    284.8 MiB      0.0 MiB       src = SourceClass(cuds=cuds,
    44    286.2 MiB      1.4 MiB                         point_scalars="TEMPERATURE", point_vectors="")
    45                             
    46                                 # use glyph to show the particles
    47    505.5 MiB    219.3 MiB       mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")

CUDSSource:

Line #    Mem usage    Increment   Line Contents
================================================
    40    284.8 MiB      0.0 MiB   @profile(stream=MEMORY_LOG)
    41                             def main():
    42                                 # Define Source, choose dataset
    43    284.8 MiB      0.0 MiB       src = SourceClass(cuds=cuds,
    44    292.3 MiB      7.5 MiB                         point_scalars="TEMPERATURE", point_vectors="")
    45                             
    46                                 # use glyph to show the particles
    47    688.0 MiB    395.7 MiB       mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")

:(

kitchoi commented 8 years ago

Yay it actually works.

SlimCUDSSource:

Line #    Mem usage    Increment   Line Contents
================================================
    40    284.8 MiB      0.0 MiB   @profile(stream=MEMORY_LOG)
    41                             def main():
    42                                 # Define Source, choose dataset
    43    284.8 MiB      0.0 MiB       src = SourceClass(cuds=cuds,
    44    285.8 MiB      1.1 MiB                         point_scalars="", point_vectors="")
    45                             
    46                                 # use glyph to show the particles
    47    488.0 MiB    202.1 MiB       mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")
    48                             
    49    522.5 MiB     34.5 MiB       src.point_scalars_name = "TEMPERATURE"
    50    522.5 MiB      0.0 MiB       src.update()

CUDSSource:

Line #    Mem usage    Increment   Line Contents
================================================
    40    284.8 MiB      0.0 MiB   @profile(stream=MEMORY_LOG)
    41                             def main():
    42                                 # Define Source, choose dataset
    43    284.8 MiB      0.0 MiB       src = SourceClass(cuds=cuds,
    44    291.9 MiB      7.2 MiB                         point_scalars="", point_vectors="")
    45                             
    46                                 # use glyph to show the particles
    47    687.6 MiB    395.6 MiB       mlab.pipeline.glyph(src, scale_mode="scalar", mode="sphere")
    48                             
    49    687.6 MiB      0.0 MiB       src.point_scalars_name = "TEMPERATURE"
    50    693.5 MiB      5.9 MiB       src.update()

CUDSSource uses (395.6 - 202.1) = 193.5 MiB for the vtk data, versus SlimCUDSSource, which uses 34.5 MiB. So it is indeed roughly 1/6 of the memory usage, consistent with loading one array instead of all 6 arrays.
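
The same arithmetic as a small sketch, using only numbers copied from the two profiles above. It assumes that the 202.1 MiB spent in the glyph call with no array selected is pipeline overhead common to both sources, so the difference approximates the cost of the six arrays; the variable names are illustrative.

```python
# Increments (MiB) copied from the two profiles above.
cuds_glyph = 395.6       # CUDSSource: glyph call, all six arrays loaded
slim_glyph = 202.1       # SlimCUDSSource: glyph call, no array selected
slim_one_array = 34.5    # SlimCUDSSource: after selecting "TEMPERATURE"

# Assumption: the glyph overhead is the same for both sources, so the
# difference is roughly the memory taken by the six data arrays.
all_arrays = cuds_glyph - slim_glyph
ratio = slim_one_array / all_arrays

print("all arrays: {:.1f} MiB".format(all_arrays))
print("one array / all arrays: {:.3f} (1/6 = {:.3f})".format(ratio, 1.0 / 6))
```

The ratio comes out slightly above 1/6, which is close enough given that the estimate ignores any per-array overhead.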

kitchoi commented 8 years ago

Looks good to me! :+1:

stefanoborini commented 8 years ago

Merging then

simphony / simphony-mayavi

Implements SlimCUDSSource for improved memory consumption (refactored solution) #175