taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0
25.52k stars 2.29k forks source link

kernel_profiler consumes excessive memory #5864

Open Yihao-Shi opened 2 years ago

Yihao-Shi commented 2 years ago

The demo progressively consumes CPU memory in the process of simulation When '_kernelprofiler=True'.

import taichi as ti
import math
ti.init(arch=ti.cpu, default_fp=ti.f32, debug=True, kernel_profiler=True)

@ti.data_oriented
class DEM:
    def __init__(self, cmType, domain, periodic, algorithm, 
                 gravity, timeStep, bodyNum, wallNum, matNum, 
                 searchAlgorithm, max_particle_num, max_contact_num):
        self.CmType = cmType
        self.Domain = domain
        self.Periodic = periodic
        self.Algorithm = algorithm
        self.Gravity = gravity
        self.Dt = ti.field(float, shape=())
        self.Dt[None] = timeStep
        self.SearchAlgorithm = searchAlgorithm

        self.max_particle_num = int(max_particle_num)
        self.max_contact_num = int(max_contact_num)
        self.bodyNum = int(bodyNum)
        self.wallNum = int(wallNum)
        self.matNum = int(matNum)

        if self.Algorithm == 0:
            print("Integration Scheme: Euler")
        elif self.Algorithm == 1:
            print("Integration Scheme: Verlet")

        self.BodyInfo = ti.Struct.field({                                         # List of body parameters
            "ID": int,                                                            # Body ID
            "Mat": int,                                                           # Material name
            "shapeType": int,                                                     # /ShapeType of DEM_PARTICLE/ 0 for sphere; 1 for Cuboid; 2 for Tetrahedron 
            "GenerateType": int,                                                  # /Generate type/ 0 for create; 1 for generate; 2 for fill in box;

            # Generate collections of balls
            "pos0": ti.types.vector(3, float),                                    # Initial position 
            "len": ti.types.vector(3, float),                                     # Size of box
            "rlo": float,                                                         # the minimum Radius
            "rhi": float,                                                         # the maximum Radius
            "pnum": int,                                                          # the number of particle in the group

            "fex":ti.types.vector(3, float),                                      # The externel force
            "tex":ti.types.vector(3, float),                                      # The externel torque
            "v0": ti.types.vector(3, float),                                      # Initial velocity
            "orientation": ti.types.vector(3, float),                             # Initial orientation
            "w0": ti.types.vector(3, float),                                      # Initial angular velocity
            "fixedV": ti.types.vector(3, int),                                    # Fixed velocity
            "fixedW": ti.types.vector(3, int),                                    # Fixed angular velocity
            "DT": ti.types.vector(3, float)                                       # DT
        }, shape=(self.bodyNum,))

        self.WallInfo = ti.Struct.field({                                         # List of wall parameters
            "ID": int,                                                            # Body ID
            "Mat": int,                                                           # Material name
            "point1": ti.types.vector(3, float),                                  # The vertex of the wall
            "point2": ti.types.vector(3, float),                                  # The vertex of the wall
            "point3": ti.types.vector(3, float),                                  # The vertex of the wall
            "point4": ti.types.vector(3, float),                                  # The vertex of the wall
            "norm": ti.types.vector(3, float),                                    # The wall normal (actived side)
            "fex":ti.types.vector(3, float),                                      # The externel force
            "tex":ti.types.vector(3, float),                                      # The externel torque
            "v0": ti.types.vector(3, float),                                      # Initial velocity
            "w0": ti.types.vector(3, float),                                      # Initial angular velocity
            "fixedV": ti.types.vector(3, int),                                    # Fixed velocity
            "fixedW": ti.types.vector(3, int),                                    # Fixed angular velocity
            "DT": ti.types.vector(3, float)                                       # DT
            }, shape=(self.wallNum,))

        self.MatInfo = ti.Struct.field({                                          # List of material parameters
            "Modulus": float,                                                     # Shear Modulus
            "possion": float,                                                     # Possion ratio
            "Kn": float,                                                          # Hardening
            "Kt": float,                                                          # Cohesion coefficient
            "Mu": float,                                                          # Angle of internal friction
            "Rmu": float,                                                         # Angle of dilatation
            "cohesion": float,                                                    # Bond cohesion
            "ForceLocalDamping": float,                                           # Local Damping
            "TorqueLocalDamping": float,                                          # Local Damping
            "NormalViscousDamping": float,                                        # Viscous Damping
            "TangViscousDamping": float,                                          # Viscous Damping
            "ParticleRho": float                                                  # Particle density !NOT MACRO DENSITY
        }, shape=(self.matNum,))

# Solvers
def Solver(dem, TIME, saveTime, CFL, vtkPath, ascPath, adaptive):
    SolverEuler(dem, TIME, saveTime, CFL, vtkPath, ascPath, adaptive)

#  --------------------------------------- Euler Algorithm ----------------------------------------- #
def SolverEuler(dem, TIME, saveTime, CFL, vtkPath, ascPath, adaptive):
    t, step, printNum = 0., 0, 0
    while t <= TIME:
        t += dem.Dt[None]

# Test for Linear contact models // Rotation
if __name__ == "__main__":
    # DEM domain
    dem = DEM(cmType=0,                                                 # /Contact model type/ 0 for linear model; 1 for hertz model; 2 for linear rolling model; 3 for linear bond model
                  domain=ti.Vector([0.5, 0.5, 45]),                             # domain size
                  periodic=ti.Vector([0, 0, 0]),                            # periodic boundary condition
                  algorithm=0,                                              # /Integration scheme/ 0 for Euler; 1 for Verlet; 2 for sympletic
                  gravity=ti.Vector([0., 0., 0.]),                        # Gravity (body force)
                  timeStep=1.e-6,                                           # Time step
                  bodyNum=2,                                               # Taichi field of body parameters *Number of Body*
                  matNum=1,                                                 # Taichi field of material parameters *Number of Material*
                  wallNum=1,                                                # Taichi field of material parameters *Number of Wall*
                  searchAlgorithm=2,                                        # /Search Algorithm/ 0 for sorted based; 1 for hash cell; 
                  max_particle_num=2,                                       # the number of particles
                  max_contact_num=1,                                        # the number of contacts
                  )

    # Physical parameters of particles
    dem.MatInfo[0].Kn = 1e6                                                 # Contact normal stiffness
    dem.MatInfo[0].Kt = 1e6                                                 # Contact tangential stiffness
    dem.MatInfo[0].Mu = 0.5                                                 # Friction coefficient
    dem.MatInfo[0].localDamping = [0., 0.]                                  # /Local damping/ Translation & Rolling
    dem.MatInfo[0].visDamping = [0., 0.]                                    # /Viscous damping/ Normal & Tangential
    dem.MatInfo[0].ParticleRho = 2650                                       # Particle density

    # DEM body domain
    rad = 0.025
    pos = ti.Vector([0.25, 0.25, 21.75])
    dem.BodyInfo[0].ID = 0                                                  # Body ID
    dem.BodyInfo[0].Mat = 0                                                 # Material Name of Body
    dem.BodyInfo[0].shapeType = 0                                           # /ShapeType of DEM_PARTICLE/ 0 for sphere; 1 for SDEM; 
    dem.BodyInfo[0].GenerateType = 0                                        # /Generate type/ 0 for create; 1 for generate; 2 for fill in box; 3 for wall
    dem.BodyInfo[0].pos0 = pos                                              # Initial position or center of sphere of Body
    dem.BodyInfo[0].rhi = rad                                                 # Particle radius
    dem.BodyInfo[0].rlo = rad                                                 # Particle radius
    dem.BodyInfo[0].pnum = 1                                                # The number of particle in the group
    dem.BodyInfo[0].v0 = ti.Vector([0, 0, 1])                              # Initial velocity
    dem.BodyInfo[0].w0 = ti.Vector([0, 0, 0])                               # Initial angular velocity
    dem.BodyInfo[0].fixedV = ti.Vector([0, 0, 0])                           # Fixed velocity
    dem.BodyInfo[0].fixedW = ti.Vector([0, 0, 0])                           # Fixed angular velocity
    dem.BodyInfo[0].DT = ti.Vector([0, 1e6, 5])                             # DT

    pos = ti.Vector([0.25, 0.25, 22.25])
    dem.BodyInfo[1].ID = 1                                                  # Body ID
    dem.BodyInfo[1].Mat = 0                                                 # Material Name of Body
    dem.BodyInfo[1].shapeType = 0                                           # /ShapeType of DEM_PARTICLE/ 0 for sphere; 1 for SDEM; 
    dem.BodyInfo[1].GenerateType = 0                                        # /Generate type/ 0 for create; 1 for generate; 2 for fill in box; 3 for wall
    dem.BodyInfo[1].pos0 = pos                                              # Initial position or center of sphere of Body
    dem.BodyInfo[1].rhi = rad                                                 # Particle radius
    dem.BodyInfo[1].rlo = rad                                                 # Particle radius
    dem.BodyInfo[1].pnum = 1                                                # The number of particle in the group
    dem.BodyInfo[1].v0 = ti.Vector([0, 0, -1])                              # Initial velocity
    dem.BodyInfo[1].w0 = ti.Vector([0, 0, 0])                               # Initial angular velocity
    dem.BodyInfo[1].fixedV = ti.Vector([0, 0, 0])                           # Fixed velocity
    dem.BodyInfo[1].fixedW = ti.Vector([0, 0, 0])                           # Fixed angular velocity
    dem.BodyInfo[1].DT = ti.Vector([0, 1e6, 5])                             # DT

    # Create MPM domain
    '''dem.AddMaterial()
    dem.AddBodies()
    dem.AddWall()
    dem.AddContactList()
    dem.AddNeighborList(GridSize=0.05, max_particle_per_cell=10)'''

    # Solve
    TIME: float = 10                                                         # Total simulation time
    saveTime: float = 0.02                                                   # save per time step
    CFL = 0.5                                                               # Courant-Friedrichs-Lewy condition
    vtkPath = './vtkDataTest1'                                              # VTK output path
    ascPath = './vtkDataTest1/postProcessing'                               # Monitoring data path

    Solver(dem, TIME, saveTime, CFL, vtkPath, ascPath, adaptive=False)
    ti.profiler.print_kernel_profiler_info()
ailzhang commented 2 years ago

@Yihao-Shi On my computer the CPU memory used increased from around 14.6G-> 14.8G in a few minutes, does this match what you see on your end? If so I believe this is kinda expected in the current implementation as if you turn on kernel_profiler=True, it starts recording some metrics for each run and that's the main source of memory increase. In the future we should consider giving finer control over profiler start/end to users (like context manager). Btw you can clear those memory usage by calling ti.profiler.clear_kernel_profiler_info() for now.

Yihao-Shi commented 2 years ago

@ailzhang However the CPU memory on my computer increased from 4.2G to 6.0G, almost 2G consumption. My CPU is AMD5800X. So I think it might be wired. I have simplified my code in 22 lines:

import taichi as ti
ti.init(arch=ti.cpu, default_fp=ti.f32, debug=True, kernel_profiler=True)

@ti.data_oriented
class DEM:
    def __init__(self, timeStep):
        self.Dt = ti.field(float, shape=())
        self.Dt[None] = timeStep

def Solver(dem, TIME):
    t, step, printNum = 0., 0, 0
    while t <= TIME:
        t += dem.Dt[None]

if __name__ == "__main__":
    dem = DEM(timeStep=1.e-6)
    TIME: float = 10                                                         # Total simulation time
    Solver(dem, TIME)
    ti.profiler.print_kernel_profiler_info()

It is interesting to find that when replace the line 14 by t += 1, the CPU memory consumption stays in 4.2G in the whole process of simulation.

ailzhang commented 2 years ago

@Yihao-Shi Just to double check, what taichi version do you have?

Yihao-Shi commented 2 years ago

@ailzhang This is my Taichi version [Taichi] version 1.1.0, llvm 10.0.0, commit f5bb6464, linux, python 3.7.5 [Taichi] Starting on arch=x64

ailzhang commented 2 years ago

@Yihao-Shi I couldn't repro the stable memory consumption usage when replacing line 14 by t += 1, it error out in my case complaining about divide by zero. But I can reproduce the case where CPU memory consumption increases when kernel_profiler is on. And the memory consumption stops increasing if I add ti.profiler.clear_kernel_profiler_info() at the end of the while loop. So I'm mostly certain that this memory consumption comes from CPU kernel profiler. We should definitely look into it later when revamping the profiler design. Thanks for reporting!