neurophysik / jitcode

Just-in-Time Compilation for Ordinary Differential Equations

Memory error while compiling large system of equations #27

Closed Ziaeemehr closed 3 years ago

Ziaeemehr commented 3 years ago

I have almost the same problem as #21. I am using jitcode_lyap for the Kuramoto model with 500 nodes.

I am calculating the Lyapunov dimension (the number of leading Lyapunov exponents whose cumulative sum is positive). I am running the code on a Linux Ubuntu 18.04 machine with 32 GB RAM.
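
For clarity, a minimal sketch of that count (the helper name is just illustrative):

import numpy as np

def positive_cumsum_count(les):
    # Number of leading Lyapunov exponents whose cumulative sum is positive.
    les = np.sort(np.asarray(les))[::-1]   # largest first
    return int(np.sum(np.cumsum(les) > 0))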

It compiles with n_lyap=1 in 300 seconds (~2.5 GB of RAM), but I need about 30 to 50 exponents, which makes the code crash.

Sorry, the code is a bit long. Here is my attempt:

import os
import warnings
import numpy as np
import pylab as plt
from time import time
import networkx as nx
from os.path import join
from scipy.stats import sem
from symengine import sin, cos
from jitcode import jitcode_lyap, y
from joblib import Parallel, delayed
from numpy.random import normal, uniform

if not os.path.exists("data"):
    os.makedirs("data")

# -------------------------------------------------------------------
# def kuramotos_f_II():
#     for i in range(N):
#         coupling_sum = sum(sin(y(j)-y(i))
#                            for j in range(N)
#                            if adj[i, j])
#         yield omega[i] + g * coupling_sum
# -------------------------------------------------------------------

def kuramotos_f_II(adj):
    # Kuramoto model: dtheta_i/dt = omega_i + g * sum_j A_ij * sin(theta_j - theta_i)
    dXdt = [0.0] * N
    for i in range(N):
        coupling_sum = sum(sin(y(j)-y(i))
                           for j in range(N)
                           if adj[i, j])
        dXdt[i] = omega[i] + g * coupling_sum
    return dXdt
# -------------------------------------------------------------------

def main_lyap(adj):

    initial_state = uniform(-np.pi, np.pi, N)

    # ---------------------------------------------------------------
    cfilename = "jitced.so"
    start_time = time()
    if not os.path.exists(cfilename):

        eqs = kuramotos_f_II(adj)
        I = jitcode_lyap(eqs,
                         n=N,
                         n_lyap=n_lyap,
                         simplify=False,
                         )
        I.generate_f_C(chunk_size=chunk_size)
        I.compile_C(omp=USE_OMP)
        I.save_compiled(overwrite=True, destination=cfilename)

    else:
        I = jitcode_lyap(n=N,
                         n_lyap=n_lyap,
                         module_location=cfilename)
    print("{:10.4f} seconds".format(time() - start_time), "(compile)")

    I.set_integrator("RK45")
    I.set_initial_value(initial_state, 0.0)
    # ---------------------------------------------------------------

    times = np.arange(0, simulation_time, step)
    nstep = len(times)
    lyaps = np.zeros((nstep, n_lyap))

    # ---------------------------------------------------------------
    start_time = time()
    for i, t in enumerate(times):
        lyaps[i, :] = I.integrate(t)[1]
    print("{:10.4f} seconds".format(time() - start_time), "(run)    ")
    # ---------------------------------------------------------------

    for i in range(n_lyap):
        lyap = np.average(lyaps[100:, i])
        stderr = sem(lyaps[100:, i])
        print("%4d. Lyapunov exponent: % .4f +- %.4f" % (i+1, lyap, stderr))

# PARAMETERS --------------------------------------------------------
chunk_size = 100
USE_OMP = False

N = 500
g = 0.02
step = 0.5
n_lyap = 1
simulation_time = 500.0
transition_time = 250.0

np.random.seed(1)  # seed before drawing omega so the run is reproducible
omega = np.random.normal(0, 0.1, N)
adj = nx.to_numpy_array(nx.gnp_random_graph(N, 0.1, seed=1), dtype=int)

if __name__ == "__main__":
    main_lyap(adj)
Ziaeemehr commented 3 years ago

Using the generator function slightly decreased the compilation time and memory usage (287 seconds and 2.3 GB).
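
jitcode also accepts a generator function directly, which lets it process the symbolic expressions chunk by chunk instead of holding the full list. A minimal sketch matching the commented-out variant in the script above:

def kuramotos_f_II():
    # Yield each derivative lazily instead of building a full list.
    for i in range(N):
        coupling_sum = sum(sin(y(j)-y(i))
                           for j in range(N)
                           if adj[i, j])
        yield omega[i] + g * coupling_sum

I = jitcode_lyap(kuramotos_f_II, n=N, n_lyap=n_lyap, simplify=False)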

Wrzlprmft commented 3 years ago

A few suggestions or remarks:

Ziaeemehr commented 3 years ago

The program calculates all LEs and is slow for large networks. The Lyapunov dimensions here are different.

[Screenshots: Selection_006 and Selection_004]

We could compare the largest LEs as a more convenient measure, but I am not sure the largest LE is always a good replacement for the Lyapunov dimension.

Wrzlprmft commented 3 years ago

Using the generator function, compiling with clang, and using the -O0 optimization flag all reduced the memory usage, and I did not get a memory error.

Just being curious: Did each of these actions reduce the memory usage on its own, or did you need to do all of them at once?

We could compare the largest LEs as a more convenient measure, but I am not sure the largest LE is always a good replacement for the Lyapunov dimension.

If you just want an indicator for chaos, the largest Lyapunov exponent is certainly a more robust measure¹. Otherwise, it depends on your application. I am not aware of a good interpretation or other use of the Lyapunov dimension for such high-dimensional systems, but that doesn’t mean that none exists.


¹ More precisely: Be sure to discard transients. Then obtain a reasonable error margin for the largest Lyapunov exponent, e.g., the standard error of the mean of local Lyapunov exponents sampled at least one oscillation apart. Also, the second-largest Lyapunov exponent should be zero or larger.
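
For concreteness, a minimal sketch of that criterion, reusing the lyaps array from your script (transient and spacing are illustrative; spacing should correspond to at least one oscillation):

import numpy as np
from scipy.stats import sem

def looks_chaotic(lyaps, transient=100, spacing=10):
    sub = lyaps[transient::spacing]   # drop transients, decorrelate samples
    largest, margin = np.mean(sub[:, 0]), sem(sub[:, 0])
    # Largest LE clearly positive; second-largest zero or larger within error.
    return (largest - margin > 0) and (np.mean(sub[:, 1]) >= -sem(sub[:, 1]))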

Ziaeemehr commented 3 years ago

Thank you.

Just being curious: Did each of these actions reduce the memory usage on its own, or did you need to do all of them at once?

clang and -O0 are necessary, and the generator function helps. It seems gcc does not reduce the memory usage even with -O0. I am using gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 and clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final).
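
For anyone else hitting this, roughly what I did (reusing names from my script above; extra_compile_args hands the flag to the compiler):

import os
os.environ["CC"] = "clang"               # select clang; set before compiling

eqs = kuramotos_f_II(adj)
I = jitcode_lyap(eqs, n=N, n_lyap=n_lyap, simplify=False)
I.generate_f_C(chunk_size=chunk_size)
I.compile_C(extra_compile_args=["-O0"])  # disable optimisation to cut memory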

I think a convergence test helps here after dropping the transient behavior: the LEs are recorded to a buffer over the last m steps; once the buffer is full, the standard deviation of each LE is calculated. If the standard deviations of all LEs are smaller than a threshold, the LEs have converged and no longer change, so the program terminates. Something like this:

def convergence_test(buffers, threshold, verbosity=False):
    # buffers has shape (BUFFER_LENGTH, n_lyap).
    info = 0  # 0: not converged, 1: converged

    lyap = np.mean(buffers, axis=0)
    st_dev = np.std(buffers, axis=0)

    if (st_dev < threshold).all():
        info = 1
        if verbosity:
            print("Lyapunov exponents converged; max std: {:.6f}".format(np.max(st_dev)))

    return lyap, info

info == 1 terminates the program and returns the LEs.

Requiring the standard deviations of all LEs to be below the threshold is a hard condition; for large networks, the convergence of the few largest LEs is probably enough.
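
Roughly how I would wire it into the integration loop (BUFFER_LENGTH and threshold are illustrative; times, I, n_lyap, and transition_time are from my script above):

BUFFER_LENGTH = 50    # illustrative
threshold = 1e-4      # illustrative

buffers = np.zeros((BUFFER_LENGTH, n_lyap))
for k, t in enumerate(times):
    buffers[k % BUFFER_LENGTH] = I.integrate(t)[1]
    # Test only after transients are gone and the buffer has filled once:
    if t > transition_time and k >= BUFFER_LENGTH:
        lyap, info = convergence_test(buffers, threshold)
        if info == 1:
            break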

best,