pygae / galgebra

Symbolic Geometric Algebra/Calculus package for SymPy :crystal_ball:
https://galgebra.rtfd.io/
BSD 3-Clause "New" or "Revised" License

Extremely slow Ga initialization in derivatives_in_elliptic_cylindrical_coordinates() #23

Open utensil opened 5 years ago

utensil commented 5 years ago


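    from sympy import symbols, cos, cosh, sin, sinh
    from galgebra.ga import Ga

    a = symbols('a', real=True)
    coords = (u, v, z) = symbols('u v z', real=True)
    # Elliptic cylindrical coordinates, given as a vector manifold X(u, v, z)
    (elip3d, er, eth, ephi) = Ga.build('e_u e_v e_z', X=[a*cosh(u)*cos(v), a*sinh(u)*sin(v), z], coords=coords, norm=True)
    grad = elip3d.grad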

The Ga.build call in the code above takes 500 to 600 seconds.

After some profiling and debugging, I found that most of the time is spent in simplify and trigsimp (both take ~50%). Each call to them costs ~16 s at 100% CPU, and there are approximately 27 such calls: 27 × 16 s ≈ 432 s, which accounts for most of the total time.

So which step is calling simplify and trigsimp? It turns out to be metric.Simp.apply() in build_reciprocal_basis():

        # Replace reciprocal basis vectors with expansion in terms of
        # basis vectors in derivatives of basis vectors

        if self.connect_flg:
            for x_i in self.n_range:
                for jb in self.n_range:
                    if not self.is_ortho:
                        self.de[x_i][jb] = metric.Simp.apply(self.de[x_i][jb].subs(self.r_basis_dict) / self.e_sq)
                    else:
                        self.de[x_i][jb] = metric.Simp.apply(self.de[x_i][jb].subs(self.r_basis_dict))

and in derivatives_of_basis():

        # Christoffel symbols of the first kind, \Gamma_{ijk}

        for i in n_range:
            de_row = []
            for j in n_range:
                Gamma = []
                for k in n_range:
                    gamma = half * (dg[j][k][i] + dg[i][k][j] - dg[i][j][k])
                    Gamma.append(Simp.apply(gamma))
                de_row.append(sum([gamma * base for (gamma, base) in zip(Gamma, self.r_symbols)]))
            de.append(de_row)

Mostly the former.
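
For reference, the quantity built in the derivatives_of_basis() loop can be written out as below. This is a reading of the excerpt, not a statement from the galgebra documentation: it assumes that dg[i][j][k] holds the derivative of g_{ij} with respect to the k-th coordinate, and that self.r_symbols are the reciprocal basis symbols e^k.

```latex
\Gamma_{ijk} = \tfrac{1}{2}\left(\partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij}\right),
\qquad
\partial_i e_j = \sum_k \Gamma_{ijk}\, e^k
```

Under that reading, each of the n^3 components goes through Simp.apply once, and each of the n^2 sums is simplified again in build_reciprocal_basis() after the reciprocal basis is substituted.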

utensil commented 5 years ago

I was profiling using the method described in https://devopedia.org/profiling-python-code and https://julien.danjou.info/guide-to-python-profiling-cprofile-concrete-case-carbonara/ .

python -m cProfile -o curvi_linear_latex.cprof curvi_linear_latex.py
python -m pyprof2calltree -i curvi_linear_latex.cprof -k

But KCacheGrind gave me puzzling results below simplify and trigsimp, so I had to set breakpoints in the VS Code Python debugger and binary-search for where the CPU goes up to 100% 😭 Eventually I found the cause.
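
As an alternative to a GUI viewer, the profile can also be read with the standard-library pstats module. The sketch below profiles a stand-in workload, because the real script is not reproduced here; `slow_step` and `build` are hypothetical names that mimic "one expensive call repeated 27 times".

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical stand-in for an expensive call such as simplify().
    return sum(i * i for i in range(n))


def build():
    # Hypothetical stand-in for Ga.build(): runs the expensive step 27 times.
    return [slow_step(20_000) for _ in range(27)]


profiler = cProfile.Profile()
profiler.enable()
build()
profiler.disable()

# Stats.stats maps (file, line, function name) to
# (primitive calls, total calls, total time, cumulative time, callers).
stats = pstats.Stats(profiler)
calls_by_name = {key[2]: value[1] for key, value in stats.stats.items()}

# Print the entries with the largest cumulative time into a string buffer.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

For a profile already saved with `-o`, `pstats.Stats("curvi_linear_latex.cprof")` loads the file directly, and the same `sort_stats("cumulative")` call lists the call counts and cumulative times per function.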

It seems to me that simplification at this stage is too early and the formula is unnecessarily complicated. I'll take a look at the math some time later; let's just skip this example for now.
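
To illustrate how simple the underlying metric is, it can be computed with plain SymPy, without galgebra. This is only a sketch and not what Ga.build does internally; the names `X`, `J`, `g`, `h_sq`, and the sample point are illustrative assumptions.

```python
from sympy import Matrix, cos, cosh, sin, sinh, symbols

a, u, v, z = symbols('a u v z', real=True)

# Embedding of elliptic cylindrical coordinates, as in the Ga.build call.
X = Matrix([a*cosh(u)*cos(v), a*sinh(u)*sin(v), z])

# Metric tensor g = J^T J, where J is the Jacobian of X w.r.t. (u, v, z).
J = X.jacobian([u, v, z])
g = J.T * J

# Closed form of the squared scale factor for these coordinates:
# h_u^2 = h_v^2 = a^2 (sinh(u)^2 + sin(v)^2), and h_z^2 = 1.
h_sq = a**2 * (sinh(u)**2 + sin(v)**2)

# Numeric spot check at an arbitrary point, avoiding any call to simplify().
point = {a: 1.3, u: 0.7, v: 0.4}
residual_uu = abs(float((g[0, 0] - h_sq).subs(point)))
residual_vv = abs(float((g[1, 1] - h_sq).subs(point)))
residual_uv = abs(float(g[0, 1].subs(point)))
print(residual_uu, residual_vv, residual_uv)
```

The metric is diagonal with one shared nontrivial entry, so the expensive part is presumably the repeated simplification of intermediate expressions rather than the final result itself.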

utensil commented 5 years ago

I've tried KCacheGrind again. It turns out that I was doing something wrong, which is why the result was puzzling.

After I turned off "Detect Cycles" and tweaked the UI layout a little bit, it looks like this:

[image: KCacheGrind screenshot]

utensil commented 5 years ago

[images: 01, 02]

utensil commented 5 years ago

Could use https://github.com/airspeed-velocity/asv to benchmark and monitor performance issues.
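
A benchmark for this issue might look roughly like the sketch below. asv times methods whose names start with `time_` and runs `setup` before them; the class name, the method name, and the timeout value are assumptions made for illustration, not an existing benchmark in this repository.

```python
class TimeEllipticCylindrical:
    # The build currently takes 500 to 600 s, so raise asv's default
    # per-benchmark timeout of 60 s (1200 is an arbitrary assumed value).
    timeout = 1200

    def setup(self):
        # Build the symbolic inputs outside the timed region.
        from sympy import cos, cosh, sin, sinh, symbols

        a = symbols('a', real=True)
        self.coords = u, v, z = symbols('u v z', real=True)
        self.X = [a*cosh(u)*cos(v), a*sinh(u)*sin(v), z]

    def time_ga_build(self):
        # Imported here so this module still loads where galgebra is absent;
        # the import cost is negligible next to the build itself.
        from galgebra.ga import Ga

        Ga.build('e_u e_v e_z', X=self.X, coords=self.coords, norm=True)
```

Keeping the symbol construction in `setup` means only the Ga.build call is measured, which is the step this issue is about.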
