Closed ryan-david-murphy closed 1 year ago
Uploading so we don't lose this:
Assuming this is still an issue, could you try a separate fresh install (please download the latest version of the install script!) now that we have pinned Cython? It's possible that the slow init is a result of some packages being installed with the latest Cython and some with an older Cython.
I have not been able to reproduce this issue locally.
I have reinstalled using the updated firedrake-install script after I completely removed the previous venv. I have also uninstalled and reinstalled homebrew and then completed a further reinstallation. The same performance issue is present.
I have run helmholtz.py (with graph plotting removed) using both my M1 Max and a Linux Workstation (3 month+ old venv) for comparison. I have attached the profiles. They are usually of comparable performance.
Is there anything else I can reinstall to enable a fresh implementation?
If you update (or do a fresh install) on the Linux workstation, do you also see the performance regression? If you don't want to risk losing the old performant venv you can use firedrake-install --venv-name something_unique. If the Linux workstation is fine I will add the Mac tag and get some of our Mac developers to investigate.
I will say that the profiles do look very similar to a first run (doing code gen) vs second run (using cached code).
The Helmholtz example (in the demos directory) is also very small, only a 10x10 grid with CG1 elements. To get meaningful profiling data we need to increase the number of dofs. Maybe you could add some timings?
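One way to add timings from inside the script, in addition to the shell's time, is a small context manager around each stage. The sketch below is only illustrative: the timed helper and the "solve" label are hypothetical names, and the sum(...) call is a stand-in for the real work (for example the solve(...) call in the demo).

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("solve", timings):
    # Placeholder workload; in the real script this would wrap the solve call.
    total = sum(i * i for i in range(100_000))

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

Running the script twice and comparing the recorded values would separate the first (code generation) run from the cached one.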
I have attached an example profiling test on my desktop along with its output for comparison:
test_script.sh:
#!/bin/bash
# Clean caches
firedrake-clean
# Create a minimal Helmholtz problem (without plotting)
cat <<EOF >minimal_helmholtz.py
from firedrake import *
mesh = UnitSquareMesh(10, 10)
V = FunctionSpace(mesh, "CG", 1)
u = TrialFunction(V)
v = TestFunction(V)
f = Function(V)
x, y = SpatialCoordinate(mesh)
f.interpolate((1+8*pi*pi)*cos(x*pi*2)*cos(y*pi*2))
a = (inner(grad(u), grad(v)) + inner(u, v)) * dx
L = inner(f, v) * dx
u = Function(V)
solve(a == L, u, solver_parameters={'ksp_type': 'cg', 'pc_type': 'none'})
File("helmholtz.pvd").write(u)
f.interpolate(cos(x*pi*2)*cos(y*pi*2))
print(sqrt(assemble(dot(u - f, u - f) * dx)))
EOF
# Time and profile minimal Helmholtz
echo "10x10 cold cache"
time python minimal_helmholtz.py -log_view :no_cache_profile.txt:ascii_flamegraph
flamegraph.pl no_cache_profile.txt > no_cache_profile.svg
# Time and profile minimal Helmholtz with hot cache
echo "10x10 hot cache"
time python minimal_helmholtz.py -log_view :hot_cache_profile.txt:ascii_flamegraph
flamegraph.pl hot_cache_profile.txt > hot_cache_profile.svg
# Increase problem size
sed -i "s/(10, 10)/(1000, 1000)/g" minimal_helmholtz.py
# Run bigger problem
echo "1000x1000 hot cache"
time python minimal_helmholtz.py -log_view :big_hot_cache_profile.txt:ascii_flamegraph
flamegraph.pl big_hot_cache_profile.txt > big_hot_cache_profile.svg
output:
$ ./test_script.sh
/home/jack/Documents/firedrake/firedrake/bin/firedrake-clean:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
__import__('pkg_resources').require('firedrake==0.13.0+5774.g3fb16ad47.dirty')
Removing cached TSFC kernels from /home/jack/Documents/firedrake/firedrake/.cache/tsfc
Removing cached PyOP2 code from /home/jack/Documents/firedrake/firedrake/.cache/pyop2
Removing cached pytools files from /home/jack/.cache/pytools
10x10 cold cache
0.06257073749110136
real 0m4.426s
user 0m4.085s
sys 0m0.333s
10x10 hot cache
0.06257073749110136
real 0m1.387s
user 0m1.218s
sys 0m0.155s
1000x1000 hot cache
7.078431517196732e-06
real 1m9.689s
user 0m41.024s
sys 0m28.647s
[Attached profiles: cold cache, hot cache, big problem]
@rdm4317 any update?
@JDBetteridge I have run the requested profiles, here are the results:
Mac:
10x10 cold cache real 0m22.615s user 0m4.705s sys 0m2.675s
10x10 hot cache real 0m4.334s user 0m1.390s sys 0m0.868s
1000x1000 hot cache real 0m38.126s user 0m35.405s sys 0m1.232s
[Attached profiles: 10x10 cold cache, 10x10 hot cache, 1000x1000 hot cache]
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|FInAT |master |47f6c37 |False |
|PyOP2 |master |d230953b |False |
|fiat |master |8c66270 |False |
|firedrake |master |3fb16ad47 |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |b145b65 |False |
|loopy |main |8158afdb |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |0378c81 |False |
|pytest-mpi |main |a478bc8 |False |
|slepc |firedrake |e438e4993 |False |
|tsfc |master |6f72c9c |False |
|ufl |master |3c62318c |False |
---------------------------------------------------------------------------
Linux WS:
10x10 cold cache real 0m6.582s user 0m5.707s sys 0m0.884s
10x10 hot cache real 0m2.300s user 0m1.768s sys 0m0.554s
1000x1000 hot cache real 0m51.228s user 0m49.202s sys 0m1.973s
[Attached profiles: 10x10 cold cache, 10x10 hot cache, 1000x1000 hot cache]
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|COFFEE |master |70c1e66 |False |
|FInAT |master |cd1d528 |False |
|PyOP2 |master |59e109eb |False |
|fiat |master |a305398 |False |
|firedrake |master |284a1104a |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |69012e5 |False |
|loopy |main |3988272b |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |c691737 |False |
|pytest-mpi |main |a478bc8 |False |
|tsfc |master |e68bd28 |False |
|ufl |master |772485d7 |False |
---------------------------------------------------------------------------
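The timings above can also be compared programmatically. The sketch below parses the real/user/sys fields copied from the two sets of results and reports how much wall-clock time is not accounted for by CPU time. The helper names parse_time and off_cpu are hypothetical, and the interpretation of the difference as waiting time is an assumption, not a measurement.

```python
import re


def parse_time(text):
    """Convert a `time` field such as '0m22.615s' or '1m9.689s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = match.groups()
    return 60 * int(minutes) + float(seconds)


# (real, user, sys) copied from the Mac and Linux WS results above
runs = {
    "Mac 10x10 cold": ("0m22.615s", "0m4.705s", "0m2.675s"),
    "Mac 10x10 hot": ("0m4.334s", "0m1.390s", "0m0.868s"),
    "Mac 1000x1000 hot": ("0m38.126s", "0m35.405s", "0m1.232s"),
    "Linux 10x10 cold": ("0m6.582s", "0m5.707s", "0m0.884s"),
    "Linux 10x10 hot": ("0m2.300s", "0m1.768s", "0m0.554s"),
    "Linux 1000x1000 hot": ("0m51.228s", "0m49.202s", "0m1.973s"),
}


def off_cpu(real, user, sys):
    """Wall-clock time not accounted for by CPU time (user + sys)."""
    return parse_time(real) - (parse_time(user) + parse_time(sys))


for name, fields in runs.items():
    real = parse_time(fields[0])
    print(f"{name}: real {real:.3f} s, off-CPU {off_cpu(*fields):.3f} s")
```

For the Mac cold-cache run this gives roughly 15 s of wall-clock time outside user + sys, whereas the Linux cold-cache run is almost entirely CPU time.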
For a simple hyperelasticity example, I am getting different TSFC behaviours.
Mac:
0
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
1
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
2
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
3
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
4
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
5
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
6
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
7
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
8
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
9
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
Linux:
0
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
1
2
3
4
5
6
7
8
9
Here is the code:
from firedrake import *
spatialDimensions = 2
lx = 8
ly = 1
nx = 320
ny = 40
mesh = RectangleMesh(nx, ny, lx, ly, quadrilateral=True)
# function spaces
A = FunctionSpace(mesh, "CG", 1)
P = VectorFunctionSpace(mesh, "CG", 1)
# boundary conditions
bcs = [DirichletBC(P.sub(0), Constant(0), 1),
       DirichletBC(P.sub(1), Constant(0), 1)]
# Define functions
du = TrialFunction(P) # Incremental displacement
v = TestFunction(P) # Test function
u = Function(P) # Displacement from previous iteration
B = Constant((0.0, -0.0)) # Body force per unit volume
T = Constant((0.1, 0.0)) # Traction force on the boundary
for i in range(10):
    print(i)
    # Kinematics
    I = Identity(2) # Identity tensor
    F = I + grad(u) # Deformation gradient
    C = F.T*F # Right Cauchy-Green tensor
    # Invariants of deformation tensors
    Ic = tr(C)
    J = det(F)
    # Elasticity parameters
    E, nu = 10.0, 0.3
    mu, lmbda = Constant(E/(2*(1 + nu))), Constant(E*nu/((1 + nu)*(1 - 2*nu)))
    # Stored strain energy density (compressible neo-Hookean model)
    psi = (mu/2)*(Ic - 3) - mu*ln(J) + (lmbda/2)*(ln(J))**2
    # Total potential energy
    Pi = psi*dx - dot(B, u)*dx - dot(T, u)*ds(2)
    # Compute first variation of Pi (directional derivative about u in the direction of v)
    F = derivative(Pi, u, v)
    # Compute Jacobian of F
    J = derivative(F, u, du)
    # Solve variational problem
    problem = NonlinearVariationalProblem(F, u, bcs=bcs, J=J)
    solver = NonlinearVariationalSolver(problem)
    solver.solve()
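To compare runs like the Mac and Linux outputs above without reading them by eye, the captured output can be summarised per step. This is a minimal sketch: count_warnings_per_step is a hypothetical helper, the logs are abbreviated to the first three steps, and treating the tsfc warning as a sign that the form was compiled again at that step is an assumption.

```python
def count_warnings_per_step(log_text):
    """Map each printed step index to the number of tsfc warnings that follow it."""
    counts = {}
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.isdigit():
            # A bare integer is the step index printed by the loop.
            current = int(line)
            counts[current] = 0
        elif "tsfc" in line and "WARNING" in line and current is not None:
            counts[current] += 1
    return counts


WARNING = ("tsfc:WARNING Estimated quadrature degree 14 more than tenfold "
           "greater than any argument/coefficient degree (max 1)")

# Abbreviated versions of the two outputs above (first three steps only)
mac_log = "\n".join(["0", WARNING, "1", WARNING, "2", WARNING])
linux_log = "\n".join(["0", WARNING, "1", "2"])

print(count_warnings_per_step(mac_log))    # {0: 1, 1: 1, 2: 1}
print(count_warnings_per_step(linux_log))  # {0: 1, 1: 0, 2: 0}
```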
I reproduced this execution with test.sh at M2 Mac. See the results:
10x10 cold cache real 0m7.179s user 0m4.232s sys 0m1.043s
10x10 hot cache real 0m2.460s user 0m1.421s sys 0m0.542s
1000x1000 hot cache real 0m38.433s user 0m35.422s sys 0m2.266s
[Attached profiles: 10x10 cold cache, 10x10 hot cache, 1000x1000 hot cache]
I got the tsfc:WARNING only once.
My intel Mac Monterey 12.4 (Fresh install):
10x10 cold cache
0.06257073749110047
real 0m21.021s
user 0m9.369s
sys 0m4.045s
10x10 hot cache
0.06257073749110047
real 0m9.233s
user 0m4.369s
sys 0m2.088s
1000x1000 hot cache
7.078429874707133e-06
real 1m36.637s
user 1m31.152s
sys 0m3.195s
[Attached profiles: 10x10 no cache, 10x10 hot cache, 1000x1000 hot cache]
Hyperelasticity example:
tsfc warnings at every step
firedrake-status:
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|FInAT |master |47f6c37 |False |
|PyOP2 |master |d230953b |False |
|fiat |master |8c66270 |False |
|firedrake |master |0ec02b2d8 |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |b145b65 |False |
|loopy |main |8158afdb |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |0378c81 |False |
|pytest-mpi |main |a478bc8 |False |
|tsfc |master |6f72c9c |False |
|ufl |master |3c62318c |False |
---------------------------------------------------------------------------
On my linux machine (Fresh install):
[Attached profiles: 10x10 no cache, 10x10 hot cache, 1000x1000 hot cache]
Hyperelasticity example:
tsfc warnings at every step
firedrake-status:
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|FInAT |master |47f6c37 |False |
|PyOP2 |master |d230953b |False |
|fiat |master |8c66270 |False |
|firedrake |master |0ec02b2d8 |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |b145b65 |False |
|loopy |main |8158afdb |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |0378c81 |False |
|pytest-mpi |main |a478bc8 |False |
|tsfc |master |6f72c9c |False |
|ufl |master |3c62318c |False |
---------------------------------------------------------------------------
I see the tsfc warning at every step on both my Mac and my Linux machine. To me this looks more like an issue with the latest Firedrake than a macOS vs. Linux difference.
Can everyone please put the output of firedrake-status below your test result?
@ksagiyam I have updated my post with this output
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|COFFEE |master |70c1e66 |False |
|FInAT |master |47f6c37 |False |
|PyOP2 |master |d230953b |False |
|fiat |master |8c66270 |False |
|firedrake |master |0ec02b2d8 |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |b145b65 |False |
|loopy |main |8158afdb |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |0378c81 |False |
|pytest-mpi |main |a478bc8 |False |
|tsfc |master |6f72c9c |False |
|ufl |master |3c62318c |False |
---------------------------------------------------------------------------
Testing on my Linux machine indicates that this PR on Constant https://github.com/firedrakeproject/firedrake/pull/2927 somehow broke the caching. (Firedrake + PyOP2 + tsfc)
I used the above hyperelasticity problem as an example.
Right before https://github.com/firedrakeproject/firedrake/pull/2927:
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|COFFEE |master |70c1e66 |False |
|FInAT |master |47f6c37 |False |
|PyOP2 |HEAD |edae2884 |False |
|fiat |master |8c66270 |False |
|firedrake |HEAD |be82caf4e |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |b145b65 |False |
|loopy |main |8158afdb |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |0378c81 |False |
|pytest-mpi |main |a478bc8 |False |
|tsfc |HEAD |ef39f72 |False |
|ufl |master |3c62318c |False |
---------------------------------------------------------------------------
Cold cache:
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
WARNING:tsfc:Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
1
2
3
Hot cache:
0
1
2
3
Right after https://github.com/firedrakeproject/firedrake/pull/2927:
---------------------------------------------------------------------------
|Package |Branch |Revision |Modified |
---------------------------------------------------------------------------
|COFFEE |master |70c1e66 |False |
|FInAT |master |47f6c37 |False |
|PyOP2 |HEAD |d230953b |False |
|fiat |master |8c66270 |False |
|firedrake |HEAD |34f930dd9 |False |
|h5py |firedrake |6cc4c912 |False |
|libspatialindex |master |4768bf3 |True |
|libsupermesh |master |b145b65 |False |
|loopy |main |8158afdb |False |
|petsc |firedrake |9364cb008b|False |
|pyadjoint |master |0378c81 |False |
|pytest-mpi |main |a478bc8 |False |
|tsfc |HEAD |83dd8aa |False |
|ufl |master |3c62318c |False |
---------------------------------------------------------------------------
Cold cache:
0
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
1
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
2
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
3
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
Hot cache:
0
1
2
3
Hey @ksagiyam, did you work out how to fix this?
Sorry, I have been on holiday for the past two weeks so I hadn't seen this. I think this is a known performance problem with the recent changes to how we use Constants. Could you check whether using Firedrake branch connorjward/fix-constant-numbering and UFL branch connorjward/counted-mixin makes these go away? I already have associated PRs (Firedrake, UFL) for getting these fixes in.
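As a rough illustration of how a change in numbering can defeat a cache, the toy model below keys a "compiled kernel" cache on a signature string. It is not Firedrake's, UFL's or TSFC's actual code: ToyConstant, signature and run are hypothetical names, and the idea that globally counted Constants leak into the cache key is an assumption suggested by the branch names, not something confirmed in this thread.

```python
import itertools

_counter = itertools.count()


class ToyConstant:
    """Stand-in for a symbolic constant that receives a globally unique number."""

    def __init__(self, value):
        self.value = value
        self.number = next(_counter)


def signature(constants, renumber):
    """Build a cache key from the constants appearing in a 'form'."""
    if renumber:
        # Number constants by position in the form, so equivalent forms agree.
        numbers = range(len(constants))
    else:
        # Use the global numbers, which differ for every newly created object.
        numbers = [c.number for c in constants]
    return "form(" + ",".join(f"c{n}" for n in numbers) + ")"


def run(renumber, steps=4):
    """Rebuild the same form `steps` times and count cache misses."""
    cache = {}
    misses = 0
    E, nu = 10.0, 0.3
    for _ in range(steps):
        # As in the hyperelasticity loop, the constants are recreated every step.
        mu = ToyConstant(E / (2 * (1 + nu)))
        lmbda = ToyConstant(E * nu / ((1 + nu) * (1 - 2 * nu)))
        key = signature([mu, lmbda], renumber)
        if key not in cache:
            cache[key] = "compiled kernel"
            misses += 1
    return misses


print(run(renumber=False))  # 4: every step misses the cache
print(run(renumber=True))   # 1: only the first step misses
```

In this toy model the per-step misses correspond to the warning appearing at every step, and the renumbered key corresponds to the warning appearing only at step 0.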
Yes, those branches at least fix the problem stated above.
Cold cache:
0
tsfc:WARNING Estimated quadrature degree 14 more than tenfold greater than any argument/coefficient degree (max 1)
1
2
3
Hot cache:
0
1
2
3
Closing this issue as I believe it is fixed by https://github.com/firedrakeproject/firedrake/pull/3011. Please reopen it if this is not the case.
Thanks @connorjward, I have just updated my installation and the performance is much improved.
Description: I encountered a performance regression after installing Firedrake. The installation completed without errors (after pinning the Cython version to 0.29.36), but performance has noticeably dropped.
Steps to Reproduce:
Expected Behaviour: The performance should be consistent or improved compared to the previous environment.
Actual Behavior: The performance has significantly dropped after installing Firedrake.
Environment:
Operating System: macOS 13.4.1
Python Version: 3.10.8
Firedrake Version: 0.13.0+5767.g32bda80fc