Thanks a lot for keeping the discussion organized. Will open a separate issue (so that the main topic remains as is) for each topic/sub-topic in the future :)
Usually these kinds of operations are just done with NumPy, e.g. numpy.einsum. The physical indices (indexing over component or dimension) usually precede the numerical indices (indexing over element or quadrature point). Most of the NumPy operations do the right thing in broadcasting over the numerical indices, but not all of them. I remember having difficulty with one of them recently...
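For instance, a minimal sketch of that convention (the shapes here are only an illustration, not skfem's API):

import numpy as np

# two 3 x 3 tensor fields sampled at 100 elements x 4 quadrature points
A = np.random.rand(3, 3, 100, 4)
B = np.random.rand(3, 3, 100, 4)

# double contraction over the leading (physical) indices,
# broadcast over the trailing (numerical) indices
C = np.einsum("ij...,ij...->...", A, B)  # shape (100, 4)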
Yeah, does numpy.linalg.inv work if the ndarray indices are shuffled around so that the ones over component are at the end?
Indeed it does! (I never knew numpy.linalg.inv could handle higher-dimensional arrays). So this means that the inverse would then be:
import numpy as np
from numpy import linalg as nla

def inv(T):
    # assuming that T is of shape `ndim x ndim x numQuadraturePoints`
    reshapedT = np.einsum("ijk->kij", T)  # move the physical indices last
    return np.einsum("kij->ijk", nla.inv(reshapedT))  # and back to the front
and I think det does too
Please note that since this is a nonlinear problem and we don't (yet?) have anything similar to derivative in FEniCS, you have to either
In the past, I have done options 1 and 2 for hyperelasticity and tried option 3 for a simpler test problem. At some point I had a goal of encapsulating option 3 into scikit-fem so that it's simple to use, but it is currently on hold as I've been focusing on other things.
Do Jacobian-free Newton–Krylov methods work well on this class of problems?
Do Jacobian-free Newton–Krylov methods work well on this class of problems?
Oh yes, that's another option. You might need to do some work though to find a good preconditioner but I suppose small problems should work pretty well?
- calculate the Jacobian of the Piola-Kirchhoff stress by hand
This was my plan. I had done this previously for a simple hyperelasticity problem some time back. The idea was to use SymPy to calculate the derivatives with respect to the invariants and then lambdify the result.
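For illustration, a minimal sketch of that SymPy workflow, assuming the compressible neo-Hookean energy used later in this thread (the symbol names are only for this example):

import sympy as sp

I1, J = sp.symbols("I1 J", positive=True)
mu, lmbda = sp.symbols("mu lmbda", positive=True)

# compressible neo-Hookean strain energy in terms of the invariants
psi = mu / 2 * (I1 - 3) - mu * sp.log(J) + lmbda / 2 * (J - 1) ** 2

# derivatives with respect to the invariants, turned into NumPy callables
dpsi_dI1 = sp.lambdify((I1, J, mu, lmbda), sp.diff(psi, I1), "numpy")
dpsi_dJ = sp.lambdify((I1, J, mu, lmbda), sp.diff(psi, J), "numpy")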
- try to apply something like JAX to do it in an automated manner
Interesting. Although I could never manage to install jax on a Windows machine in the past. Might be worthwhile to follow up on this.
You might need to do some work though to find a good preconditioner but I suppose small problems should work pretty well?
Yes, I think so. A (good or otherwise) preconditioner is not a strict requirement for Jacobian-free Newton–Krylov, as shown for ex10 when demonstrating scipy.optimize.root as a nonlinear solver in #119. That example is a small problem but root(method='krylov') works really well for it. I think this is just using unpreconditioned lgmres. I think JFNK is a good first option for preliminary exploration of nonlinear problems on small meshes. I'll post the branch.
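For reference, the basic call is a one-liner (a sketch, assuming residual(x) assembles the nonlinear residual at the current guess x, as in #119):

from scipy.optimize import root

sol = root(residual, x0, method="krylov")  # JFNK; the inner solver defaults to LGMRES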
@gdmcbain @kinnala, so I got some time today to start re-implementing the FEniCS demo in scikit-fem. I just wanted to make sure that I understand the syntax correctly to assemble the rhs and jacobian. For now (with the analytical jacobian), I have
import pygmsh as pm
import numpy as np
from skfem.io import from_meshio
from skfem.helpers import ddot, grad, eye, dot, transpose
from math import pi
from skfem import *

def vinv(T):
    # assuming that T is of shape `ndim x ndim x numQuadraturePoints`
    reshapedT = np.einsum("ijk->kij", T)
    return np.einsum("kij->ijk", np.linalg.inv(reshapedT))

def vdet(T):
    reshapedT = np.einsum("ijk->kij", T)
    return np.linalg.det(reshapedT)

def firstPKStress(u):
    F = eye(1., u.shape[0]) + grad(u)
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

# TODO: check the expression once again
def jacobianPK(u):
    F = eye(1., u.shape[0]) + grad(u)
    J = vdet(F)
    Finv = vinv(F)
    dFdF = np.einsum("ik...,jl...->ikjl...", eye(1., u.shape[0]), eye(1., u.shape[0]))
    dFinvdF = -np.einsum("ik...,jl...->ijkl", Finv, Finv)
    C = mu * dFdF - mu * dFinvdF - lmbda * J * (J - 1) * dFinvdF + (2*J - 1) * J * np.einsum("ji...,lk...->ijkl", Finv, Finv)
    return C

geom = pm.opencascade.Geometry(1.e-1, 5.e-1)
box = geom.add_box([0., 0., 0.], [1., 1., 1.])
msh = pm.generate_mesh(geom, dim=3)

# import mesh in skfem
mesh = from_meshio(msh)
elem = ElementTetP1()
uelem = ElementVectorH1(elem)
iBasis = InteriorBasis(mesh, uelem, intorder=3)
fBasis = FacetBasis(mesh, uelem, intorder=3)
u = np.zeros(iBasis.N)  # this takes care of dimension

# materialParams and init
bodyForce = np.array([0., -1./2, 0])
E, nu = 10., 0.3
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)

dofs = {
    "left": fBasis.get_dofs(lambda x: np.isclose(x[0], 0.)),
    "right": fBasis.get_dofs(lambda x: np.isclose(x[0], 1.))
}
I = iBasis.complement_dofs(dofs)

# assign DirichletBC
# variables used in the FEniCS demo
scale = y0 = z0 = 0.5
theta = pi/3.
u1Right = 0.
u2Right = lambda x, y, z: scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u3Right = lambda x, y, z: scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)
u[dofs["left"].nodal['u^1']] = 0.
u[dofs["left"].nodal['u^2']] = 0.
u[dofs["left"].nodal['u^3']] = 0.
u[dofs["right"].nodal['u^1']] = u1Right
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, fBasis, dofs["right"].nodal['u^2'])
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, fBasis, dofs["right"].nodal['u^3'])

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(v), grad(w["w"]))  # + dot(bodyForce, w["w"])

@BilinearForm
def jac(u, v, w):
    return ddot(ddot(grad(v), jacobianPK(u)), grad(w["w"]))

for itr in range(100):
    w = iBasis.interpolate(u)
    J = asm(jac, iBasis, w=w)  # not able to assemble properly
    F = asm(rhs, iBasis, w=w)
    u_prev = u.copy()
    u += solve(*condense(J, -F, I=I))  # try a full Newton step
    if np.linalg.norm(u - u_prev) < 1e-8:
        break

if __name__ == "__main__":
    print(np.linalg.norm(u - u_prev))
I think I am missing something when trying to correctly declare the linear and bilinear forms. I will probably take a fresh look tomorrow at how eye works. However, if you happen to spot something obvious, please let me know. Thanks!
This is the traceback,
F = eye(1., u.shape[0]) + grad(u)
AttributeError: 'DiscreteField' object has no attribute 'shape'
I'm currently writing some additional Sphinx documentation on the subject "Anatomy of forms". I hope to add some tips for defining forms.
AttributeError: 'DiscreteField' object has no attribute 'shape'
A short explanation: parameters u and v in the form definition are DiscreteField objects, basically a tuple of ndarray's: https://github.com/kinnala/scikit-fem/blob/master/skfem/element/discrete_field.py#L7. Anyhow, they don't have a property shape.
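(Roughly speaking, the helpers just unpack the relevant attribute; e.g. skfem.helpers.grad is essentially the following sketch:)

def grad(u):
    return u.grad  # gradient values, stacked over elements and quadrature points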
Edit: I'd have to take a more careful look at the forms to propose a fix. I can look at it later when I have more time.
So mathematically grad(u) is a 2-tensor and you want to add ones to the diagonal.
You can make progress by replacing
F = eye(1., u.shape[0]) + grad(u)
with
F = grad(u)
F[0, 0] += 1
F[1, 1] += 1
F[2, 2] += 1
I think skfem.helpers.eye is probably not the most useful function for this use case in its current form. I think we need to consider replacing it with something better. I cannot recall why it was written that way.
Maybe a more relevant helper for this purpose would be

def eye(u):
    v = np.zeros_like(u)
    for i in range(v.shape[0]):
        v[i, i] += 1
    return v
But this is also a good discussion on what kinds of helpers we should have in skfem.helpers and how to improve them.
Thanks a lot for the help! I added the ellipsis to a couple more places (after printing grad(u).shape)
def vinv(T):
    reshapedT = np.einsum("ij...->...ij", T)
    return np.einsum("...ij->ij...", np.linalg.inv(reshapedT))
and the same for determinant. And also changed to using your second suggestion.
def firstPKStress(u):
    F = grad(u)
    for i in range(mesh.p.shape[0]):
        F[i, i] += 1.
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))
and in the jacobian. Everything seems to be fine at least syntax-wise (maybe I am missing something). Will take a careful look at the forms and bcs later in the day. Thanks so much for your help!!
Corrected (?) the forms (still haven't implemented the non-homogeneous Neumann condition). Will clean up the code in some time.
def firstPKStress(u):
    F = grad(u)
    for i in range(3):
        F[i, i] += 1.
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

def jacobianPK(u):
    F = grad(u)
    eye = np.zeros_like(F)
    for i in range(3):
        F[i, i] += 1.
        eye[i, i] += 1.
    J = vdet(F)
    Finv = vinv(F)
    dFdF = np.einsum("ik...,jl...->ijkl...", eye, eye)
    dFinvdF = np.einsum("jk...,li...->ijkl...", Finv, Finv)
    C = mu * dFdF + mu * dFinvdF -\
        lmbda * J * (J - 1) * dFinvdF +\
        lmbda * (2*J - 1) * J * np.einsum("ji...,lk...->ijkl...", Finv, Finv)
    return C
@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(w["w"]), grad(v)) + dot(bodyForce, v)

@BilinearForm
def jac(u, v, w):
    return ddot(ddot(grad(u), jacobianPK(w["w"])), grad(v))
However, I noticed that assembling the bilinear form with ~7k dofs (precisely Points: 2290, Cells: 10191) takes about 3 min on my laptop (will do precise profiling and report in some time). I suspect I may be doing something inefficiently in the jacobianPK function that affects the assembly.
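(A quick, non-authoritative way to see where the time goes, assuming w = iBasis.interpolate(u) as in the loop above:)

import cProfile

cProfile.run("asm(jac, iBasis, w=w)", sort="cumtime")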
assembling the bilinear form with ~7k dofs (precisely Points: 2290, Cells: 10191) takes about 3min on my laptop
Can you post your full code including vdet and vinv so we can check it out?
The implementation of vinv looks a bit suspicious to me as I'm not sure how the dimensions would go if you wanted to use np.linalg.inv for the inversion. Another option is to write the inverse explicitly using Cramer's rule as is done here: https://github.com/kinnala/scikit-fem/blob/master/skfem/mapping/mapping_affine.py#L49.
assembling the bilinear form with ~7k dofs (precisely Points: 2290, Cells: 10191) takes about 3min on my laptop
Can you post your full code including vdet and vinv so we can check it out?
Sure (and thanks!). It is basically the full snippet that I posted above with the changes to the functions:
import pygmsh as pm
import numpy as np
from skfem.io import from_meshio
from skfem.helpers import ddot, grad, dot, transpose
from math import pi
from skfem import *

# define inverse:
def vinv(T):
    """
    physical dimensions precede numerical (see: https://github.com/kinnala/scikit-fem/issues/439#issuecomment-661649339)
    T_ij... (where i, j are physical dimensions)
    """
    reshapedT = np.einsum("ij...->...ij", T)
    return np.einsum("...ij->ij...", np.linalg.inv(reshapedT))

def vdet(T):
    """
    physical dimensions precede numerical (see: https://github.com/kinnala/scikit-fem/issues/439#issuecomment-661649339)
    T_ij... (where i, j are physical dimensions)
    """
    reshapedT = np.einsum("ij...->...ij", T)
    return np.linalg.det(reshapedT)

def firstPKStress(u):
    F = grad(u)
    F[0, 0] += 1.
    F[1, 1] += 1.
    F[2, 2] += 1.
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

def jacobianPK(u):
    F = grad(u)
    eye = np.zeros_like(F)
    for i in range(3):
        F[i, i] += 1.
        eye[i, i] += 1.
    J = vdet(F)
    Finv = vinv(F)
    dFdF = np.einsum("ik...,jl...->ijkl...", eye, eye)
    dFinvdF = np.einsum("jk...,li...->ijkl...", Finv, Finv)
    C = mu * dFdF + mu * dFinvdF -\
        lmbda * J * (J - 1) * dFinvdF +\
        lmbda * (2*J - 1) * J * np.einsum("ji...,lk...->ijkl...", Finv, Finv)
    # print(C[0,0,0,0].min())
    return C

geom = pm.opencascade.Geometry(1.e-2, 1.e-1)
box = geom.add_box([0., 0., 0.], [1., 1., 1.])
msh = pm.generate_mesh(geom)
mesh = from_meshio(msh)
elem = ElementTetP1()
uelem = ElementVectorH1(elem)
iBasis = InteriorBasis(mesh, uelem, intorder=2)
fBasis = FacetBasis(mesh, uelem, intorder=2)
u = np.zeros(iBasis.N)  # this takes care of dimension

# materialParams and init
bodyForce = np.array([0., -1./2, 0])
E, nu = 10., 0.3
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)

dofs = {
    "left": iBasis.get_dofs(lambda x: x[0] == 0),
    "right": iBasis.get_dofs(lambda x: x[0] == 1.)
}

# assign DirichletBC
# variables used in the FEniCS demo
scale = y0 = z0 = 0.5
theta = pi/3.
# scaling factor bta for Newton's method
bta = 1.
u1Right = 0.
u2Right = lambda x, y, z: scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u3Right = lambda x, y, z: scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)
rightNodes = mesh.p[:, mesh.nodes_satisfying(lambda x: np.isclose(x[0], 1.))]
leftNodes = mesh.p[:, mesh.nodes_satisfying(lambda x: np.isclose(x[0], 0.))]
u[dofs["left"].nodal['u^1']] = 0.
u[dofs["left"].nodal['u^2']] = 0.
u[dofs["left"].nodal['u^3']] = 0.
u[dofs["right"].nodal['u^1']] = 0.1
# u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2'])  # u2Right(rightNodes[0], rightNodes[1], rightNodes[2])
# u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])  # u3Right(rightNodes[0], rightNodes[1], rightNodes[2])
I = iBasis.complement_dofs(dofs)

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(w["w"]), grad(v))  # + dot(bodyForce, v)

@BilinearForm
def jac(u, v, w):
    return ddot(grad(v), ddot(jacobianPK(w["w"]), grad(u)))

for itr in range(2):  # 100
    print("Iteration: {}".format(itr + 1))
    w = iBasis.interpolate(u)
    J = asm(jac, iBasis, w=w)
    F = asm(rhs, iBasis, w=w)
    u_prev = u.copy()
    u += bta * solve(*condense(J, -F, I=I))
    print(np.linalg.norm(u_prev - u))

print("lmbda + 2mu = {}".format(lmbda + 2.*mu))
You are right, there is something off with the calculation of the jacobian, and most likely it is in how the inverse is calculated. But I also checked the deformation gradient and it seems to be off too. Will do some thorough checks today in the evening. Thanks again!
The implementation of vinv looks a bit suspicious to me as I'm not sure how the dimensions would go if you wanted to use np.linalg.inv for the inversion. Another option is to write the inverse explicitly using Cramer's rule as is done here: https://github.com/kinnala/scikit-fem/blob/master/skfem/mapping/mapping_affine.py#L49.
I remember stumbling across this at the very beginning when I was looking for helper functions, but then the documentation of numpy.linalg.inv seemed to suggest that it will work when the physical indices are shifted to the end. I will check the explicit inversion (and try it too). Thanks!
I tried doing a small check:
import numpy as np

a = np.random.random((3, 3, 100, 4))
ainvVec = np.einsum("...ij->ij...", np.linalg.inv(np.einsum("ij...->...ij", a)))
ainvLooped = np.array([
    [np.linalg.inv(a[:, :, i, j]) for j in range(a.shape[3])]
    for i in range(a.shape[2])
])
print(np.allclose(np.einsum("...ij->ij...", ainvLooped), ainvVec))  # returns `True`
Here is the time on my laptop. I'll check quickly if there is something I can do to improve it.
In [2]: %time asm(jac, iBasis, w=w)
CPU times: user 17.9 s, sys: 16.1 s, total: 33.9 s
Wall time: 11.3 s
Out[2]:
<3603x3603 sparse matrix of type '<class 'numpy.float64'>'
with 135385 stored elements in Compressed Sparse Row format>
Thanks a lot! I had previously done some vectorized assembly calculations (with a loop over the local basis functions) and remember that the time wasn't that huge. But that was a linear problem in 2D.
Here is something:
def vdet(A):
    detA = A[0, 0] * (A[1, 1] * A[2, 2] -
                      A[1, 2] * A[2, 1]) -\
        A[0, 1] * (A[1, 0] * A[2, 2] -
                   A[1, 2] * A[2, 0]) +\
        A[0, 2] * (A[1, 0] * A[2, 1] -
                   A[1, 1] * A[2, 0])
    return detA

def vinv(A):
    invA = np.zeros_like(A)
    detA = vdet(A)
    invA[0, 0] = (-A[1, 2] * A[2, 1] +
                  A[1, 1] * A[2, 2]) / detA
    invA[1, 0] = (A[1, 2] * A[2, 0] -
                  A[1, 0] * A[2, 2]) / detA
    invA[2, 0] = (-A[1, 1] * A[2, 0] +
                  A[1, 0] * A[2, 1]) / detA
    invA[0, 1] = (A[0, 2] * A[2, 1] -
                  A[0, 1] * A[2, 2]) / detA
    invA[1, 1] = (-A[0, 2] * A[2, 0] +
                  A[0, 0] * A[2, 2]) / detA
    invA[2, 1] = (A[0, 1] * A[2, 0] -
                  A[0, 0] * A[2, 1]) / detA
    invA[0, 2] = (-A[0, 2] * A[1, 1] +
                  A[0, 1] * A[1, 2]) / detA
    invA[1, 2] = (A[0, 2] * A[1, 0] -
                  A[0, 0] * A[1, 2]) / detA
    invA[2, 2] = (-A[0, 1] * A[1, 0] +
                  A[0, 0] * A[1, 1]) / detA
    return invA, detA
and then use Finv, J = vinv(F) in jacobianPK. It gives
In [1]: %timeit asm(jac, iBasis, w=w)
3.89 s ± 491 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
unless I made some grave mistake.
The rest of the time comes mainly from moving data between low-level and high-level code because this line has quite a bit of operations:
C = mu * dFdF + mu * dFinvdF -\
    lmbda * J * (J - 1) * dFinvdF +\
    lmbda * (2*J - 1) * J * np.einsum("ji...,lk...->ijkl...", Finv, Finv)
A package called numexpr might help here but not sure.
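(A hedged sketch of what numexpr could do for the scalar coefficient of dFinvdF; whether it helps the 4-tensor terms is unclear:)

import numexpr as ne

# evaluate (mu - lmbda * J * (J - 1)), the combined coefficient of dFinvdF,
# in one pass over the quadrature-point arrays (mu and lmbda are plain floats)
coeff = ne.evaluate("mu - lmbda * J * (J - 1)")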
Btw, I think
@BilinearForm
def jac(u, v, w):
    return ddot(grad(v), ddot(jacobianPK(w["w"]), grad(u)))
is not what you're thinking: ddot is not detecting that jacobianPK is a 4-tensor. I'll try to propose a fix.
Maybe this is right?
@BilinearForm
def jac(u, v, w):
    return np.einsum('ijkl...,ij...,kl...', jacobianPK(w["w"]), grad(u), grad(v))
I will check these. Really appreciate your help on this.
Maybe this is right?
@BilinearForm
def jac(u, v, w):
    return np.einsum('ijkl...,ij...,kl...', jacobianPK(w["w"]), grad(u), grad(v))
This is exactly what I had in mind when trying to write the form. Even though I checked the skfem.helpers.py file, I didn't check ddot carefully enough. Thanks so much!
The rest of the time comes mainly from moving data between low-level and high-level code because this line has quite a bit of operations:
I see, thanks! Will also try experimenting with the optimize flag in einsum as well as comparing with opt_einsum.
I implemented the changes in the code, and it seems to work. This is without the body forces and surface tractions (purely Dirichlet conditions). Will post the complete (translated) demo on a refined mesh once I implement surface tractions (non-homogeneous Neumann conditions) with some sanity checks.
I am currently running one with ~21k dofs (original system), and it takes a while to assemble. Will keep experimenting with this over the weekend. Hopefully next week I will finish fine-tuning this as well as move on to incompressible elasticity.
So I tried again, this time explicitly assembling the Jacobian with opt_einsum's contract (hoping to get some speedup), but the results don't change much (einsum itself is optimized, I think). Assembly is still the bottleneck, with Newton also stalling after ~30 iterations (I tried full as well as damped Newton).
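(For reference, the swap is a drop-in replacement; a sketch assuming the operands from jacobianPK above, with a name chosen only for illustration:)

from opt_einsum import contract

dJterm = contract("ji...,lk...->ijkl...", Finv, Finv)  # same semantics as np.einsum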
Do Jacobian-free Newton–Krylov methods work well on this class of problems?
Thanks a lot! This was a great suggestion. It seems to be performing significantly faster (and rightly so) than the vanilla Newton solve that I tried above (for larger problems). I replaced the Newton iteration loop with the following (as mentioned in #442)
def nonlinearResidual(u: np.ndarray) -> np.ndarray:
    res = asm(rhs, iBasis, w=iBasis.interpolate(u))
    res[D] = 0.
    return res

u = root(nonlinearResidual, u, method='krylov', options={
    'disp': True, "line_search": "wolfe",
}).x
import pygmsh as pm
import numpy as np
from skfem.io import from_meshio
from skfem.helpers import ddot, grad, dot, transpose
from math import pi
from numba import jit
from opt_einsum import contract
import warnings, meshio
from scipy.optimize import root
from scipy.optimize.nonlin import BroydenFirst, KrylovJacobian, InverseJacobian
# warnings.filterwarnings('ignore')
from skfem import *

# define inverse:
@jit(cache=True, nopython=True, nogil=True, parallel=True)
def vdet(A):
    detA = A[0, 0] * (A[1, 1] * A[2, 2] -
                      A[1, 2] * A[2, 1]) -\
        A[0, 1] * (A[1, 0] * A[2, 2] -
                   A[1, 2] * A[2, 0]) +\
        A[0, 2] * (A[1, 0] * A[2, 1] -
                   A[1, 1] * A[2, 0])
    return detA

@jit(cache=True, nopython=True, nogil=True, parallel=True)
def vinv(A):
    invA = np.zeros_like(A)
    detA = vdet(A)
    invA[0, 0] = (-A[1, 2] * A[2, 1] +
                  A[1, 1] * A[2, 2]) / detA
    invA[1, 0] = (A[1, 2] * A[2, 0] -
                  A[1, 0] * A[2, 2]) / detA
    invA[2, 0] = (-A[1, 1] * A[2, 0] +
                  A[1, 0] * A[2, 1]) / detA
    invA[0, 1] = (A[0, 2] * A[2, 1] -
                  A[0, 1] * A[2, 2]) / detA
    invA[1, 1] = (-A[0, 2] * A[2, 0] +
                  A[0, 0] * A[2, 2]) / detA
    invA[2, 1] = (A[0, 1] * A[2, 0] -
                  A[0, 0] * A[2, 1]) / detA
    invA[0, 2] = (-A[0, 2] * A[1, 1] +
                  A[0, 1] * A[1, 2]) / detA
    invA[1, 2] = (A[0, 2] * A[1, 0] -
                  A[0, 0] * A[1, 2]) / detA
    invA[2, 2] = (-A[0, 1] * A[1, 0] +
                  A[0, 0] * A[1, 1]) / detA
    return invA, detA

def firstPKStress(u):
    F = np.zeros_like(grad(u))
    F[0, 0] += 1.
    F[1, 1] += 1.
    F[2, 2] += 1.
    F += grad(u)
    Finv, J = vinv(F)
    return mu * F - mu * transpose(Finv) + lmbda * J * (J - 1) * transpose(Finv)

geom = pm.opencascade.Geometry(1.e-2, 5.e-2)  # 5.e-3, 1.e-2
box = geom.add_box([0., 0., 0.], [1., 1., 1.])
msh = pm.generate_mesh(geom, geo_filename="box.geo")
mesh = from_meshio(msh)
elem = ElementTetP1()
uelem = ElementVectorH1(elem)
iBasis = InteriorBasis(mesh, uelem, intorder=2)
fBasis = FacetBasis(mesh, uelem, intorder=2)
u = np.zeros(iBasis.N)  # this takes care of dimension

# materialParams and init
bodyForce = np.array([0., -1./2, 0])
surfaceTraction = np.array([0.1, 0, 0])
E, nu = 10., 0.3
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)

dofs = {
    "left": iBasis.get_dofs(lambda x: x[0] == 0),
    "right": iBasis.get_dofs(lambda x: x[0] == 1.)
}

# assign DirichletBC
# variables used in the FEniCS demo
scale = y0 = z0 = 0.5
theta = pi/3.
u1Right = 0.
u2Right = lambda x, y, z: scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u3Right = lambda x, y, z: scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)
u[dofs["left"].nodal['u^1']] = 0.
u[dofs["left"].nodal['u^2']] = 0.
u[dofs["left"].nodal['u^3']] = 0.
u[dofs["right"].nodal['u^1']] = 0.
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2'])
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])
I = iBasis.complement_dofs(dofs)
D = iBasis.complement_dofs(I)

# scaling factor bta for Newton's method
bta = 0.7

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(w["w"]), grad(v))  # + dot(bodyForce, v)

def nonlinearResidual(u: np.ndarray) -> np.ndarray:
    res = asm(rhs, iBasis, w=iBasis.interpolate(u))
    res[D] = 0.
    return res

u = root(nonlinearResidual, u, method='krylov', options={
    'disp': True, "line_search": "wolfe",
}).x

meshio.xdmf.write("trialFEniCS.xdmf", meshio.Mesh(
    mesh.p.T, cells={"tetra": mesh.t.T}, point_data={"u": u[iBasis.nodal_dofs].T}
))
Eventually even this stalls
...
...
33: |F(x)| = 0.0559137; step 0.01
34: |F(x)| = 0.0554543; step 0.01
35: |F(x)| = 0.0549778; step 0.01
36: |F(x)| = 0.0547684; step 0.01
37: |F(x)| = 0.0551101; step 0.01
38: |F(x)| = 0.0554195; step 0.01
39: |F(x)| = 0.0557679; step 0.01
I will keep iterating on this (especially since the solver could be fine-tuned), but it seems that overall scikit-fem can very well handle nonlinear variational problems in mechanics involving the minimization of non-quadratic functionals, up to the point where NumPy itself runs out of options.
I will also look into the installation of numexpr, as it did not seem distinctly clear (I tried cloning and the usual python3 setup.py build technique).
Thanks for your help!
Very interesting. Thank you for following up and posting this. I'm in the country for a few days so can't go into detail, but in brief, besides kinnala's suggestion of the importance of preconditioning, I wonder whether it's a question of the basin of convergence. Perhaps the problem might be revised to be less nonlinear and then the original problem chased by natural parameter continuation. I used the external pacopy.natural for this for the steady Navier–Stokes equation in ex27. That function would need to be modified to work with scipy.optimize.root, but I've been meaning to do that anyway. Does that make sense? Sorry I can't go into better detail at the moment but I'll be back behind a computer tomorrow.
Thanks!
Well, in a final try to compare apples with apples, I used the same mesh as I was using in dolfin, and it seems the mesh I had generated was really bad.
> python3 demo_hyperelasticityFEniCS.py
Solving nonlinear variational problem.
Newton iteration 0: r (abs) = 3.718e+01 (tol = 1.000e-10) r (rel) = 1.000e+00 (tol = 1.000e-09)
Newton iteration 1: r (abs) = 1.296e-01 (tol = 1.000e-10) r (rel) = 3.487e-03 (tol = 1.000e-09)
Newton iteration 2: r (abs) = 1.373e-02 (tol = 1.000e-10) r (rel) = 3.693e-04 (tol = 1.000e-09)
Newton iteration 3: r (abs) = 1.096e-03 (tol = 1.000e-10) r (rel) = 2.949e-05 (tol = 1.000e-09)
Newton iteration 4: r (abs) = 3.856e-06 (tol = 1.000e-10) r (rel) = 1.037e-07 (tol = 1.000e-09)
Newton iteration 5: r (abs) = 2.597e-10 (tol = 1.000e-10) r (rel) = 6.985e-12 (tol = 1.000e-09)
Newton solver finished in 5 iterations and 5 linear solver iterations.
scikit-fem (using scipy.optimize.root instead of assembling the Jacobian):
> python3 hyperElasticity.py
0: |F(x)| = 0.152248; step 1
1: |F(x)| = 0.17018; step 1
2: |F(x)| = 0.0517355; step 0.362538
3: |F(x)| = 0.055933; step 1.06933
4: |F(x)| = 0.0655245; step 1
5: |F(x)| = 0.0254993; step 0.505042
6: |F(x)| = 0.0112387; step 1
7: |F(x)| = 0.00586981; step 0.505111
8: |F(x)| = 0.00299674; step 0.530948
9: |F(x)| = 0.00193922; step 1
10: |F(x)| = 6.8516e-05; step 1
11: |F(x)| = 4.7929e-06; step 1
The results are much better now! :-)
That function would need to be modified to work with scipy.optimize.root, but I've been meaning to do that anyway. Does that make sense? Sorry I can't go into better detail at the moment but I'll be back behind a computer tomorrow.
Thanks a lot. No problem, please take your time. I will also explore adding the body forces and surface traction, and hopefully by next weekend will have a complete working hyperelasticity demo. By the way, the above result is with a mesh comprising Points: 7225, Cells: 36864, so it is actually pretty good. No hurry at all. The fact that this turned out encouraging is a really good start. I will explore pacopy too.
Have a good weekend!
using scipy.optimize.root instead of assembling the Jacobian
Note that although root (#119) does JFNK by default, I believe it will use the Jacobian if provided. I'll see if I can demonstrate that in another take on ex10.
Thanks a lot!
I believe it will use the Jacobian if provided.
I can give this a try. Do you think that passing the jacobian to scipy.optimize.root and letting it solve the nonlinear system would be significantly faster than the current Jacobian-free newton_krylov? Since assembling the jacobian (as in the original code) was a bottleneck for this problem, I thought that if a Jacobian-free method would work (unless it converges horribly or stalls completely) it might be worth using it instead.
Note that although root (#119) does JFNK by default
Just read the discussion carefully in #119. I get the same error.
TypeError: fsolve: there is a mismatch between the input and output shape of the 'fprime' argument 'jacobian'.Shape should be (21675, 21675) but it is (1,).
It seems scipy.optimize.least_squares supports sparse Jacobians. I will try it to see if it makes any difference.
Edit: (I think this stalls too, which makes me wonder if there is an issue with my jacobian -- will check this)
Iteration Total nfev Cost Cost reduction Step norm Optimality
0 1 3.1257e+00 1.27e+04
1 2 7.6182e-01 2.36e+00 9.73e-01 3.84e+03
2 3 7.1829e-01 4.35e-02 1.55e-01 2.51e+03
....
....
....
15 19 2.5323e-01 2.23e-03 1.19e-03 2.98e+03
16 20 2.5077e-01 2.46e-03 1.19e-03 2.54e+03
17 21 2.4803e-01 2.75e-03 1.19e-03 1.75e+03
I just had one more question regarding the implementation of a non-homogeneous Neumann boundary condition. In the weak form, does this look correct?
@LinearForm
def rhs(v, w):
    x = w.x
    return np.einsum("ij...,ij...", firstPKStress(w["w"]), grad(v)) - dot(bodyForce, v) - dot(surfaceTraction, v) * (restBoundary(x))
where the function restBoundary would be something like
def restBoundary(x):
    """
    returns `True` if not on the left/right face
    for application of surface traction
    """
    topBottom = np.logical_or(
        x[1] == 0., x[1] == 1.
    )
    frontBack = np.logical_or(
        x[2] == 0., x[2] == 1.
    )
    leftRight = np.logical_or(
        x[0] == 0., x[0] == 1.
    )
    return np.logical_and(np.logical_or(topBottom, frontBack), np.logical_not(leftRight))
I guess a last question (sorry for so many of these at once) that is still not clear to me is the use of L2_projection to assign non-homogeneous Dirichlet boundary conditions. In this case the right face (x=1) is twisted by an angle (pi/3) such that the displacements are
scale = y0 = z0 = 0.5
theta = pi/3.
u^1 = 0
u^2 = scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u^3 = scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)
As usual I created the corresponding lambda functions, but what is not clear to me is which basis (InteriorBasis or FacetBasis) to use to extract the corresponding dofs and consequently use in L2_projection. I went ahead with InteriorBasis, which was used in the linear elasticity example to assign Dirichlet BCs.
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2'])
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])
But I would definitely like to understand better which ones to use and when. By the way, when I tried switching to FacetBasis there were convergence issues with scipy.optimize.root.
Perhaps this will also help me explain a discrepancy that I see in the deformed configuration (corners of the cube at x=1 are distorted).
Since the right face is simply rotated, I don't expect the corners to warp out. I will quantitatively explore this by printing out the dofs at those locations. Also, for reference, this is what the dolfin solution looks like
Just read the discussion carefully in #119. I get the same error.
Oh, no! Isn't that awful. I forgot about that, sorry. I still can't quite believe anyone would have bothered implementing a Newton–Krylov solver without sparsity! Sorry.
What about jac_options['method'] (in https://docs.scipy.org/doc/scipy/reference/optimize.root-krylov.html)?
Krylov method to use to approximate the Jacobian. Can be a string, or a function implementing the same interface as the iterative solvers in scipy.sparse.linalg.
Or jac_options['inner_M']?
Sorry, I'll look into these in due course; don't bother about them now.
To answer the prior question:
Do you think that by passing the jacobian to scipy.linalg.root and letting it solve the nonlinear system be significantly faster than the current jacobian free newton_krylov ? Since assembling the jacobian (as in the original code) was a bottleneck for this problem, I thought that if a jacobian free method would work (unless it converges horribly or stalls completely) it might be worth using it instead.
Really of course it would depend on all kinds of things, but what I was thinking was that the direct rather than Krylov-iterative solution of the Newton-increment equation might be faster in some regimes of size. That would mean passing the Jacobian to scipy.optimize.root and perhaps not using method='krylov', or passing the actual assembled Jacobian through somehow (jac_options['inner_M']?) so that the Krylov iteration for the Newton increment always converged in one step.
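(A hedged sketch of that inner_M idea, reusing J and nonlinearResidual from the code above: wrap a prefactorized Jacobian as a preconditioner so that the inner Krylov iteration converges almost immediately.)

from scipy.optimize import root
from scipy.sparse.linalg import LinearOperator, factorized

Jsolve = factorized(J.tocsc())              # prefactorize the assembled Jacobian
M = LinearOperator(J.shape, matvec=Jsolve)  # acts as J^{-1}
sol = root(nonlinearResidual, u, method="krylov",
           options={"jac_options": {"inner_M": M}})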
Sorry, this tangent should probably be split off into a separate issue.
Oh, no! Isn't that awful. I forgot about that, sorry. I still can't quite believe anyone would have bothered implementing a Newton–Krylov solver without sparsity! Sorry.
What about jac_options['method'] (in https://docs.scipy.org/doc/scipy/reference/optimize.root-krylov.html)?
No problem! Thanks a lot for your prompt reply. In fact I forgot to mention that I tried passing the jacobian according to the instructions on the main page
@BilinearForm
def jacNewton(u, v, w):
    return np.einsum('ijkl...,ij...,kl...', jacobianPK(w["w"]), grad(u), grad(v))

def jacobian(u: np.ndarray) -> np.ndarray:
    jacb = asm(jacNewton, iBasis, w=iBasis.interpolate(u))
    return jacb

# JFNK
u = root(residual, u, method='krylov', jac=jacobian, options={
    'disp': True, "line_search": "wolfe"
}).x
and it gives me the same result as above.
If instead I try using the syntax as on the linked page, specifically for method="krylov", via
u = root(
    residual, u, method="krylov", options={
        "ftol": 1.e-8, "line_search": "wolfe",
        "jac_options": {
            "method": jacobian
        }
    }
).x
it raises
TypeError: jacobian() got an unexpected keyword argument 'tol'
which is weird, as I don't specify any keyword tol. I will take a fresh look at the syntax tomorrow.
Really of course it would depend on all kinds of things, but what I was thinking was that the direct rather than Krylov-iterative solution of the Newton-increment equation might be faster in some regimes of size.
Agreed, absolutely!
That would mean passing the Jacobian to scipy.optimize.root and perhaps not using method='krylov', or passing the actual assembled Jacobian through somehow (jac_options['inner_M']?) so that the Krylov iteration for the Newton increment always converged in one step.
I see. I had tried passing the Jacobian and not specifying method="krylov", which defaults to method="hybr", and it doesn't make any difference. The only parameter that makes or breaks convergence is the line_search (the default armijo breaks and wolfe makes it converge). I don't know the details of how each of these options implements the line search, although my hunch is that "wolfe" (after its name) looks for a step length alpha that satisfies the Wolfe conditions on the norm of the residual.
Thanks for your help! No hurry in responding to these queries. I will experiment with this code more this week as I get time. For eventual solutions of larger problems I also plan to take a look at petsc4py and any matrix-free nonlinear solvers therein.
TypeError: jacobian() got an unexpected keyword argument 'tol'
Maybe that means that SciPy is calling your function jacobian with a keyword argument tol.
As usual I created the corresponding lambda functions, but what is not clear to me is which basis (InteriorBasis or FacetBasis) to use to extract the corresponding dofs and consequently use in L2_projection. I went ahead with InteriorBasis, which was used in the linear elasticity example to assign Dirichlet BCs.
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2'])
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])
But I would definitely like to understand better which ones to use and when.
I'd perform the projection using FacetBasis if setting boundary conditions.
There is a short discussion on this in the upcoming docs that have not yet been merged to master: https://github.com/kinnala/scikit-fem/pull/443/files#diff-0504026760cec473fa1293576d93538fR238
I think the use of InteriorBasis explains the strange behavior at the corners.
Edit: When using Lagrange elements (such as ElementTetP1) you can also simply define the displacement at the DOF location instead of using L2_projection. Using L2_projection is mainly useful for more complicated elements.
Edit 2: Something like
Dright = dofs["right"].nodal['u^2'].all()
u[Dright] = u2Right(*iBasis.dofslocs[:, Dright])
Since assembling the jacobian (as in the original code) was a bottleneck for this problem, I thought that if a jacobian free method would work (unless it converges horribly or stalls completely) it might be worth using it instead.
It's also possible to do a quasi-Newton method where you update the Jacobian only once in a while, e.g., every 10 or 20 iterations or whatever gives enough savings.
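(A minimal sketch of that idea, reusing the names from the Newton loop earlier in this thread; the reassembly period reuse is arbitrary:)

reuse = 10
for itr in range(100):
    w = iBasis.interpolate(u)
    if itr % reuse == 0:
        J = asm(jac, iBasis, w=w)  # expensive: reassemble only every `reuse` iterations
    F = asm(rhs, iBasis, w=w)
    u_prev = u.copy()
    u += solve(*condense(J, -F, I=I))
    if np.linalg.norm(u - u_prev) < 1e-8:
        break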
Technically, linear solve time scales something like O(N^3) and assembly only as O(N) (here N is the number of DOFs), so for a large enough problem the solve should always dominate. But this particular example seems, counter-intuitively, to be complicated enough (assembly-wise: many floating-point operations inside the form evaluation) yet still small enough (linear system size) that the assembly dominates here almost 10-fold.
Edit 2: Something like
Dright = dofs["right"].nodal['u^2'].all()
u[Dright] = u2Right(*iBasis.dofslocs[:, Dright])
Thanks so much. This was the part I was missing. By the way, I think it should be
Dright = dofs["right"].nodal['u^2']
u[Dright] = u2Right(*iBasis.dofslocs[:, Dright])
Now it (left) perfectly matches the solution obtained via FEniCS (right). More quantitative comparisons to follow later in the week.
It's also possible to do a quasi-Newton method where you update the Jacobian only once in a while, e.g., every 10 or 20 iterations or whatever gives enough savings.
I see. I can try parameterizing the loading using a time-like variable and then, at each time step, updating the Jacobian once and using it for all Newton iterations in that step. That's a nice suggestion.
Technically, linear solve time scales something like O(N^3) and assembly only as O(N) (here N is the number of DOFs), so for a large enough problem the solve should always dominate. But this particular example seems, counter-intuitively, to be complicated enough (assembly-wise: many floating-point operations inside the form evaluation) yet still small enough (linear system size) that the assembly dominates here almost 10-fold.
I agree completely that for larger problems the solve would always dominate the assembly (which is why I was wondering whether using PETSc (via petsc4py) would be any better for such cases). I think that the nonlinear problem, even though complicated, represents the simplest constitutive model in compressible hyperelasticity. And the fact that it worked out just fine (in a reasonable time) makes this more of an illustrative example than a performant code.
Thanks both of you for your help. This discussion helped my understanding of scikit-fem improve a lot.
Thanks both of you for your help. This discussion helped my understanding of scikit-fem improve a lot.
No problem! In case you finish your work on this example, we're glad to include it in our documentation (https://scikit-fem.readthedocs.io/en/latest/listofexamples.html), if you're willing to contribute the code that is.
I think I could help improve skfem.helpers in the same go, so that we don't need to have det and inv defined in the code example as such.
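(One shape-agnostic possibility, sketched with np.moveaxis so that the physical indices land last for np.linalg; an illustration, not the eventual skfem API:)

import numpy as np

def det(A):
    return np.linalg.det(np.moveaxis(A, (0, 1), (-2, -1)))

def inv(A):
    return np.moveaxis(np.linalg.inv(np.moveaxis(A, (0, 1), (-2, -1))),
                       (-2, -1), (0, 1))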
I just had one more question regarding the implementation of a non-homogeneous Neumann boundary condition. In the weak form, does this look correct?
I think you need to integrate the tractions over the boundary of the domain. This means that you need to define a separate form and assemble it using a FacetBasis which is initialized over the respective part of the boundary (e.g. FacetBasis(mesh, element, facets=m.facets_satisfying(...))).
Edit: The form could be something like:
@LinearForm
def traction(v, w):
    return dot(surfaceTraction, v)

fBasis = FacetBasis(mesh, element, facets=m.facets_satisfying(restBoundary))
g = asm(traction, fBasis)
Edit 2: This is also equivalent to integrating over the entire boundary and using restBoundary in the form:
@LinearForm
def traction(v, w):
    return dot(surfaceTraction, v) * restBoundary(w.x)

fBasis = FacetBasis(mesh, element)
g = asm(traction, fBasis)
Edit 2: This is also equivalent to integrating over the entire boundary and using restBoundary in the form:
Perfect! Thanks a lot! I was previously assembling this with InteriorBasis, with the form multiplied by a conditional restBoundary(w.x). I will change this.
No problem! In case you finish your work on this example, we're glad to include it in our documentation (https://scikit-fem.readthedocs.io/en/latest/listofexamples.html), if you're willing to contribute the code that is.
Sure, absolutely! After making these changes, I will clean up the code and add a few blurbs about the problem (feel free to suggest what should/shouldn't be in it) and then comment here once that is done. I can create a PR if you like; either way is fine. Thanks a lot!
More quantitative comparisons to follow later in the week.
A comparison of the dofs of the solution vector from scikit-fem and FEniCS. I don't know if a better measure could be used to check the correctness of the implementation in scikit-fem (feel free to suggest if you think otherwise):
>>> np.linalg.norm(disps.point_data["u"] - solFEniCS.point_data["uFE"], axis=1).max()
>>> 3.2284499580733357e-10
That looks good.
I was stumped for a while in an earlier quantitative comparison effort to reproduce the linear elastic FEniCS tutorial example in gdmcbain/fenics-tuto-in-skfem#3. The initial quite marked discrepancy was due to large error on the FEniCS side, it seeming much more sensitive to the coarseness of the mesh for some reason; it went away when the mesh was refined. Anyway it doesn't look as though there's any similar difficulty here.
Part of the motivation for having those examples in that other repo rather than here was that I wasn't sure about the copyright or licensing status of the FEniCS originals; I'm not a lawyer.
@bhaveshshrimali (from #438):