weak forms for hyperelasticity

gdmcbain commented 4 years ago

@bhaveshshrimali (from #438):

Thanks! I was starting to take a look at implementing the hyperelasticity demo. The most natural examples to follow seem to be linear elasticity, nonlinear poisson and laplace with inhomogeneous bcs.

What would be the best place to look for functions like log, inv, det (like in UFL) when writing the weak forms? For instance the stress would have an expression that looks like the following in UFL:
def firstPKStress(u):
F = Identity(len(u)) + grad(u) 
J = det(F)
return mu * F - mu * inv(F).T + J * (J-1) * inv(F).T
I can see the helper functions eye and grad, which should help in defining F as eye(1, u.shape[0]) + grad(u), but how about the determinant (det) and inverse (inv) ?

bhaveshshrimali commented 4 years ago

Thanks a lot for keeping the discussion organized. Will open a separate issue (so that the main topic remains as is) for each topic/sub-topic in the future :)

gdmcbain commented 4 years ago

Usually these kinds of operations are just done with NumPy, e.g. numpy.einsum. The physical indices (indexing over component or dimension) usually precede the numerical indices (indexing over element or quadrature point). Most of the NumPy operations do the right thing in broadcasting over the numerical indices, but, not all of them. I remember having a difficulty with one of them recently...

gdmcbain commented 4 years ago

Yeah, does numpy.linalg.inv work if the ndarray indices are shuffled around so that the ones over component are at the end?

bhaveshshrimali commented 4 years ago

Indeed it does! (never knew numpy.linalg.inv could handle higher dimensional arrays). So this means that inverse would then be:

from numpy import linalg as nla
def inv(T):
    reshapedT = np.einsum("ijk->kij",T)   # assuming that T is of shape `ndim x ndim x numQuadraturePoints` ?
    return np.einsum("ijk->kij", nla.inv(reshapedT))

and I think det does too

kinnala commented 4 years ago

Please note that since this is a nonlinear problem and we don't (yet?) have anything similar to derivative in FEniCS you have to either

calculate the Jacobian of the Piola-Kirchhoff stress by hand
try to use SumPy/Mathematica or similar symbolic engine to help in obtaining the Jacobian
try to apply something like JAX to do it in an automatized manner

In the past, I have done options 1 and 2 for hyperelasticity and tried option 3 for a simpler test problem. At some point I had a goal of encapsulating option 3 into scikit-fem so that it's simple to use but it is currently on hold as I've been focusing on other things.

gdmcbain commented 4 years ago

Do Jacobian-free Newton–Krylov methods work well on this class of.problems?

kinnala commented 4 years ago

Do Jacobian-free Newton–Krylov methods work well on this class of.problems?

Oh yes, that's another option. You might need to do some work though to find a good preconditioner but I suppose small problems should work pretty well?

bhaveshshrimali commented 4 years ago

calculate the Jacobian of the Piola-Kirchhoff stress by hand

This was my plan. I had done this previously for a simple hyperelasticity problem sometime back. The idea was to use SymPy to calculate the derivatives with respect to the invariants and then lambdify the the result.

try to apply something like JAX to do it in an automatized manner

Interesting. Although I could never manage to install jax on a windows machine in the past. Might be worthwhile to follow up on this.

gdmcbain commented 4 years ago

You might need to do some work though to find a good preconditioner but I suppose small problems should work pretty well?

Yes, I think so. A (good or otherwise) preconditioner is not a strict requirement for Jacobian-free Newton–Krylov, as shown for ex10 when demonstrating scipy.optimize.root as a nonlinear solver #119. That example is a small problem but root(method='krylov') works really well for it. I think this is just using unpreconditioned lgmres. I think JFNK is a good first option for preliminary exploration of nonlinear problems on small meshes. I'll post the branch.

bhaveshshrimali commented 4 years ago

@gdmcbain @kinnala , so I got some time today to start re-implementing the FEniCS demo in scikit-fem I just wanted to make sure that I understand the syntax correctly to assemble the rhs and jacobian. For now (with the analytical jacobian), I have


import pygmsh as pm
import numpy as np
from skfem.io import from_meshio
from skfem.helpers import ddot, grad, eye, dot, transpose
from math import pi
from skfem import *

def vinv(T):
    reshapedT = np.einsum("ijk->kij",T)
    return np.einsum("ijk->kij", np.linalg.inv(reshapedT))

def vdet(T):
    reshapedT = np.einsum("ijk->kij",T)
    return np.einsum("ijk->kij", np.linalg.det(reshapedT))

def firstPKStress(u):
    F = eye(1., u.shape[0]) + grad(u)
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

# TODO: check the expression once again
def jacobianPK(u):
    F = eye(1., u.shape[0]) + grad(u)
    J = vdet(F)
    Finv = vinv(F)
    dFdF = np.einsum("ik...,jl...->ikjl...", eye(1., u.shape[0]), eye(1., u.shape[0]))
    dFinvdF = -np.einsum("ik...,jl...->ijkl",Finv, Finv)
    C = mu * dFdF - mu * dFinvdF - lmbda * J * (J- 1) * dFinvdF + (2*J - 1) * J * np.einsum("ji...,lk...->ijkl",Finv, Finv)
    return C

geom = pm.opencascade.Geometry(1.e-1, 5.e-1)
box = geom.add_box([0., 0., 0.], [1., 1., 1.])
msh = pm.generate_mesh(geom, dim=3)

# import mesh in skfem
mesh = from_meshio(msh)
elem = ElementTetP1()
uelem = ElementVectorH1(elem)
iBasis = InteriorBasis(mesh, uelem, intorder=3)
fBasis = FacetBasis(mesh, uelem, intorder=3)
u = np.zeros(iBasis.N) # this takes care of dimension

# materialParams and init
bodyForce = np.array([0., -1./2, 0])
E, nu = 10., 0.3
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)

dofs = {
    "left": fBasis.get_dofs(lambda x: np.isclose(x[0], 0.)),
    "right": fBasis.get_dofs(lambda x: np.isclose(x[0], 1.))
}

I = iBasis.complement_dofs(dofs)

# assign DirichletBC
# variables used in the FEniCS demo
scale = y0 = z0 = 0.5
theta = pi/3.

u1Right = 0.
u2Right = lambda x,y,z: scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u3Right = lambda x,y,z: scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)

u[dofs["left"].nodal['u^1']] = 0.
u[dofs["left"].nodal['u^2']] = 0.
u[dofs["left"].nodal['u^3']] = 0.
u[dofs["right"].nodal['u^1']] = u1Right
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, fBasis, dofs["right"].nodal['u^2'])
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, fBasis, dofs["right"].nodal['u^3'])

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(v), grad(w["w"])) #+ dot(bodyForce, w["w"])

@BilinearForm
def jac(u, v, w):
    return ddot(ddot(grad(v), jacobianPK(u)), grad(w["w"]))

for itr in range(100):
    w = iBasis.interpolate(u)
    J = asm(jac, iBasis, w=w)  # not able to assemble properly
    F = asm(rhs, iBasis, w=w)
    u_prev = u.copy()
    u += solve(*condense(J, -F, I=I))  # try a full Newton Step
    if np.linalg.norm(u - u_prev) < 1e-8:
        break
    if __name__ == "__main__":
        print(np.linalg.norm(u - u_prev))

I think I am missing something when trying to correctly declare the linear and bilinear forms. I will probably take a fresh look later in the day tomorrow at how eye works. However, if you happen to spot something obvious, please let me know. Thanks!

This is the traceback,

F = eye(1., u.shape[0]) + grad(u)
AttributeError: 'DiscreteField' object has no attribute 'shape'

kinnala commented 4 years ago

I'm currently writing some additional Sphinx documentation on the subject "Anatomy of forms". I hope to add some tips for defining forms.

AttributeError: 'DiscreteField' object has no attribute 'shape'

A short explanation: parameters u and v in the form definition are DiscreteField objects, basically a tuple of ndarray's: https://github.com/kinnala/scikit-fem/blob/master/skfem/element/discrete_field.py#L7. Anyhow, they don't have a property shape.

Edit: I'd have to take a more careful look at the forms to propose a fix. I can look at it later when I have more time.

kinnala commented 4 years ago

So mathematically grad(u) is 2-tensor and you want to add ones to the diagonal.

You can get forwards by replacing

F = eye(1., u.shape[0]) + grad(u)

with

F = grad(u)
F[0, 0] += 1
F[1, 1] += 1
F[2, 2] += 1

I think skfem.helpers.eye is probably not the most useful function for this use case in it's current form. I think we need to consider replacing it with something better. I cannot recall why it was written that way.

kinnala commented 4 years ago

Maybe a more relevant helper for this purpose would be

def eye(u):
    v = np.zeros_like(u)
    for i in range(v.shape[0]):
        v[i, i] += 1
    return v

But this is good discussion also on what kind of helpers we should have in skfem.helpers and how to improve it.

bhaveshshrimali commented 4 years ago

Thanks a lot for the help! I added the ellipsis to a couple of more places (after printing grad(u).shape)

def vinv(T):
    reshapedT = np.einsum("ij...->...ij",T)
    return np.einsum("...ij->ij...", np.linalg.inv(reshapedT))

and the same for determinant. And also changed to using your second suggestion.

def firstPKStress(u):
    F = grad(u)
    for i in range(mesh.p.shape[0]):
        F[i,i] += 1.
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

and in the jacobian. Everything seems to be fine at least syntax-wise (maybe I am missing something). Will take a careful look at the forms and bcs later in the day. Thanks so much for your help!!

bhaveshshrimali commented 4 years ago

Corrected (?) the forms (still haven't implemented the non-homogeneous Neumann condition). Will clean up the code in some time.


def firstPKStress(u):
    F = grad(u)
    for i in range(3):
        F[i,i] += 1.
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

def jacobianPK(u):
    F = grad(u)
    eye = np.zeros_like(F)
    for i in range(3):
        F[i,i] += 1.
        eye[i,i] += 1.
    J = vdet(F)
    Finv = vinv(F)
    dFdF = np.einsum("ik...,jl...->ijkl...", eye, eye)
    dFinvdF = np.einsum("jk...,li...->ijkl...", Finv, Finv)
    C = mu * dFdF + mu * dFinvdF
    - lmbda * J * (J- 1) * dFinvdF
    + lmbda * (2*J - 1) * J * np.einsum("ji...,lk...->ijkl...",Finv, Finv)
    return C

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(w["w"]), grad(v)) + dot(bodyForce, v)

@BilinearForm
def jac(u, v, w):
    return ddot(ddot(grad(u), jacobianPK(w["w"])), grad(v))

However I noticed that assembling the bilinear form with ~7k dofs (precisely Points: 2290, Cells: 10191) takes about 3min on my laptop (will do precise profiling and report in some time). I suspect I may be doing something inefficiently in the jacobianPK function that affects the assembly.

kinnala commented 4 years ago

assembling the bilinear form with ~7k dofs (precisely Points: 2290, Cells: 10191) takes about 3min on my laptop

Can you post your full code including vdet and vinv so we can check it out?

kinnala commented 4 years ago

The implementation of vinv looks a bit suspicious to me as I'm not sure how the dimensions would go if you wanted to use np.linalg.inv for the inversion. Another option is to write the inverse explicitly using Cramer's rule as is done here: https://github.com/kinnala/scikit-fem/blob/master/skfem/mapping/mapping_affine.py#L49.

bhaveshshrimali commented 4 years ago

assembling the bilinear form with ~7k dofs (precisely Points: 2290, Cells: 10191) takes about 3min on my laptop

Can you post your full code including vdet and vinv so we can check it out?

Sure (and thanks!). It is basically the full snippet that I posted above with the changes to the functions:


import pygmsh as pm
import numpy as np
from skfem.io import from_meshio
from skfem.helpers import ddot, grad, dot, transpose
from math import pi
from skfem import *

# define inverse:
def vinv(T):
    """
    physical dimensions precede numerical (see: https://github.com/kinnala/scikit-fem/issues/439#issuecomment-661649339)
    T_ij... (where i, j are physical dimensions)
    """
    reshapedT = np.einsum("ij...->...ij", T)
    return np.einsum("...ij->ij...", np.linalg.inv(reshapedT))

def vdet(T):
    """
    physical dimensions precede numerical (see: https://github.com/kinnala/scikit-fem/issues/439#issuecomment-661649339)
    T_ij... (where i, j are physical dimensions)
    """
    reshapedT = np.einsum("ij...->...ij", T)
    return np.einsum("...ij->ij...", np.linalg.det(reshapedT))

def firstPKStress(u):
    F = grad(u)
    F[0,0] += 1.
    F[1,1] += 1.
    F[2,2] += 1.
    J = vdet(F)
    return mu * F - mu * transpose(vinv(F)) + lmbda * J * (J - 1) * transpose(vinv(F))

def jacobianPK(u):
    F = grad(u)
    eye = np.zeros_like(F)
    for i in range(3):
        F[i,i] += 1.
        eye[i,i] += 1.
    J = vdet(F)
    Finv = vinv(F)
    dFdF = np.einsum("ik...,jl...->ijkl...", eye, eye)
    dFinvdF = np.einsum("jk...,li...->ijkl...", Finv, Finv)
    C = mu * dFdF + mu * dFinvdF -\
        lmbda * J * (J- 1) * dFinvdF +\
            lmbda * (2*J - 1) * J * np.einsum("ji...,lk...->ijkl...",Finv   , Finv)
    # print(C[0,0,0,0].min())
    return C

geom = pm.opencascade.Geometry(1.e-2, 1.e-1)
box = geom.add_box([0., 0., 0.], [1., 1., 1.])

msh = pm.generate_mesh(geom)
mesh = from_meshio(msh)
elem = ElementTetP1()
uelem = ElementVectorH1(elem)
iBasis = InteriorBasis(mesh, uelem, intorder=2)
fBasis = FacetBasis(mesh, uelem, intorder=2)
u = np.zeros(iBasis.N) #this takes care of dimension

# materialParams and init
bodyForce = np.array([0., -1./2, 0])
E, nu = 10., 0.3
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)
dofs = {
    "left": iBasis.get_dofs(lambda x: x[0]==0),
    "right": iBasis.get_dofs(lambda x: x[0]==1.)
}

# assign DirichletBC
# variables used in the FEniCS demo
scale = y0 = z0 = 0.5
theta = pi/3.

# scaling factor: bta: for Newton's method'
bta = 1.

u1Right = 0.
u2Right = lambda x,y,z: scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u3Right = lambda x,y,z: scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)

rightNodes = mesh.p[:,mesh.nodes_satisfying(lambda x: np.isclose(x[0], 1.))]
leftNodes = mesh.p[:,mesh.nodes_satisfying(lambda x: np.isclose(x[0], 0.))]

u[dofs["left"].nodal['u^1']] = 0.
u[dofs["left"].nodal['u^2']] = 0.
u[dofs["left"].nodal['u^3']] = 0.
u[dofs["right"].nodal['u^1']] = 0.1
# u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2']) # u2Right(rightNodes[0], rightNodes[1], rightNodes[2])
# u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3']) # u3Right(rightNodes[0], rightNodes[1], rightNodes[2])
I = iBasis.complement_dofs(dofs)

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(w["w"]), grad(v)) #+ dot(bodyForce, v)

@BilinearForm
def jac(u, v, w):
    return ddot(grad(v), ddot(jacobianPK(w["w"]), grad(u)))

for itr in range(2): #100
    print("Iteration: {}".format(itr+1))
    w = iBasis.interpolate(u)
    # print(w)
    J = asm(jac, iBasis, w=w)
    F = asm(rhs, iBasis, w=w)
    u_prev = u.copy()
    u += bta * solve(*condense(J, -F, I=I))
    print(np.linalg.norm(u_prev-u))
    print("lmbda + 2mu = {}".format(lmbda+2.*mu ))

You are right, there is something off with the calculation of the jacobian, and most likely it is in how inverse is calculated. But I also checked the deformation gradient and it seems to be off too.

Will do some thorough checks today in the evening.

Thanks again

bhaveshshrimali commented 4 years ago

The implementation of vinv looks a bit suspicious to me as I'm not sure how the dimensions would go if you wanted to use np.linalg.inv for the inversion. Another option is to write the inverse explicitly using Cramer's rule as is done here: https://github.com/kinnala/scikit-fem/blob/master/skfem/mapping/mapping_affine.py#L49.

I remember stumbling accross this at the very beginning when I was looking for helper functions, but then the documentation in numpy.linalg.inv seemed to suggest that it will work when the physical indices are shifted to the end. I will check the explicit inversion (and try it too). Thanks!

I tried doing a small check:

import numpy as np
a = np.random.random((3,3,100,4))
ainvVec = np.einsum("...ij->ij...", np.linalg.inv(np.einsum("ij...->...ij", a)))
ainvLooped = np.array([
    [np.linalg.inv(a[:,:,i,j]) for j in range(a.shape[3])]
    for i in range(a.shape[2])
])
print(np.allclose(np.einsum("...ij->ij...", ainvLooped), ainvVec)) # returns `True`

kinnala commented 4 years ago

Here is the time on my laptop. I'll check quickly if there is something I can do to improve it.

In [2]: %time asm(jac, iBasis, w=w)                                                                                                                 
CPU times: user 17.9 s, sys: 16.1 s, total: 33.9 s
Wall time: 11.3 s
Out[2]: 
<3603x3603 sparse matrix of type '<class 'numpy.float64'>'
    with 135385 stored elements in Compressed Sparse Row format>

bhaveshshrimali commented 4 years ago

Thanks a lot! I had previously done some vectorized assembly calculations (with loop over number of local basis) and remember that the time wasn't that huge. But that was a linear problem in 2D.

kinnala commented 4 years ago

Here is something:

def vdet(A):
    detA = np.zeros_like(A[0, 0])
    detA = A[0, 0] * (A[1, 1] * A[2, 2] -
                      A[1, 2] * A[2, 1]) -\
           A[0, 1] * (A[1, 0] * A[2, 2] -
                      A[1, 2] * A[2, 0]) +\
           A[0, 2] * (A[1, 0] * A[2, 1] -
                      A[1, 1] * A[2, 0])
    return detA

def vinv(A):
    invA = np.zeros_like(A)
    detA = vdet(A)
    invA[0, 0] = (-A[1, 2] * A[2, 1] +
                  A[1, 1] * A[2, 2]) / detA
    invA[1, 0] = (A[1, 2] * A[2, 0] -
                  A[1, 0] * A[2, 2]) / detA
    invA[2, 0] = (-A[1, 1] * A[2, 0] +
                  A[1, 0] * A[2, 1]) / detA
    invA[0, 1] = (A[0, 2] * A[2, 1] -
                  A[0, 1] * A[2, 2]) / detA
    invA[1, 1] = (-A[0, 2] * A[2, 0] +
                  A[0, 0] * A[2, 2]) / detA
    invA[2, 1] = (A[0, 1] * A[2, 0] -
                  A[0, 0] * A[2, 1]) / detA
    invA[0, 2] = (-A[0, 2] * A[1, 1] +
                  A[0, 1] * A[1, 2]) / detA
    invA[1, 2] = (A[0, 2] * A[1, 0] -
                  A[0, 0] * A[1, 2]) / detA
    invA[2, 2] = (-A[0, 1] * A[1, 0] +
                  A[0, 0] * A[1, 1]) / detA
    return invA, detA

and then use Finv, J = vinv(F) in jacobianPK. It gives

In [1]: %timeit asm(jac, iBasis, w=w)                                                                                                               
3.89 s ± 491 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

unless I made some grave mistake.

The rest of the time comes mainly from moving data between low-level and high-level code because this line has quite a bit of operations:

    C = mu * dFdF + mu * dFinvdF -\
        lmbda * J * (J- 1) * dFinvdF +\
            lmbda * (2*J - 1) * J * np.einsum("ji...,lk...->ijkl...",Finv   , Finv)

A package called numexpr might help here but not sure.

Btw, I think

@BilinearForm
def jac(u, v, w):
    return ddot(grad(v), ddot(jacobianPK(w["w"]), grad(u)))

is not what you're thinking: ddot is not detecting that jacobianPK is 4-tensor. I'll try to propose a fix.

kinnala commented 4 years ago

Maybe this is right?

@BilinearForm
def jac(u, v, w):
    return np.einsum('ijkl...,ij...,kl...', jacobianPK(w["w"]), grad(u), grad(v))

bhaveshshrimali commented 4 years ago

I will check these. Really appreciate your help on this.

bhaveshshrimali commented 4 years ago

Maybe this is right?

@BilinearForm
def jac(u, v, w):
    return np.einsum('ijkl...,ij...,kl...', jacobianPK(w["w"]), grad(u), grad(v))

This is exactly what I had in mind when trying to write the form. I didn't check, even though I checked the skfem.helpers.py file, ddot carefully enough. Thanks so much

bhaveshshrimali commented 4 years ago

The rest of the time comes mainly from moving data between low-level and high-level code because this line has quite a bit of operations:

I see, Thanks!. Will also try experimenting with the optimize flag in einsum as well as comparing with opt_einsum

bhaveshshrimali commented 4 years ago

I implemented the changes in the code, and it seems to work. This is without the body forces and surface tractions (purely Dirichlet conditions). Will post the complete (translated) demo on a refined mesh once I implement surface tractions (non-homogeneous Neumann conditions) with some sanity checks.

trialFEniCS

I am currently running one with ~21k dofs (original system), and it takes a while to assemble. Will keep experimenting with this over the weekend. Hopefully next week will complete fine-tuning this as well as moving over to incompressible elasticity.

bhaveshshrimali commented 4 years ago

So I tried again with explicitly assembling the Jacobian this time with opt_einsum's contract (hoping to get some speed up) but the results don't change much (einsum itself is optimized I think). Assembly still being the bottleneck, with Newton also stalling after ~30 iterations (I tried full as well as damped newton).

Do Jacobian-free Newton–Krylov methods work well on this class of.problems?

Thanks a lot! This was a great suggestion. It seems to be performing significantly faster (and rightly so) than vanilla Newton solve that I tried above (for larger problems). I replaced the newton iteration loop with the following (as mentioned in #442 )

def nonlinearResidual(u: np.ndarray) -> np.ndarray:
    res = asm(rhs, iBasis, w=iBasis.interpolate(u))
    res[D] = 0.
    return res

u = root(nonlinearResidual, u, method='krylov', options={
    'disp': True, "line_search": "wolfe",
}).x

Complete code (without neumann bc's):


import pygmsh as pm
import numpy as np
from skfem.io import from_meshio
from skfem.helpers import ddot, grad, dot, transpose
from math import pi
from numba import jit
from opt_einsum import contract
import warnings, meshio
from scipy.optimize.nonlin import BroydenFirst, KrylovJacobian, InverseJacobian
# warnings.filterwarnings('ignore')
from skfem import *

# define inverse:
@jit(cache=True, nopython=True, nogil=True, parallel=True)
def vdet(A):
    detA = np.zeros_like(A[0, 0])
    detA = A[0, 0] * (A[1, 1] * A[2, 2] -
                      A[1, 2] * A[2, 1]) -\
           A[0, 1] * (A[1, 0] * A[2, 2] -
                      A[1, 2] * A[2, 0]) +\
           A[0, 2] * (A[1, 0] * A[2, 1] -
                      A[1, 1] * A[2, 0])
    return detA

@jit(cache=True, nopython=True, nogil=True, parallel=True)
def vinv(A):
    invA = np.zeros_like(A)
    detA = vdet(A)
    invA[0, 0] = (-A[1, 2] * A[2, 1] +
                  A[1, 1] * A[2, 2]) / detA
    invA[1, 0] = (A[1, 2] * A[2, 0] -
                  A[1, 0] * A[2, 2]) / detA
    invA[2, 0] = (-A[1, 1] * A[2, 0] +
                  A[1, 0] * A[2, 1]) / detA
    invA[0, 1] = (A[0, 2] * A[2, 1] -
                  A[0, 1] * A[2, 2]) / detA
    invA[1, 1] = (-A[0, 2] * A[2, 0] +
                  A[0, 0] * A[2, 2]) / detA
    invA[2, 1] = (A[0, 1] * A[2, 0] -
                  A[0, 0] * A[2, 1]) / detA
    invA[0, 2] = (-A[0, 2] * A[1, 1] +
                  A[0, 1] * A[1, 2]) / detA
    invA[1, 2] = (A[0, 2] * A[1, 0] -
                  A[0, 0] * A[1, 2]) / detA
    invA[2, 2] = (-A[0, 1] * A[1, 0] +
                  A[0, 0] * A[1, 1]) / detA
    return invA, detA

def firstPKStress(u):
    F = np.zeros_like(grad(u))
    F[0,0] += 1.
    F[1,1] += 1.
    F[2,2] += 1.

    F += grad(u)
    Finv, J = vinv(F)

    return mu * F - mu * transpose(Finv) + lmbda * J * (J - 1) * transpose(Finv)

geom = pm.opencascade.Geometry(1.e-2, 5.e-2) #5.e-3, 1.e-2
box = geom.add_box([0., 0., 0.], [1., 1., 1.])
msh = pm.generate_mesh(geom, geo_filename="box.geo")

mesh = from_meshio(msh)
elem = ElementTetP1()
uelem = ElementVectorH1(elem)

iBasis = InteriorBasis(mesh, uelem, intorder=2)
fBasis = FacetBasis(mesh, uelem, intorder=2)
u = np.zeros(iBasis.N) #this takes care of dimension

# materialParams and init
bodyForce = np.array([0., -1./2, 0])
surfaceTraction = np.array([0.1, 0, 0]) 
E, nu = 10., 0.3
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)

dofs = {
    "left":  iBasis.get_dofs(lambda x: x[0]==0),
    "right": iBasis.get_dofs(lambda x: x[0]==1.)
    }

# assign DirichletBC
# variables used in the FEniCS demo
scale = y0 = z0 = 0.5
theta = pi/3.

u1Right = 0.
u2Right = lambda x,y,z: scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u3Right = lambda x,y,z: scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)

u[dofs["left"].nodal['u^1']] = 0.
u[dofs["left"].nodal['u^2']] = 0.
u[dofs["left"].nodal['u^3']] = 0.
u[dofs["right"].nodal['u^1']] = 0.
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2'])
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])

I = iBasis.complement_dofs(dofs)
D = iBasis.complement_dofs(I)

# scaling factor: bta: for Newton's method'
bta = 0.7

@LinearForm
def rhs(v, w):
    return ddot(firstPKStress(w["w"]), grad(v)) #+ dot(bodyForce, v)

def nonlinearResidual(u: np.ndarray) -> np.ndarray:
    res = asm(rhs, iBasis, w=iBasis.interpolate(u))
    res[D] = 0.
    return res

u = root(nonlinearResidual, u, method='krylov', options={
    'disp': True, "line_search": "wolfe",
}).x

meshio.xdmf.write("trialFEniCS.xdmf",meshio.Mesh(
    mesh.p.T, cells={"tetra":mesh.t.T}, point_data={"u":u[iBasis.nodal_dofs].T}
))

Eventually even this stalls

...
...
33:  |F(x)| = 0.0559137; step 0.01
34:  |F(x)| = 0.0554543; step 0.01
35:  |F(x)| = 0.0549778; step 0.01
36:  |F(x)| = 0.0547684; step 0.01
37:  |F(x)| = 0.0551101; step 0.01
38:  |F(x)| = 0.0554195; step 0.01
39:  |F(x)| = 0.0557679; step 0.01

bhaveshshrimali commented 4 years ago

I will keep iterating on this (especially since the solver could be fine-tuned) but it seems that overall scikit-fem can very well handle nonlinear variational problems in mechanics dealing with minimization of non-quadratic functionals to the point where numpy itself runs out of options.

I will also look into installation of num_exp as it did not seem to be distinctly clear (I tried cloning and the usual python3 setup.py build technique)

Thanks for your help!

gdmcbain commented 4 years ago

Very interesting. Thank you for following up and posting this. I'm in the country for a few days so can't go into detail, but in brief, beside's kinnala's suggestion of the importance of preconditioning, I wonder whether it's a question of the basin of convergence. Perhaps the problem might be revised to be less nonlinear and then the original problem chased by natural parameter continuation. I used the external pacopy.natural for this for the steady Navier–Stokes equation in ex27. That function would need to be modified to work with scipy.optimize.root, but I've been meaning to do that anyway. Does that make sense? Sorry I can't go into better detail at the moment but I'll be back behind a computer tomorrow.

bhaveshshrimali commented 4 years ago

Thanks! Well in a final try to compare apples with apples, I used the same mesh as I was using in dolfin and it seems the mesh I had generated was really bad .

Results from FEniCS

> python3 demo_hyperelasticityFEniCS.py
Solving nonlinear variational problem.
  Newton iteration 0: r (abs) = 3.718e+01 (tol = 1.000e-10) r (rel) = 1.000e+00 (tol = 1.000e-09)
  Newton iteration 1: r (abs) = 1.296e-01 (tol = 1.000e-10) r (rel) = 3.487e-03 (tol = 1.000e-09)
  Newton iteration 2: r (abs) = 1.373e-02 (tol = 1.000e-10) r (rel) = 3.693e-04 (tol = 1.000e-09)
  Newton iteration 3: r (abs) = 1.096e-03 (tol = 1.000e-10) r (rel) = 2.949e-05 (tol = 1.000e-09)
  Newton iteration 4: r (abs) = 3.856e-06 (tol = 1.000e-10) r (rel) = 1.037e-07 (tol = 1.000e-09)
  Newton iteration 5: r (abs) = 2.597e-10 (tol = 1.000e-10) r (rel) = 6.985e-12 (tol = 1.000e-09)
  Newton solver finished in 5 iterations and 5 linear solver iterations.

Results from scikit-fem (using `scipy.linalg.root` instead of assembling Jacobian)

> python3 hyperElasticity.py
0:  |F(x)| = 0.152248; step 1
1:  |F(x)| = 0.17018; step 1
2:  |F(x)| = 0.0517355; step 0.362538
3:  |F(x)| = 0.055933; step 1.06933
4:  |F(x)| = 0.0655245; step 1
5:  |F(x)| = 0.0254993; step 0.505042
6:  |F(x)| = 0.0112387; step 1
7:  |F(x)| = 0.00586981; step 0.505111
8:  |F(x)| = 0.00299674; step 0.530948
9:  |F(x)| = 0.00193922; step 1
10:  |F(x)| = 6.8516e-05; step 1
11:  |F(x)| = 4.7929e-06; step 1

The results are much better now! :-)

trialFEniCS

bhaveshshrimali commented 4 years ago

That function would need to be modified to work with scipy.optimize.root, but I've been meaning to do that anyway. Does that make sense? Sorry I can't go into better detail at the moment but I'll be back behind a computer tomorrow.

Thanks a lot. No problems, please take your time. I will also explore adding the body forces and surface traction, and hopefully by next weekend will have a complete working hyperelasticity demo. By the way, the above result is with a mesh comprising of

Points: 7225, Cells: 36864

so it is actually pretty good. No hurries at all. The fact that this turned out encouraging is a really good start. I will explore pacopy too. Have a good weekend!

gdmcbain commented 4 years ago

using scipy.linalg.root instead of assembling Jacobian

Note that although root #119, does JFNK by default, I believe it will use the Jacobian if provided. I'll see if I can demonstrate that in another take on ex10.

bhaveshshrimali commented 4 years ago

Thanks a lot!

I believe it will use the Jacobian if provided.

I can give this a try. Do you think that by passing the jacobian to scipy.linalg.root and letting it solve the nonlinear system be significantly faster than the current jacobian free newton_krylov ? Since assembling the jacobian (as in the original code) was a bottleneck for this problem, I thought that if a jacobian free method would work (unless it converges horribly or stalls completely) it might be worth using it instead.

bhaveshshrimali commented 4 years ago

Note that although root #119, does JFNK by default

Just read the discussion carefully in #119. I get the same error.

TypeError: fsolve: there is a mismatch between the input and output shape of the 'fprime' argument 'jacobian'.Shape should be (21675, 21675) but it is (1,).

It seems scipy.optimize.least_squares supports sparse jacobians. I will try it to see if it makes any difference.

Edit: (I think this stalls too, which makes me wonder if there is an issue with my jacobian -- will check this)

   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality
       0              1         3.1257e+00                                    1.27e+04
       1              2         7.6182e-01      2.36e+00       9.73e-01       3.84e+03
       2              3         7.1829e-01      4.35e-02       1.55e-01       2.51e+03
....
....
....
      15             19         2.5323e-01      2.23e-03       1.19e-03       2.98e+03
      16             20         2.5077e-01      2.46e-03       1.19e-03       2.54e+03
      17             21         2.4803e-01      2.75e-03       1.19e-03       1.75e+03

bhaveshshrimali commented 4 years ago

I just had one more question regarding implementation of non-homogeneous neumann boundary condition. As in the weak form, does this look correct?

@LinearForm
def rhs(v, w):
    x = w.x
    return np.einsum("ij...,ij...", firstPKStress(w["w"]), grad(v)) - dot(bodyForce, v) - dot(surfaceTraction, v) * (restBoundary(x))

where the function restBoundary would be something like

def restBoundary(x):
    """
    returns `True` if not on the left/right face
    for application of surface traction
    """
    topBottom = np.logical_or(
        x[1]==0., x[1]==1.
    )
    frontBack = np.logical_or(
        x[2]==0., x[2]==1.
    )
    leftRight = np.logical_or(
        x[0]==0., x[0] == 1.
    )
    return np.logical_and(np.logical_or(topBottom, frontBack), np.logical_not(leftRight))

bhaveshshrimali commented 4 years ago

I guess a last question (sorry for so many of these at once) that still doesn't remain clear to me is the L2_projection to assign non-homogeneous dirichlet boundary conditions. In this case the right face (x=1) is twisted by an angle (pi/3) such that the displacements are

scale = y0 = z0 = 0.5
theta = pi/3.
u^1 = 0
u^2 = scale*(y0 + (y - y0)*np.cos(theta) - (z - z0)*np.sin(theta) - y)
u^3 = scale*(z0 + (y - y0)*np.sin(theta) + (z - z0)*np.cos(theta) - z)

As usual I created the corresponding lambda functions, but what is not clear to me is which basis (InteriorBasis or FacetBasis) to use to extract the corresponding dofs and consequently use in L2_projection. I went ahead with InteriorBasis which was used in the linear elasticity example to assign dirichlet BC.

u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2']) 
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])

But would definitely like to understand better about which ones to use and when. By the way when I tried switching to FacetBasis there were convergence issues with scipy.linalg.root.

Perhaps this will also help me explain a discrepancy that I see in the deformed configuration (corners of the cube at x=1 are distorted).

FEniCS_bf

Since the right face is simply rotated, I don't expect the corners to warp out. I will quantitatively explore this by printing out the dofs at those locations. Also, for reference, this is how the dolfin solution looks like

FEniCS_bf_actual

gdmcbain commented 4 years ago

Just read the discussion carefully in #119. I get the same error.

Oh, no! Isn't that awful. I forgot about that, sorry. I still can't quite believe anyone would have bothered implementing a Newton–Krylov solver without sparsity! Sorry.

What about jac_options['method'] (in https://docs.scipy.org/doc/scipy/reference/optimize.root-krylov.html)

Krylov method to use to approximate the Jacobian. Can be a string, or a function implementing the same interface as the iterative solvers in scipy.sparse.linalg.

Or jac_options['inner_M']?

Sorry, I'll look into these in due course; don't bother about them now.

To answer the prior question:

Do you think that by passing the jacobian to scipy.linalg.root and letting it solve the nonlinear system be significantly faster than the current jacobian free newton_krylov ? Since assembling the jacobian (as in the original code) was a bottleneck for this problem, I thought that if a jacobian free method would work (unless it converges horribly or stalls completely) it might be worth using it instead.

Really of course it would depend on all kinds of things, but what I was thinking was that the direct rather than Krylov-iterative solution of the Newton-increment equation might be faster in some regimes of size. That would mean passing the Jacobian to scipy.optimize.root and perhaps not using method='krylov', or passing the actual assembled Jacobian through somehow (jac_options['inner_M']?) so that the Krylov iteration for the Newon increment always converged in one step.

Sorry, this tangent should probably be split off into a separate issue.

bhaveshshrimali commented 4 years ago

Oh, no! Isn't that awful. I forgot about that, sorry. I still can't quite believe anyone would have bothered implementing a Newton–Krylov solver without sparsity! Sorry.

What about jac_options['method'] (in https://docs.scipy.org/doc/scipy/reference/optimize.root-krylov.html)

No problems! Thanks a lot for your prompt reply. Infact I forgot to mention, I tried passing the jacobian according to the instructions on the main-page


@BilinearForm
def jacNewton(u, v, w):
    return np.einsum('ijkl...,ij...,kl...', jacobianPK(w["w"]), grad(u), grad(v))

def jacobian(u: np.ndarray) -> np.ndarray:
    jacb = asm(jacNewton, iBasis, w=iBasis.interpolate(u))
    return jacb

# JFNK
u = root(residual, u, method='krylov', jac=jacobian, options={
        'disp': True, "line_search": "wolfe"
    }).x

and it gives me the same result as above.

If instead I try using the syntax as on the linked page, specifically for method="krylov" via


u = root(
    residual, u, method="krylov", options = {
        "ftol":1.e-8, "line_search": "wolf",
        "jac_options" : {
            "method": jacobian
        }
    }
).x

it raises

TypeError: jacobian() got an unexpected keyword argument 'tol'

which is weird as I don't specify any key word tol. I will take a fresh look at the syntax tomorrow.

Really of course it would depend on all kinds of things, but what I was thinking was that the direct rather than Krylov-iterative solution of the Newton-increment equation might be faster in some regimes of size.

Agreed, absolutely!

That would mean passing the Jacobian to scipy.optimize.root and perhaps not using method='krylov', or passing the actual assembled Jacobian through somehow (jac_options['inner_M']?) so that the Krylov iteration for the Newon increment always converged in one step.

I see. I had tried passing the Jacobian and not specifying method="krylov", which defaults to using method="hybr" and it doesn't make any difference. The only parameter that causes/breaks convergence is the line_search. (the default armijo breaks and wolfe makes it converge). I don't know the details as to how each of these options implements the line search, although my hunch is that "wolfe" (after it's name) looks for alpha so that it satisfies the Wolfe conditions on the norm of the residual.

Thanks for your help! No hurries in responding to these queries. I will experiment on this code more this week as I get time. For eventual solutions on larger problems I also plan to take a look at petsc4py and any matrix-free nonlinear solvers therein.

gdmcbain commented 4 years ago

TypeError: jacobian() got an unexpected keyword argument 'tol'

Maybe that means that SciPy is calling your function jacobian with a keyword argument tol.

kinnala commented 4 years ago

As usual I created the corresponding lambda functions, but what is not clear to me is which basis (InteriorBasis or FacetBasis) to use to extract the corresponding dofs and consequently use in L2_projection. I went ahead with InteriorBasis which was used in the linear elasticity example to assign dirichlet BC.
u[dofs["right"].nodal['u^2']] = L2_projection(u2Right, iBasis, dofs["right"].nodal['u^2']) 
u[dofs["right"].nodal['u^3']] = L2_projection(u3Right, iBasis, dofs["right"].nodal['u^3'])
But would definitely like to understand better about which ones to use and when.

I'd perform the projection using FacetBasis if setting boundary conditions.

There is short discussion on this in the upcoming docs that have not yet been merged to master: https://github.com/kinnala/scikit-fem/pull/443/files#diff-0504026760cec473fa1293576d93538fR238

I think the use of InteriorBasis explains the strange behavior at the corners.

Edit: When using Lagrange elements (such as ElementTetP1) you can also simply define the displacement at the DOF location instead of using L2_projection. Using L2_projection is mainly useful for more complicated elements.

Edit 2: Something like

Dright = dofs["right"].nodal['u^2'].all()
u[Dright] = u2Right(*iBasis.dofslocs[:, Dright])

kinnala commented 4 years ago

Since assembling the jacobian (as in the original code) was a bottleneck for this problem, I thought that if a jacobian free method would work (unless it converges horribly or stalls completely) it might be worth using it instead.

It's also possible to do a quasi-Newton method where you update the Jacobian only once in a while, e.g., every 10 or 20 iterations or whatever gives enough savings.

Technically, linear solve time scales something like O(N^3) and assembly only as O(N) (here N is the number of DOF's) so using large enough problem, solve should always dominate. But this particular example seems to counter-intuitively be complicated enough (assembly-wise; many floating-point operations inside the form evaluation) but still small enough (linear system size) that the assembly dominates here almost 10-fold.

bhaveshshrimali commented 4 years ago

Edit 2: Something like

Dright = dofs["right"].nodal['u^2'].all()
u[Dright] = u2Right(*iBasis.dofslocs[:, Dright])

Thanks so much. This was the part I was missing. By the way,I think it should be


Dright = dofs["right"].nodal['u^2']
u[Dright] = u2Right(*iBasis.dofslocs[:, Dright])

Now it (left) perfectly matches the solution obtained via FEniCS(right) More quantitative comparisons to follow later in the week.

FEniCSvsscikit

bhaveshshrimali commented 4 years ago

It's also possible to do a quasi-Newton method where you update the Jacobian only once in a while, e.g., every 10 or 20 iterations or whatever gives enough savings.

I see. I can try parameterizing the loading using a time-like variable and then at each time step update the Jacobian and use the same for all Newton iterations in that step. That's a nice suggestion.

Technically, linear solve time scales something like O(N^3) and assembly only as O(N) (here N is the number of DOF's) so using large enough problem, solve should always dominate. But this particular example seems to counter-intuitively be complicated enough (assembly-wise; many floating-point operations inside the form evaluation) but still small enough (linear system size) that the assembly dominates here almost 10-fold.

I agree completely that for larger problems solve would always dominate assembly (which is why I was thinking if using PETSc (via petsc4py) would be any better for such cases). I think that the nonlinear problem even though complicated represents the simplest constitutive model in compressible hyperelasticity. And the fact that it worked out just fine (in a reasonable time) is more of an illustrative example than a performant code.

Thanks both of you for your help. This discussion helped my understanding of scikit-fem improve a lot

kinnala commented 4 years ago

Thanks both of you for your help. This discussion helped my understanding of scikit-fem improve a lot

No problem! In case you finish your work on this example, we're glad to include it in our documentation (https://scikit-fem.readthedocs.io/en/latest/listofexamples.html), if you're willing to contribute the code that is.

I think I could help improving skfem.helpers at the same go so that we don't need to have det and inv defined in the code example as such.

kinnala commented 4 years ago

I just had one more question regarding implementation of non-homogeneous neumann boundary condition. As in the weak form, does this look correct?

I think you need to integrate the tractions over the boundary of the domain.

This means that you need to define a separate form and assemble that using a FacetBasis which is initialized over the respective part of the boundary (e.g. FacetBasis(mesh, element, facets=m.facets_satisfying(...).

Edit: The form could be something like:

@LinearForm
def traction(v, w):
    return dot(surfaceTraction, v)

fBasis = FacetBasis(mesh, element, facets=m.facets_satisfying(restBoundary))

g = asm(traction, fBasis)

Edit 2: This is also equivalent to integrating over the entire boundary and using restBoundary in the form:

@LinearForm
def traction(v, w):
    return dot(surfaceTraction, v) * restBoundary(w.x)

fBasis = FacetBasis(mesh, element)

g = asm(traction, fBasis)

bhaveshshrimali commented 4 years ago

Edit 2: This is also equivalent to integrating over the entire boundary and using restBoundary in the form:

Perfect! Thanks a lot! I was previously assembling this with InteriorBasis with the form multiplied with a conditional restBoundary(w.x). I will change this.

bhaveshshrimali commented 4 years ago

No problem! In case you finish your work on this example, we're glad to include it in our documentation (https://scikit-fem.readthedocs.io/en/latest/listofexamples.html), if you're willing to contribute the code that is.

Sure, absolutely! After making these changes, I will clean up the code and add a few blurbs about the problem (feel free to suggest what should/shouldn't be in it) and then comment here once that is done. If you want me to create a PR that's fine, even otherwise is fine, Thanks a lot!

bhaveshshrimali commented 4 years ago

More quantitative comparisons to follow later in the week.

A comparison of the dofs of the solution vector from scikit-FEM and FEniCS . I don't know if a better measure could be used to check the correctness of the implementation in scikit-FEM (feel free to suggest if you think otherwise)

>>> np.linalg.norm(disps.point_data["u"] - solFEniCS.point_data["uFE"], axis=1).max()
>>> 3.2284499580733357e-10

gdmcbain commented 4 years ago

That looks good.

I was stumped for a while in an earlier quantitative comparison effort to reproduce the linear elastic FEniCS tutorial example in gdmcbain/fenics-tuto-in-skfem#3. The initial quite marked discrepancy was due to large error on the FEniCS side, it seeming much more sensitive to the coarseness of the mesh for some reason; it went away when the mesh was refined. Anyway it doesn't look as though there's any similar difficulty here.

Part of the motivation for having those examples in that other repo rather than here was that I wasn't sure about the copyright or licensing status of the FEniCS originals; I'm not a lawyer.

kinnala / scikit-fem