gimli-org / gimli

Geophysical Inversion and Modeling Library :earth_africa:
https://www.pygimli.org
Other
344 stars 131 forks source link

Segmentation Fault with custom Constraint Matrix #659

Closed soehag closed 2 months ago

soehag commented 2 months ago

Problem description

Hi everyone,

I am running ERT-inversions with custom constraint matrices to account for the specific geometry in my setup. Running these inversions on the my local HPC, I run into segmentation fault errors when the constraint matrix is being build. I have attached a small minimum working example, where an identity matrix as constraint matrix crashes the jupyter kernel before it can properly throw an exception. Running as a script I encounter the same error in a mwe on my local machine.

Your environment


Date: Wed Feb 28 11:09:28 2024 CET

            OS : Linux
        CPU(s) : 64
       Machine : x86_64
  Architecture : 64bit
           RAM : 251.5 GiB
   Environment : Jupyter
   File system : ext4

Python 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:07:37) [GCC 12.3.0]

       pygimli : 1.4.4
        pgcore : 1.4.0
         numpy : 1.24.4
    matplotlib : 3.7.2
         scipy : 1.11.3
       IPython : 8.17.2
       pyvista : 0.42.3

Steps to reproduce

Tell us how to reproduce this issue. Ideally, you could paste the code that produces the error:

import pygimli as pg
import pygimli.meshtools as mt
import numpy as np
import pygimli.physics.ert as ert

mesh_inner_plc = pg.meshtools.createWorld(np.array([10,4]), np.array((20, -4)), worldMarker=True, marker=1)
mesh_inner_plc.setBoundaryMarkers([-2]*4)
mesh_inner = pg.meshtools.createMesh(mesh_inner_plc, quality=32, area=.1, smooth=True)
final_mesh=mesh_inner
pg.show(final_mesh, markers=True, showMesh=True)

### Create data
sensorPositions_x = np.linspace(12,18,11)
sensorPositions_top = np.vstack((sensorPositions_x, np.ones_like(sensorPositions_x)*2)).T
sensorPositions_bot = np.vstack((sensorPositions_x, np.ones_like(sensorPositions_x)*(-2))).T
scheme = pg.DataContainerERT()
for pos in sensorPositions_top:
    scheme.createSensor(pos)
for pos in sensorPositions_bot:
    scheme.createSensor(pos)
for sen_top in range(len(sensorPositions_top)):
    for sen_bot in range(len(sensorPositions_top),len(sensorPositions_top)+len(sensorPositions_bot)):
        scheme.addFourPointData(sen_top,-1,sen_bot,-1)
scheme["k"] +=1
res = np.array([h.center().y() for h in final_mesh.cells()])**2+100
pg.show(final_mesh, res)

syn_data = ert.simulate(mesh=final_mesh, scheme=scheme, res=res, noiseLevel=1)
pg.show(final_mesh, markers=True, showMesh=True)

### Invert with ERT manager
sM=100
manager = ert.Manager()
manager.setData(syn_data)
manager.setMesh(final_mesh)

C=pg.Matrix(np.eye(len(final_mesh.cells())))
weights = pg.Vector(np.ones(len(final_mesh.cells())))
weights = np.ones(len(final_mesh.cells()))
# weights = 1

manager.inv.setRegularization(C=C)
manager.inv.setConstraintWeights(weights)

result = manager.invert(
    data=syn_data,
    mesh=final_mesh,
    lam =1e2,
    startModel=sM,
    verbose=True
)

Expected behavior

Run properly.

Actual behavior

... Building constraints matrix Segmentation fault (core dumped)

ModellingBase::setMesh() copying new mesh ... Found datafile: 22 electrodes
Found: 22 free-electrodes
rMin = 0.3, rMax = 14.4222
NGauLeg + NGauLag for inverse Fouriertransformation: 10 + 4
Found non-Neumann domain
0.00672143 s
FOP updating mesh dependencies ... 3.85e-06 s
Calculating response for model: min = 100 max = 115.381
Allocating memory for primary potential...... 0.000443408

No primary potential for secondary field calculation. Calculating analytically...
Forward: time: 0.0617532s
Response: min = 1.22867 max = 2.19441 mean = 1.89735
Reciprocity rms(modelReciprocity) 0.0936967%, max: 0.240693%
relativeError set to a value > 0.5 .. assuming this is a percentage Error level dividing them by 100
Data error estimate (min:max)  0.010045556265465905 : 0.010081375519050945
28/02/24 - 11:11:12 - pyGIMLi - INFO - Found 1 regions.
28/02/24 - 11:11:12 - pyGIMLi - INFO - Found 1 regions.
28/02/24 - 11:11:12 - pyGIMLi - INFO - Creating forward mesh from region infos.
28/02/24 - 11:11:12 - pyGIMLi - INFO - Creating refined mesh (H2) to solve forward task.
28/02/24 - 11:11:12 - pyGIMLi - INFO - Mesh for forward task: Mesh: Nodes: 2740 Cells: 5300 Boundaries: 4064
min/max(dweight) = 99.1928/99.5465
28/02/24 - 11:11:12 - pyGIMLi - INFO - Starting inversion.
fop: <pygimli.physics.ert.ertModelling.ERTModelling object at 0x7fa2ee027560>
Data transformation: <pgcore._pygimli_.RTransLogLU object at 0x7fa2e549f100>
Model transformation: <pgcore._pygimli_.RTransLog object at 0x7fa2e549e5c0>
min/max (data): 1.23/2.24
min/max (error): 1%/1.01%
min/max (start model): 100/100
--------------------------------------------------------------------------------
Calculating response for model: min = 100 max = 100
Allocating memory for primary potential...... 0.00057357

No primary potential for secondary field calculation. Calculating analytically...
Forward: time: 0.251483s
Response: min = 1.10435 max = 1.99086 mean = 1.71883
Reciprocity rms(modelReciprocity) 0.0226251%, max: 0.0574602%
min/max(dweight) = 99.1928/99.5465
Building constraints matrix
Segmentation fault (core dumped)

bug

halbmy commented 2 months ago

I don't think the C matrix is the problem, but rather the line:

manager.setData(np.random.random)

The argument of setData should be a DataContainerERT and not a function, not even a vector as a result of the function. It should be

manager.setData(syn_data)

(and we should check the type).

soehag commented 2 months ago

Hi, sorry for my silly mistake - that might be something left behind from me experimenting. This is obviously not supposed to be there. I edited the code - replacing np.random.random with syn_data.

soehag commented 2 months ago

Hi, sorry for my silly mistake - that might be something left behind from me experimenting. This is obviously not supposed to be there. I edited the code - replacing np.random.random with syn_data.

My intention was to say - the error persists. This seemed to be a copying mistake from my side.

florian-wagner commented 2 months ago

Dear Hagen @soehag,

interesting issue. The example runs fine if you use the pyGIMLi data type for identity matrices (C=pg.matrix.IdentityMatrix(len(final_mesh.cells()))) or a SparseMapMatrix. Other matrix types seem to segfault.

@carsten-forty2 ?

C_npy = np.eye(len(final_mesh.cells()))

from scipy import sparse
C1 = pg.matrix.IdentityMatrix(len(final_mesh.cells())) # works
C3 = pg.utils.numpy2gmat(C_npy) # segfaults, similiar to original issue
C4 = pg.utils.toSparseMatrix(C_npy) # segfaults
C5 = pg.utils.toSparseMatrix(sparse.csr_matrix(C_npy)) # segfaults
C6 = pg.matrix.SparseMapMatrix(C5) # works
carsten-forty2 commented 2 months ago

The problem is the lazy evaluation of the mesh with its associated regularization settings that overwrites your custom C.

The final mesh for the forward operator FOP is set inside the manager.invert call. After that, all regularization parameter are forwarded to the regionmanager. This clears all prior settings like your C. The log info shows when a new mesh is set to the FOP.

The FIX added a custom C into these lazy evaluation process. Until the FIX (dev branch) comes with the next release, you can force the lazy evaluation by calling the mesh once before you set C with manager.fop.mesh(). Also you should remove the mesh setting inside the invert call because this restarts the C cleaning again.

soehag commented 2 months ago

Thank you very much for your contributions and the quick fix - that seams to work well! Thank you also for your hint of not passing the mesh along again - that was also a remainder of trying different settings.

On a side note - if you do not set the regularisation with a custom constraints matrix by inv.setRegularization(), but pass it as manager.invert(..., C=C), pyGIMLi throws the error:

"Core - ERROR - no cWeights defined. You should create constraints matrix first."

but procedes with the inversion. Afterwards the constraints matrix can be found in the forward-operator, so it seems like it was used in the inversion. Maybe, if the matrix is properly used, it is reasonable to just put an info and set the cWeights automatically to 1 or throw an error and not conduct the inversion? Right now it is not entirely clear, what pyGIMLi does in this situation especially since manager.invert() doesnt accept arguments for cWeights.

Cheers and thank you again for you help!