Trying to generate multiple randomized synthetic datasets and getting error

jcmefra commented 2 months ago

Problem description

I'm trying to generate multiple randomized synthetic datasets and getting error on random iteration.

Your environment

Please provide the output of print(pygimli.Report()) here.

--------------------------------------------------------------------------------
  Date: Thu Mar 14 22:48:33 2024 Hora est. Pacífico, Sudamérica

                OS : Windows
            CPU(s) : 8
           Machine : AMD64
      Architecture : 64bit
               RAM : 31.9 GiB
       Environment : Jupyter

  Python 3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:29:04) [MSC
  v.1929 64 bit (AMD64)]

           pygimli : 1.4.6
            pgcore : 1.4.0
             numpy : 1.26.4
        matplotlib : 3.8.0
             scipy : 1.11.4
              tqdm : 4.66.2
           IPython : 8.18.1
           pyvista : 0.43.3

  Intel(R) oneAPI Math Kernel Library Version 2023.2-Product Build 20230613
  for Intel(R) 64 architecture applications
--------------------------------------------------------------------------------

Steps to reproduce

Tell us how to reproduce this issue. Ideally, you could paste the code that produces the error:

import numpy as np
import pygimli as pg
import pygimli.meshtools as mt
from pygimli.physics import ert
import pandas as pd
import matplotlib.pyplot as plt
import random
import os

def generate_random_synth_data(index):
    base_path = 'synth_data'
    synth_images_path = os.path.join(base_path, 'synth_images')
    rhoa_distribution_path = os.path.join(base_path, 'rhoa_distribution')
    ertdata_path = os.path.join(base_path, 'raw')

    # Ensure directories exist
    os.makedirs(synth_images_path, exist_ok=True)
    os.makedirs(rhoa_distribution_path, exist_ok=True)
    os.makedirs(ertdata_path, exist_ok=True)

    # Define the world randomly
    start_x = random.uniform(-50, -10)
    end_x = random.uniform(10, 50)
    depth = random.uniform(-30, -70)
    num_layers = random.randint(1, 5)

    world = mt.createWorld(start=[start_x, 0], end=[end_x, depth],
                       layers=[random.uniform(depth/3, 2*depth/3) for _ in range(num_layers - 1)], 
                       worldMarker=True)

    # Show the model and save the figure
    ax, _ = pg.show(world, showNodes=True)
    model_image_filename = f'ert_model_{index}.png'
    plt.savefig(os.path.join(synth_images_path, model_image_filename))
    plt.close()

    # Define electrode scheme randomly
    num_electrodes = random.randint(16, 64)
    elec_spacing = (end_x - start_x) / (num_electrodes - 1)
    elecs = np.linspace(start=start_x + elec_spacing, stop=end_x - elec_spacing, num=num_electrodes)
    schemes = ['wa', 'wb', 'dd', 'slm', 'pp', 'pd', 'hw', 'gr']
    scheme = ert.createData(elecs=elecs, schemeName=random.choice(schemes))

    # Generate the mesh
    mesh = mt.createMesh(world, quality=34)

    # Define resistivity values randomly for layers
    default_resistivity = 100
    layer_resistivities = [random.uniform(10, 200) for _ in range(num_layers)]
    rhomap = [[marker+1, res] for marker, res in enumerate(layer_resistivities)]
    rhomap.append([0, default_resistivity]) # Add default resistivity for any unspecified markers

    # Simulate measurements with random noise level
    noise_level = random.uniform(0, 0.05)  # Up to 5% noise
    noise_abs = random.uniform(0, 1e-4)  # Absolute noise
    data = ert.simulate(mesh, scheme=scheme, res=rhomap, noiseLevel=noise_level, noiseAbs=noise_abs)

    # Save simulation results
    data_filename = f'ertdata_{index}.dat'
    data.save(os.path.join(ertdata_path, data_filename))

    # Save the mesh with resistivity distribution as an image
    mesh_image_filename = f'ert_mesh_{index}.png'
    ax, _ = pg.show(mesh, data=rhomap, label=pg.unit('res'), showMesh=True)
    plt.savefig(os.path.join(synth_images_path, mesh_image_filename))
    plt.close()

    # Extracting node information for resistivity matrix
    node_coords = np.array([node.pos() for node in mesh.nodes()])
    x_coords = node_coords[:, 0]
    depths = node_coords[:, 1]
    resistivity_values = np.zeros(len(mesh.nodes()))

    for cell in mesh.cells():
        marker = cell.marker()
        resistivity = next(value for (mark, value) in rhomap if mark == marker)
        for node in cell.nodes():
            resistivity_values[node.id()] = resistivity

    # Combine into a matrix and save to CSV
    resistivity_matrix = pd.DataFrame({'X_coord': x_coords, 'Depth': depths, 'Resistivity': resistivity_values})
    resistivity_csv_filename = f'resistivity_distribution_{index}.csv'
    resistivity_matrix.to_csv(os.path.join(rhoa_distribution_path, resistivity_csv_filename), index=False)

# Generate datasets
for i in range(1, 15):
    generate_random_synth_data(i)
...

Expected behavior

To generate 15 random data

Actual behavior

It usually stops before finishing the 15 iterations, giving this output before stopping at certain iteration:

ModellingBase::setMesh() copying new mesh ... Found datafile: 45 electrodes
Found: 45 free-electrodes
rMin = 0.712976, rMax = 125.484
NGauLeg + NGauLag for inverse Fouriertransformation: 13 + 4
Found non-Neumann domain
0.0026121 s
FOP updating mesh dependencies ... 0.0003041 s
Traceback (most recent call last):
  File "d:\repos\PINN TRE\generate_synth_data.py", line 87, in <module>
    generate_random_synth_data(i)
  File "d:\repos\PINN TRE\generate_synth_data.py", line 56, in generate_random_synth_data
    data = ert.simulate(mesh, scheme=scheme, res=rhomap, noiseLevel=noise_level, noiseAbs=noise_abs)
  File "D:\Anaconda\envs\pg\lib\site-packages\pygimli\physics\ert\ert.py", line 203, in simulate
    resp = fop.response(res)
  File "D:\Anaconda\envs\pg\lib\site-packages\pygimli\physics\ert\ertModelling.py", line 178, in response
    resp = self._core.response(mod)
Boost.Python.ArgumentError: Python argument types in
    DCSRMultiElectrodeModelling.response(DCSRMultiElectrodeModelling, list)
did not match C++ signature:
    response(GIMLI::DCSRMultiElectrodeModelling {lvalue}, GIMLI::Vector<double> model, double background)
    response(DCSRMultiElectrodeModelling_wrapper {lvalue}, GIMLI::Vector<double> model)
    response(GIMLI::DCSRMultiElectrodeModelling {lvalue}, GIMLI::Vector<double> model)

I figured out it happens when the number of layers is only one (so there's only one marker), but I don't know how to fix that. Thank you!

If possible, please add one or more labels to your issue, e.g. if you expect that your issue is rather a question than a problem with the code, please add the label "question".

halbmy commented 2 months ago

Try using np.atleast_2d(rhomap) which will make it iterable. Another way could be to use a dictionary {1: 100, 2:50} instead of a map.

halbmy commented 1 month ago

Any success?

jcmefra commented 1 month ago

Hello, I tried a different approach but it's working fine so far, I have another question: can I define a square grid for the mesh for the iterable? I want my synthetic models to have a 64x64 grid instead of a triangle grid, so I can convert the resistivity points to a matrix. Thanks in advance.

halbmy commented 1 month ago

Of course you can, using createGrid. Please have a look at the ERT examples.

If the original problem is solved, please close the issue.

gimli-org / gimli