Error replicating FWI example using NLopt

advaitb commented 5 months ago

Hi,

I'm basically trying to replicate examples 1 and 2 from the notebooks and executing the code below:

using PyPlot, HDF5, SegyIO, JUDI, NLopt, SlimOptim, Statistics, Random, Printf, SlimPlotting, LinearAlgebra
shape = (201, 201) # Number of gridpoints nx, nz
spacing = (10.0, 10.0) # in meters here
origin = (0.0, 0.0) # in meters as well

vp = 1.5f0 * ones(Float32, shape)
vp[:, 66:end] .= 2.0f0
vp[:, 134:end] .= 2.5f0

VP = PhysicalParameter(vp, spacing, origin);

model = Model(shape, spacing, origin, 1f0./vp.^2f0; nb = 80)

nsrc = 11
xsrc = range(0f0, (shape[1] -1)*spacing[1], length=nsrc)
ysrc = 0f0 .* xsrc # this a 2D case so we set y to 0
zsrc = 12.5f0*ones(Float32, nsrc);

xsrc, ysrc, zsrc = convertToCell.([xsrc, ysrc, zsrc]);

nrec = 101
xrec = range(0f0, (shape[1] -1)*spacing[1], length=nrec)
yrec = 0f0 # this a 2D case so we set y to 0. This can be a single number for receivers
zrec = (66*spacing[1])*ones(Float32, nrec);

record_time = 2000f0 # Recording time in ms (since we have m/ms for the velocity)
sampling_rate = 4f0; # Let's use a standard 4ms sampling rate

src_geom = Geometry(xsrc, ysrc, zsrc; dt=sampling_rate, t=record_time)
rec_geom = Geometry(xrec, yrec, zrec, dt=sampling_rate, t=record_time, nsrc=nsrc);

f0 = 0.010 # Since we use ms, the frequency is in KHz
wavelet = ricker_wavelet(record_time, sampling_rate, f0);
q = judiVector(src_geom, wavelet)
Pr = judiProjection(rec_geom) # receiver interpolation
Ps = judiProjection(src_geom) # Source interpolation
Ainv = judiModeling(model) # Inverse of the disrete ewave equation.
d_obs = Pr * Ainv * Ps' * q

batchsize = 8;
count = 0;

function objf!(x, grad)
    if count == 0
        @printf("%10s %15s %15s\n","Iteration","Function Val","norm(g)")
    end
    model.m .= Float32.(reshape(x, model.n))

    i = randperm(d_obs.nsrc)[1:batchsize]
    fval, gradient = fwi_objective(model, q[i], d_obs[i])

    gradient = reshape(gradient, model.n)
    gradient[:,1:21] .= 0f0
    if length(grad) > 0
        grad[1:end] = vec(gradient)
    end
    global count += 1
    @printf("%10d %15.5e %15.5e\n",count, fval, norm(g))
    return convert(Float64, fval)
end

g = zeros(prod(model.n))
f0 = objf!(vec(model.m), g)
global count = 0;

However, running f0 = objf!(vec(model0.m), g) gives me an error. I'm running Julia 1.10. The full error trace is attached below. Could you please point to where the issue is? Thanks! judi_error.txt

mloubout commented 5 months ago

I've been seeing these lengthy error trace on my M1 as well upgrading to julia 1.10, I'm not quite sure why. I have tried to look a bit into it but it is a very intriguing error message. Weirdest part is that it looks like it still runs fine despite the error message if you just ignore it.

I'll keep an eye on this issue and investigate more to see if I can find the source of it

advaitb commented 5 months ago

Thanks for your help, Matthias. It is indeed weird that it seems to run fine, however I went ahead and tried to plot the FWI first gradient and it looks like something definitely got messed up.

fig = figure(figsize=(7,13))
imshow(reshape(g, model.n)', vmin=-1e3, vmax=1e3, extent=(0,10,3,0), cmap="jet")
title("FWI first gradient")
xlabel("Lateral position [km]");
ylabel("Depth [km]");
display(fig)

mloubout commented 5 months ago

This is becasue you compute the gradient in the true model so the residual is zero (minus numerical accuracy) so it's basically zero gradient

kerim371 commented 5 months ago

@advaitb just wondering, wich OS do you use and what distribution?

advaitb commented 5 months ago

@mloubout Gotcha, let me try it with a different model as given in the example.

@kerim371 I'm running this on AWS so Amazon Linux '23 (RHEL).

advaitb commented 5 months ago

@mloubout I ran the script with a new defined model0 for optimization which I calculated as follows:

vp0 = 1f0*ones(Float32, shape)
vp0[:, 45:120] .= 0.25f0
vp0[:, 120:165] .= 3f0
vp0[:, 165:end] .= 2.5f0
model0 = Model((201, 201), (10,10), (0,0), 1f0./vp0.^2f0; nb = 80)

This is different from the model I defined above to generate the seismic data and I also modified the FWI objective to resemble the example in the notebook:


# NLopt objective function
function objf!(x, grad)
    if count == 0
        @printf("%10s %15s %15s\n","Iteration","Function Val","norm(g)")
    end
    # Update model
    model0.m .= Float32.(reshape(x, model0.n))

    # Seclect batch and calculate gradient
    i = randperm(d_obs.nsrc)[1:batchsize]
    fval, gradient = fwi_objective(model0, q[i], d_obs[i])

    # Reset gradient in water column to zero
    gradient = reshape(gradient, model0.n)
    gradient[:,1:21] .= 0f0
    if length(grad) > 0
        grad[1:end] = vec(gradient)
    end
    global count += 1
    @printf("%10d %15.5e %15.5e\n",count, fval, norm(g))
    return convert(Float64, fval)
end

g = zeros(prod(model0.n))
f0 = objf!(vec(model0.m), g)
global count = 0;

The output of the FWI first gradient unfortunately is still the same as before so that doesn't seem to work for me. Do let me know if there is something else I'm missing. Also still getting the pesky internal error but that's best ignored for now.

mloubout commented 5 months ago

There was a typo in a type definition leading to the error message, fixed in latest master and will be in the next release

slimgroup / JUDI.jl

Error replicating FWI example using NLopt #224