lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1

2D Inverse Advection Diffusion problem will not converge to correct parameter. #251

Closed KetilIversen closed 2 years ago

KetilIversen commented 3 years ago

Thank you so much for this awesome library. I've been trying to solve the forward and inverse 2D advection-diffusion equation (ADE). The forward problem converges fine with decent accuracy. However, when I try to do the inverse problem, the unknown parameter never converges to the correct value (which also leads to large losses).

In my case the ADE is given as

$$\frac{\partial u}{\partial t} + S_x\,\frac{\partial u}{\partial x} + S_y \cos(3\pi t)\,\frac{\partial u}{\partial y} = D\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) + f(x, y, t)$$

where

$$S_x = 2.0,\qquad S_y = 3.0,\qquad D = 0.1,\qquad f(x, y, t) = 70\,e^{-1000\left((x - 0.2)^2 + (y - 0.5)^2\right)}$$

The domain is 0<x<1, 0<y<1 and 0<t<1. For simplicity I've set the boundaries and initial condition to u=0.

This is supposed to simulate a simple toy problem where some source f(x, y, t) leaks a contaminant u into the time-dependent velocity field [2.0, 3.0 cos(3πt)].

I am able to solve the forward problem relatively well with a ([3] + [20] * 8 + [1], "tanh", "Glorot normal") network. However, the inverse problem never converges to the correct parameter. For the inverse problem I'd like to infer Sx, the x-velocity, i.e.

$$\frac{\partial u}{\partial t} + S_x\,\frac{\partial u}{\partial x} + 3.0\cos(3\pi t)\,\frac{\partial u}{\partial y} = 0.1\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) + f(x, y, t),\qquad S_x\ \text{unknown.}$$

The 'exact' solution is contained in the 2D_ADE_matlab.mat file, which holds a 100x100x100-point solution for u over the domain. I select a random sample of these points (around 10,000+) as training data for the inverse problem. However, during training the predicted Sx just keeps increasing until the end of training, without converging to anything.

Things I have tried so far:

  - Varying the training points (the number of points in the random sample, as well as num_domain, num_initial, num_boundary).
  - Training for more epochs. (As Sx just keeps increasing, this does not seem like a likely solution to me.)
  - Decreasing the learning rate. (Same result, only Sx increases more slowly.)
  - Adjusting loss_weights. (For example, increasing the weight for observe_u or the PDE by 100. The default values do not lead to convergence either.)
  - Scaling the network output with apply_output_transform(). (I'm not sure exactly how this works, but scaling by larger values doesn't change much; see the sketch after this list.)
  - Playing with different activation functions.
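
Regarding apply_output_transform: as I understand it, the lambda receives the network input x and the raw network output y, and its return value is what DeepXDE then uses in the PDE residual and in the BC/IC and data losses. Below is a minimal sketch (assuming the FNN named net from the script further down): instead of plain rescaling (y * 2), the transform could also be used to enforce the zero initial condition exactly by multiplying the output by t.

# Sketch only: this would replace the "y * 2" output scaling in the script below.
# Multiplying the raw output by the time coordinate (x[:, 2:3] is t in this
# input ordering) makes u(x, y, 0) = 0 hold exactly, so the IC loss term
# becomes trivially zero.
net.apply_output_transform(lambda x, y: x[:, 2:3] * y)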

Unfortunately, the problem still persists.

The pure forward problem reaches losses of [5.22e-04, 1.07e-03, 2.78e-04] after 30,000 epochs and gives an output (at t=1) like:

[Figure: forward-problem prediction of u at t=1, with the corresponding loss history]

For the inverse problem I end up with losses of [3.55e-02, 3.52e-03, 5.16e-04, 3.26e-01] after 50,000 epochs and with Sx = 5.24 (I am aiming for Sx = 2.0). The output (at t=1) in this case is:

[Figure: inverse-problem prediction of u at t=1, with the corresponding loss history]

As we can see, it still captures the qualitative features, but it predicts the wrong x-velocity and also struggles with poor accuracy.

I'm hoping you have some ideas of what I am doing wrong. Thanks!

The code for the inverse problem is:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import matplotlib.pyplot as plt 
import matplotlib.gridspec as gridspec
import deepxde as dde
import tensorflow as tf
import scipy.io

import matplotlib.animation
from IPython.display import HTML

def gen_testdata():
    t = np.linspace(0, 1, 100)
    x = np.linspace(0, 1, 100)
    y = np.linspace(0, 1, 100)
    xx, yy, tt = np.meshgrid(x,y,t)
    X = np.vstack((np.ravel(xx), np.ravel(yy), np.ravel(tt))).T
    return X

def gen_MATLABdata():
    data = scipy.io.loadmat("/content/2D_ADE_Matlab.mat")
    t, x, y, u_exact = data["timesteps"], data["x"], data["y"], data["u"].T
    xx, yy, tt = np.meshgrid(x,y,t)
    X = np.vstack((np.ravel(xx), np.ravel(yy), np.ravel(tt))).T
    u = u_exact.flatten()[:, None]
    return X, u

def pde(x, u):
    # x = (x, y, t); u is the network output
    du_x = dde.grad.jacobian(u, x, i=0, j=0)
    du_y = dde.grad.jacobian(u, x, i=0, j=1)
    du_t = dde.grad.jacobian(u, x, i=0, j=2)
    du_xx = dde.grad.hessian(u, x, i=0, j=0)
    du_yy = dde.grad.hessian(u, x, i=1, j=1)
    # Residual of u_t + S_x u_x + 3 cos(3 pi t) u_y = 0.1 (u_xx + u_yy) + f
    return (
        du_t
        + S_x * du_x
        + 3.0 * tf.cos(3 * np.pi * x[:, 2:]) * du_y
        - 0.1 * du_xx
        - 0.1 * du_yy
        - 70 * tf.exp(-1000 * ((x[:, 0:1] - 0.2) ** 2 + (x[:, 1:2] - 0.5) ** 2))
    )

X_matlab, u_matlab = gen_MATLABdata()
S_x = tf.Variable(1.0)
rnd_idx = np.random.choice(np.array(range(len(X_matlab))), 10000, replace=False)
rnd_x = X_matlab[rnd_idx]
rnd_u = u_matlab[rnd_idx]

observe_x = rnd_x
observe_u = dde.PointSetBC(observe_x,rnd_u)

geom = dde.geometry.Rectangle([0,0], [1,1])
timedomain = dde.geometry.TimeDomain(0, 1)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

bc = dde.DirichletBC(geomtime, lambda x: 0.0, lambda _, on_boundary: on_boundary)
ic = dde.IC(geomtime, lambda x: 0.0, lambda _, on_initial: on_initial)

data = dde.data.TimePDE(geomtime, pde, [bc, ic, observe_u], num_domain=10000, num_boundary=500, num_initial=500,
                       anchors=observe_x, num_test=10000)

net = dde.maps.FNN([3] + [20] * 8 + [1], "tanh", "Glorot normal")
net.apply_output_transform(lambda x, y: y*2)
model = dde.Model(data, net)

model.compile("adam", lr=0.001, loss_weights = [1, 1, 1, 100])
variable = dde.callbacks.VariableValue([S_x], period=1000)
losshistory, train_state = model.train(epochs=50000, callbacks=[variable])

#PLOTTING
X = gen_testdata()
u_pred = model.predict(X)
X_plot = X.reshape(100,100,100,3)
u_plot = u_pred.reshape(100,100,100)
x_test = X_plot[:,:,0,0]
y_test = X_plot[:,:,0,1]

fig = plt.figure(figsize=(8, 8))
def animate(i):
    plt.clf()
    plt.pcolor(x_test, y_test, u_plot[:,:,i], cmap='rainbow', vmin=0.0, vmax=0.5)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.colorbar()
ani = matplotlib.animation.FuncAnimation(fig, animate, frames=range(0,100), interval=50)
HTML(ani.to_jshtml())
engsbk commented 3 years ago

I'm experiencing the same with the inverse 1D Burgers equation. Did you try changing the initial guess of S_x?

lululxvi commented 3 years ago

If the forward problem is correct, then the code should be OK. You may try:

  1. different initial guess of S_x
  2. the two-step training procedure described on page 6 of https://doi.org/10.1371/journal.pcbi.1007575 (a minimal sketch is given below).
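
For reference, a minimal sketch of how suggestion 2 could look. It assumes a newer DeepXDE release where unknown parameters are created with dde.Variable and are only optimized when passed to compile via external_trainable_variables (with the TF1-style script above, every tf.Variable is trained automatically, so the warm-up stage would need a non-trainable variable instead); the epoch counts are placeholders.

# Two-step training sketch (assumes dde.Variable and the
# external_trainable_variables argument of model.compile).
# This S_x replaces the "S_x = tf.Variable(1.0)" line so that pde() uses it.
S_x = dde.Variable(1.0)

model = dde.Model(data, net)

# Step 1: warm up the network while S_x stays at its initial guess
# (it is not registered with the optimizer here).
model.compile("adam", lr=1e-3, loss_weights=[1, 1, 1, 100])
model.train(epochs=20000)

# Step 2: continue training with S_x included as a trainable variable.
variable = dde.callbacks.VariableValue([S_x], period=1000)
model.compile("adam", lr=1e-3, loss_weights=[1, 1, 1, 100],
              external_trainable_variables=[S_x])
model.train(epochs=30000, callbacks=[variable])
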
KetilIversen commented 3 years ago

Unfortunately, changing the initial guess still leads to the value of S_x growing. I've done some experimenting by reducing the problem to 1D. In the 1D case it works well and is really robust to changes in the initial value of S_x. It is very strange that the 2D case does not want to cooperate. I will have a look at the paper you sent.

Also, if I may ask some extra general questions:

  1. Whenever the model prints the losses, for example [3.55e-02, 3.52e-03, 5.16e-04, 3.26e-01], do these values correspond to the terms in the following line?

data = dde.data.TimePDE(geomtime, pde, [bc, ic, observe_u], num_domain=10000, num_boundary=500, num_initial=500, anchors=observe_x, num_test=10000)

So 3.55e-02 would be the loss for pde, and 3.52e-03 would be the loss for bc and so on?

  2. Also, when scaling the losses with loss_weights, is it good practice to scale them so they are all of the same order as each other, or is the scaling done more by trial and error?

  3. And finally, as far as I can tell, when you do inverse problems you provide the model with a set of N data points. So when I set num_domain=10000, num_boundary=500, num_initial=500, this adds another 11,000 points for the model to train on, for a total of N + 11,000 points. Is this correct? Is it even necessary to have these extra points since the model is already fed the N data points? What is the difference? At first I thought num_domain would select a subset of the N data points fed to the model in observe_u to train on.

Thanks!

lululxvi commented 3 years ago

  1. Yes
  2. Yes. But sometimes the data loss can be slightly larger.
  3. It is not necessary. Sometimes adding more points for training could improve the accuracy.

You may try:

  1. Initialize S_x to the true value and set it to be not trainable (see the sketch after this list);
  2. Try another f with a slower decay.
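
A minimal sketch of suggestion 1, assuming the TF1-style script above: replace only the S_x definition and keep everything else unchanged, so that the network weights are the only trainable parameters.

# Diagnostic: pin S_x at the true value (2.0) as a non-trainable variable, so
# only the network weights are optimized. If the losses then drop to
# forward-problem levels, the data/PDE setup is fine and the difficulty lies
# in optimizing S_x itself.
S_x = tf.Variable(2.0, trainable=False)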