lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1

Solving an inverse problem using "2nd-derivative measurements" of y #208

Closed · DavidBraun97 closed this issue 3 years ago

DavidBraun97 commented 3 years ago

Hi @lululxvi, thanks for your wonderful framework!

I've run into problems trying to solve an inverse problem of the following type: given some measurements of y'' (i.e. accelerations), I want to inversely learn some unknown parameters in my PDE. The NN output y corresponds to deflections.

In the case of direct y-measurements (i.e. finitely many displacements) everything works fine using this BC: dde.DirichletBC(geomtime, ptset.values_to_func(func_y(observeX)), lambda x, _: ptset.inside(x))

Now I want to solve the same problem. However, instead of y-measurements I now only have finitely many measurements of the acceleration y''. In other words, instead of supplying [X_observe, y_observe] I am supplying [X_observe, y''_observe].

Can I solve this type of problem in DeepXDE? If yes, could you provide me with a hint on how to implement the respective BC?

Thank you very much! Cheers, David

lululxvi commented 3 years ago

Yes, you can. Use OperatorBC in a similar way, see https://deepxde.readthedocs.io/en/latest/modules/deepxde.html#deepxde.boundary_conditions.OperatorBC and FAQ for using OperatorBC
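
A minimal sketch of that pattern for a single-output network (all names here are placeholders; y_tt_observe holds the measured second time derivatives at the rows of observe_X, cast to float32 if needed):

ptset = dde.bc.PointSet(observe_X)
y_tt_func = ptset.values_to_func(y_tt_observe)  # point values -> function of x

bc_meas = dde.OperatorBC(
    geomtime,
    # residual: AD second derivative w.r.t. t (input column 1) minus the measurement
    lambda x, y, X: dde.grad.hessian(y, x, i=1, j=1) - y_tt_func(X),
    lambda x, _: ptset.inside(x),
)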

DavidBraun97 commented 3 years ago

Thanks @lululxvi for your reply! I have tried to follow your advice using OperatorBC. Still, I am not sure whether this is the correct way of including the measurements [X_observe, y1''_observe] into my PINN [x,t] --> [y1,y2].

...
def dy1dt(x, y):
  return dde.grad.jacobian(y, x, i=0, j=1)
def dy1dtt(x, y):
  # column 1 of the gradient is the t-derivative, so this is d2y1/dt2
  return tf.gradients(dy1dt(x, y), x)[0][:, 1:2]
...
def func_y1dotdot_measured(x, y, X):
  residual = Y1dotdot_Meas - dy1dtt(x, y)
  return residual
...
ptset = dde.bc.PointSet(observe_X)
...
bcs = [...,
  dde.OperatorBC(geomtime, lambda x, y, _: func_y1dotdot_measured(x, y, observe_X), lambda x, _: ptset.inside(x))]
...
data = dde.data.TimePDE(
  geomtime,
  pde,
  bcs + ics,
  num_domain=4000,
  num_boundary=200,
  num_initial=20,
  num_test=100,
  anchors=observe_X,
  train_distribution="uniform",
)
layer_size = [2] + [50] * 2 + [2]   # [x,t] -> [y1,y2]
...

where Y1dotdot_Meas is a placeholder for the values y1''_observe of the measurements [X_observe, y1''_observe]. I don't think this works, as the shapes of Y1dotdot_Meas and dy1dtt(x,y) are not compatible: dy1dtt is evaluated on all training points, while Y1dotdot_Meas only holds values at the measurement points.

Is this approach valid? Unfortunately, I did not find any similar issues in the FAQ. As such, I'd greatly appreciate your feedback! Thank you very much!

DavidBraun97 commented 3 years ago

In the meantime, inspired by your class PointSetBC, I came up with a different solution. It trains without throwing any errors; however, I have not yet tested whether it actually works as intended.

import numbers

import numpy as np
import tensorflow as tf
from deepxde import gradients as grad

class PointSetBCDotDot(object):
    """Boundary condition for a set of points.
    Compare the second time derivative of the output (associated with `points`) with `values` (target data).
    Args:
        points: An array of points where the corresponding target values are known and used for training.
        values: An array of values that gives the exact solution of the problem.
        component: The output component satisfying this BC.
    """

    def __init__(self, points, values, component):
        self.points = np.array(points)
        if not isinstance(values, numbers.Number) and values.shape[1] != 1:
            raise RuntimeError(
                "PointSetBC should output 1D values. Use argument 'component' for different components."
            )
        self.values = values
        self.component = component

    def second_normal_derivative(self, X, inputs, outputs, beg, end):
        # d2y/dt2 of the selected output component (input column 1 is t)
        dydtt = grad.hessian(outputs, inputs, component=self.component, i=1, j=1)[beg:end]
        return tf.reduce_sum(dydtt, axis=1, keepdims=True)

    def collocation_points(self, X):
        return self.points

    def error(self, X, inputs, outputs, beg, end):
        return self.second_normal_derivative(X, inputs, outputs, beg, end) - self.values

From here I need only one line:

bcs = [...,
  PointSetBCDotDot(observe_X, func_y1dotdot_measured(observe_X), component=1)]

Here, func_y1dotdot_measured provides the target values of y1'' at the points of the discrete set observe_X.

Do you think this alternative approach is valid? As the above way of incorporating measurements of accelerations [X_observe, y1''_observe] (outputs y1 and y2 correspond to deflections in the y and z directions) is not that straightforward, I am still very interested in the "correct" way using OperatorBC. Thank you very much!

lululxvi commented 3 years ago

Both approaches seem OK. For the first approach, you may need PointSet.values_to_func to convert the point values to a function.
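
For example (a sketch only, reusing the names from the snippet above; Y1dotdot_Meas holds the measured values at the rows of observe_X, cast to float32 if needed):

# values_to_func turns the point measurements into a function defined on all
# training points, so its output has the same shape as dy1dtt(x, y).
y1dotdot_target = ptset.values_to_func(Y1dotdot_Meas)

def func_y1dotdot_measured(x, y, X):
  return dy1dtt(x, y) - y1dotdot_target(X)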

DavidBraun97 commented 3 years ago

Hi @lululxvi, thanks for your answer, much appreciated!

Throughout the past few days I have been trying to solve an inverse problem with 2nd derivative measurements (in this case accelerations). For reference, please find a more detailed description of my problem below.

### Measurements (up to 10 kHz) ###
num_meas = 500
# six sensor locations along the beam
x_meas = [0, 0.2, 0.4, 0.6, 0.8, 1]
observe_t = np.vstack(np.linspace(T_init, T_sim, num=num_meas))
# one (x, t) block of shape (num_meas, 2) per sensor, stacked vertically
observe_X = np.vstack(
    [np.column_stack((xi * np.ones(num_meas), observe_t)) for xi in x_meas]
)
time_stamps = ["t=0s", "t=0.2s", "t=0.4s", "t=0.6s"]

ptset = dde.bc.PointSet(observe_X)

### unknown parameter ###
U_learn = tf.Variable(0.002, dtype=tf.float32)

### Helper class for applying acceleration measurements ###
# PointSetBCDotDot as defined in my previous comment above

### PDE stuff ###
# input column 0 is x, input column 1 is t
def dy1dt(x, y):
  return dde.grad.jacobian(y, x, i=0, j=1)
def dy1dtt(x, y):
  return tf.gradients(dy1dt(x, y), x)[0][:, 1:2]   # d2y1/dt2 (t-column)
def dy1dxx(x, y):
  return dde.grad.hessian(y, x, component=0, i=0, j=0)

def dy2dt(x, y):
  return dde.grad.jacobian(y, x, i=1, j=1)
def dy2dtt(x, y):
  return tf.gradients(dy2dt(x, y), x)[0][:, 1:2]   # d2y2/dt2 (t-column)
def dy2dxx(x, y):
  return dde.grad.hessian(y, x, component=1, i=0, j=0)

# force distribution
def q_y1(x,t):
  return -U_learn*w**2*tf.math.cos(w*t)*(tf.math.exp(-0.5*(x - a)**2/sigma**2)) / ((2*tf.constant(np.pi, dtype = tf.float32)*sigma**2)**0.5)
def q_y2(x,t):
  return -U_learn*w**2*tf.math.sin(w*t)*(tf.math.exp(-0.5*(x - a)**2/sigma**2)) / ((2*tf.constant(np.pi, dtype = tf.float32)*sigma**2)**0.5)

# PDEs (gradient column 0 is the x-derivative, column 1 the t-derivative)
def pde_y1(x, y):
  dy1_xt = tf.gradients(y[:, :1], x)[0]
  dy1_x, dy1_t = dy1_xt[:, :1], dy1_xt[:, 1:]

  dy1_xx = tf.gradients(dy1_x, x)[0][:, :1]
  dy1_tt = tf.gradients(dy1_t, x)[0][:, 1:]

  dy1_xxx = tf.gradients(dy1_xx, x)[0][:, :1]
  dy1_xxxx = tf.gradients(dy1_xxx, x)[0][:, :1]
  return (E*I)/(rho*A)*dy1_xxxx + dy1_tt - ((D/2)**2/4)*(dy1_xx*dy1_tt + 2*w*dy1_xx*dy1_t) + c/(rho*A)*dy1_t - q_y1(x[:, :1], x[:, 1:])

def pde_y2(x, y):
  dy2_xt = tf.gradients(y[:, 1:], x)[0]
  dy2_x, dy2_t = dy2_xt[:, :1], dy2_xt[:, 1:]

  dy2_xx = tf.gradients(dy2_x, x)[0][:, :1]
  dy2_tt = tf.gradients(dy2_t, x)[0][:, 1:]

  dy2_xxx = tf.gradients(dy2_xx, x)[0][:, :1]
  dy2_xxxx = tf.gradients(dy2_xxx, x)[0][:, :1]
  return (E*I)/(rho*A)*dy2_xxxx + dy2_tt - ((D/2)**2/4)*(dy2_xx*dy2_tt + 2*w*dy2_xx*dy2_t) + c/(rho*A)*dy2_t - q_y2(x[:, :1], x[:, 1:])

def pde(x,y):
  return [pde_y1(x,y), pde_y2(x,y)]

### Mapping of acceleration measurements (obtained by FEM) ###
# The row lists below select the FEM DOFs of `accelerations` for each
# deflection direction (y: rows 0, 4, 8, ...; z: rows 1, 5, 9, ...).
def _lookup_acceleration(x, fem_rows):
  x, t = x[:, :1], x[:, 1:]
  x_grid = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
  t_grid = np.linspace(T_init, T_sim, num=num_meas)
  dt = (T_sim - T_init) / num_meas
  meas = np.zeros((len(x), 1))
  for k in range(len(x)):
    # spatial index: the FEM node within 0.01 of the sensor position
    idx_i = [r for r, xg in zip(fem_rows, x_grid) if abs(x[k] - xg) <= 0.01][-1]
    # temporal index: the last FEM time step within one sampling interval of t
    idx_t = [j for j, tg in enumerate(t_grid) if abs(t[k] - tg) <= dt][-1]
    meas[k] = accelerations[idx_i, idx_t]
  return meas

def func_y1dotdot(x):   # accelerations in y direction
  return _lookup_acceleration(x, [0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40])

def func_y2dotdot(x):   # accelerations in z direction
  return _lookup_acceleration(x, [1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41])

### MODELLING stuff ###
geom = dde.geometry.Interval(0, L)
timedomain = dde.geometry.TimeDomain(T_init, T_sim)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

def boundary_initial(x, _):
  return np.isclose(x[-1], 0)

bcs = [
  # "physical" BCs
  dde.DirichletBC(geomtime, lambda x: 0, lambda _, on_boundary: on_boundary, component=0),    # y(0,t) = y(1,t) = 0
  dde.DirichletBC(geomtime, lambda x: 0, lambda _, on_boundary: on_boundary, component=1),    # z(0,t) = z(1,t) = 0
  dde.OperatorBC(geomtime, lambda x, y, _: dy1dxx(x, y), lambda _, on_boundary: on_boundary), # dy_dxx(0,t) = dy_dxx(1,t) = 0
  dde.OperatorBC(geomtime, lambda x, y, _: dy2dxx(x, y), lambda _, on_boundary: on_boundary), # dz_dxx(0,t) = dz_dxx(1,t) = 0
  # measurement BCs
  PointSetBCDotDot(observe_X, func_y1dotdot(observe_X), component=0),                         # y''(x,t) = y''_meas
  PointSetBCDotDot(observe_X, func_y2dotdot(observe_X), component=1),                         # z''(x,t) = z''_meas
]

data = dde.data.TimePDE(
  geomtime,
  pde,
  bcs,
  num_domain=4000,
  num_boundary=200,
  num_initial=20,
  num_test=100,
  anchors=observe_X,
  train_distribution="uniform",
)
layer_size = [2] + [50] * 2 + [2]   # [x,t] -> [y1,y2]
activation = "tanh"
initializer = "Glorot uniform"
net = dde.maps.FNN(layer_size, activation, initializer)

model = dde.Model(data, net)

# callbacks for storing results
fnamevar = "variables.dat"
variable = dde.callbacks.VariableValue([U_learn], period=1, filename=fnamevar)

Unfortunately, I am not getting any learning progress so far. It appears that solving an inverse problem from derivative measurements is a non-trivial task, compared to the rather "easy" case where we supply measurements of the PINN output y (i.e. deflections) directly.

Do you have any experience with this use case? Do you think it could help to first "pretrain" a model with deflection measurements and then switch to accelerations? I'd really appreciate any feedback.

Thank you very much. Cheers, David

lululxvi commented 3 years ago

If your code is correct, then one thing you should try is using different weights for the different loss terms. My intuition is that the weight for the second-order-derivative loss should be small.
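
In DeepXDE the weights go into Model.compile via loss_weights. A sketch for the script above (the order is the two PDE residuals first, then the six BCs in the order of bcs; the numeric values are only illustrative):

# 2 PDE residuals, 4 physical BCs, then the 2 acceleration-measurement BCs
model.compile(
    "adam",
    lr=1e-3,
    loss_weights=[1, 1, 1, 1, 1, 1, 0.01, 0.01],
)
model.train(epochs=50000, callbacks=[variable])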

haison19952013 commented 2 years ago

Hi professor @lululxvi, hi @DavidBraun97, thanks for your contribution. Recently, I have faced the same problem. My given data are pairs of X and first-derivative measurements of y. Therefore, I modified your code (i.e., the class PointSetBCDotDot) a little so that it fits my problem. However, during training I see that there is no learning for those derivative terms (i.e., their losses remain unchanged). Did you face a similar problem?

FZUcipher commented 2 years ago

Hi @haison19952013, is it convenient for you to share how you modified it? Thanks!

haison19952013 commented 2 years ago

> Hi @haison19952013, is it convenient for you to share how you modified it? Thanks!

Yes, sure. I modified the PointSetBCDotDot of @DavidBraun97 and changed the name to PointSetBCDot. I'm quite busy at the moment, so I haven't verified this code yet. If you find any problem, please tell me. The code is given below:

import numbers

import numpy as np
import tensorflow as tf
import deepxde as dde

class PointSetBCDot(object):
    """Boundary condition for a set of points.
    Compare the first derivative of the output (associated with `points`) with `values` (target data).
    Args:
        points: An array of points where the corresponding target values are known and used for training.
        values: An array of values that gives the exact solution of the problem.
        component: The output component satisfying this BC.
        j: The input variable the derivative is taken with respect to (e.g. j=1 for t).
    """

    def __init__(self, points, values, component, j):
        self.points = np.array(points)
        if not isinstance(values, numbers.Number) and values.shape[1] != 1:
            raise RuntimeError(
                "PointSetBC should output 1D values. Use argument 'component' for different components."
            )
        self.j = j
        self.values = values
        self.component = component

    def first_normal_derivative(self, X, inputs, outputs, beg, end):
        # dy_component / dx_j via automatic differentiation
        dydt = dde.grad.jacobian(outputs, inputs, i=self.component, j=self.j)[beg:end]
        return tf.reduce_sum(dydt, axis=1, keepdims=True)

    def collocation_points(self, X):
        return self.points

    def error(self, X, inputs, outputs, beg, end):
        return self.first_normal_derivative(X, inputs, outputs, beg, end) - tf.constant(self.values)
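
For reference, a usage sketch (observe_X and observe_dydt are placeholder arrays; note that tf.constant(self.values) requires the values to match the network's float type, typically float32):

observe_dydt = observe_dydt.astype(np.float32)  # dtype must match the graph
bc_dot = PointSetBCDot(observe_X, observe_dydt, component=0, j=1)  # dy0/dt data
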
forxltk commented 2 years ago

@haison19952013

Instead of modifying the PointSetBC, could you try an additional output y=dy/dx so that it will be a Dirichlet BC?

haison19952013 commented 2 years ago

> Instead of modifying the PointSetBC, could you try an additional output y=dy/dx so that it will be a Dirichlet BC?

I understand your point. But here I want to utilize AD to estimate the first derivative of the output and compare the estimated derivative to the ground truth. If we add an additional output, the model is more or less similar to conventional deep learning.

forxltk commented 2 years ago

> I understand your point. But here I want to utilize AD to estimate the first derivative of the output and compare the estimated derivative to the ground truth. If we add an additional output, the model is more or less similar to conventional deep learning.

No, I mean you can use the following pde:

def pde(x, y):
    main_pde = ...
    # second output channel minus the AD derivative of the first one
    extra_output = y[:, 1:2] - dde.grad.jacobian(y, x, i=0, j=0)
    return [main_pde, extra_output]

Then the second output of your model will be dy/dx estimated by AD. You can use PointSetBC directly, without any modification, to compare dy/dx with the ground truth.

haison19952013 commented 2 years ago

> Then the second output of your model will be dy/dx estimated by AD. You can use PointSetBC directly, without any modification, to compare dy/dx with the ground truth.

Hmm, I see. In that case you add one more loss term to the training... I'm not sure whether that approach is good or not. Have you investigated both approaches (the one with the modified PointSetBC and the one without)?

forxltk commented 2 years ago

> Hmm, I see. In that case you add one more loss term to the training... I'm not sure whether that approach is good or not. Have you investigated both approaches (the one with the modified PointSetBC and the one without)?

No, I haven't. But to avoid the additional loss term, you can use net.apply_output_transform and append the first derivative with tf.concat. Since your losses remain unchanged, I am not sure whether your modification or the rest of the implementation is wrong. You can verify your PointSetBCDot with a different approach.

haison19952013 commented 2 years ago

> No, I haven't. But to avoid the additional loss term, you can use net.apply_output_transform and append the first derivative with tf.concat. Since your losses remain unchanged, I am not sure whether your modification or the rest of the implementation is wrong. You can verify your PointSetBCDot with a different approach.

I see. Do you have a working example (i.e., an implementation) of your approach?

forxltk commented 2 years ago

@haison19952013 Here I use PointSetBC instead of NeumannBC for the Euler beam example. The L2 relative error is 1.24e-03. I also tried your PointSetBCDot, and the result is good as well.

"""Backend supported: tensorflow.compat.v1, tensorflow, pytorch"""
import deepxde as dde
import numpy as np
import tensorflow as tf

def ddy(x, y):
    return dde.grad.hessian(y, x, component=0)

def dddy(x, y):
    return dde.grad.jacobian(ddy(x, y), x)

def pde(x, y):
    dy_xx = ddy(x, y)
    dy_xxxx = dde.grad.hessian(dy_xx, x)

    return [dy_xxxx + 1]

def boundary_l(x, on_boundary):
    return on_boundary and np.isclose(x[0], 0)

def boundary_r(x, on_boundary):
    return on_boundary and np.isclose(x[0], 1)

def func(x):
    return -(x ** 4) / 24 + x ** 3 / 6 - x ** 2 / 4

geom = dde.geometry.Interval(0, 1)
# Enforce y'(0) = 0 through data on the derivative output (component 1);
# func(0) = 0 happens to equal the target derivative value here.
observe_x = np.array([[0.], [0.]])
observe_y = func(observe_x)

bc1 = dde.icbc.DirichletBC(geom, lambda x: 0, boundary_l, component=0)
#bc2 = dde.icbc.NeumannBC(geom, lambda x: 0, boundary_l)
bc2 = dde.icbc.PointSetBC(observe_x, observe_y, component=1)
bc3 = dde.icbc.OperatorBC(geom, lambda x, y, _: ddy(x, y), boundary_r)
bc4 = dde.icbc.OperatorBC(geom, lambda x, y, _: dddy(x, y), boundary_r)

data = dde.data.PDE(
    geom,
    pde,
    [bc1, bc2, bc3, bc4],
    num_domain=10,
    num_boundary=2,
    #solution=func,
    num_test=100,
)
layer_size = [1] + [20] * 3 + [1]
activation = "tanh"
initializer = "Glorot uniform"
net = dde.nn.FNN(layer_size, activation, initializer)

def modify_output(X, Y):
    # Append dy/dx (from AD) as a second output channel: [y, dy/dx]
    dydx = tf.gradients(Y, X)[0]
    return tf.concat([Y, dydx], axis=1)

net.apply_output_transform(modify_output)

model = dde.Model(data, net)
model.compile("adam", lr=0.001)#, metrics=["l2 relative error"])

losshistory, train_state = model.train(epochs=10000)
dde.saveplot(losshistory, train_state, issave=True, isplot=True)

X = np.linspace(0, 1, 100)
X = X.reshape(100, 1)
y_pred = model.predict(X)[:, 0:1]
y_true = func(X)

# Errors
y_error = dde.metrics.l2_relative_error(y_true, y_pred)
print('Relative L2 error_y: {:.2e}'.format(y_error))
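
Since the second network output is the AD first derivative, it can be checked against the analytic slope as well (a small addition to the snippet above; dfunc is the derivative of func):

def dfunc(x):
    return -(x ** 3) / 6 + x ** 2 / 2 - x / 2

dy_pred = model.predict(X)[:, 1:2]
dy_error = dde.metrics.l2_relative_error(dfunc(X), dy_pred)
print('Relative L2 error_dy: {:.2e}'.format(dy_error))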