lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1
2.71k stars 753 forks source link

Evaluating a model using another models output and using apply_output_transform to create multiple models #355

Closed CarlArblue closed 2 years ago

CarlArblue commented 3 years ago

Hello Mr. Lulu,

Is it possible in DeepXDE to evaluate our model during training time with another models output? If we have two models zeta and phi can we evaluate phi at zeta? In Tensorflow we can go about this by zeta = zeta_model([x_tf,t_tf]), phi=phi_model([x_tf,zeta,t_tf]).

Also how would one go about creating two models or separate dense layers using apply_output_transform?

Say we have our model as:

FNN_layer = [3] + [60] * 1 + [2] net = dde.maps.FNN(FNN_layer, "tanh", "Glorot uniform") net.apply_output_transform(?) model = dde.Model(data, net)

Is it correct that our apply_output_transform should be:

def apply_output_transform(input, outputs): x = input[:,0:1]
z = input[:,1:2]
t = input[:,2:3]

x_t = tf.concat([x,t])
x_z_t = tf.concat([x,z,t])

# ZETA #

zeta_layer_size = 40

zeta = tf.layers.Dense(x_t, zeta_layer_size, tf.nn.tanh)
zeta = tf.layers.Dense(zeta, zeta_layer_size, tf.nn.tanh)
zeta_output = tf.layers.Dense(zeta, 1, None)

zeta_concat = tf.concat([x_z_t, zeta_output])

# PHI #

phi_layer_size = 80

phi = tf.layers.Dense(zeta_concat, phi_layer_size, tf.nn.tanh)
phi = tf.layers.Dense(phi, phi_layer_size, tf.nn.tanh)
phi = tf.layers.Dense(phi, phi_layer_size, tf.nn.tanh)
phi_output = tf.layers.Dense(phi, 1, None)

final_output = tf.concat([zeta_output, phi_output], axis=1)

return final_output

Would this be the correct way to go about it? Or are we missing something.

Thanks!

lululxvi commented 3 years ago

Yes, you are correct. Because you will construct the network by yourself, you may use

FNN_layer = [3, 2]
CarlArblue commented 3 years ago

Thank you. I got it to work!

LanPeng-94 commented 3 years ago

Dear CarlArblue and Lu: I'm sorry to disturb you. I have tried to apply the code you posted to a simple problem, but it has not worked, can you give me any advice? I want to identify K code as follows:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf
def main():
    def pde(x, y):
        u, K = y[:, 0:1], y[:, 1:2]
        du_x, du_t = tf.gradients(u, x)[0][:, 0:1], tf.gradients(u, x)[0][:, 1:2]
        du_xx = tf.gradients(du_x, x)[0][:, 0:1]
        W = -2*x[:, 1:2]**2 - (-1 + x[:, 0:1])*x[:, 0:1] + 2*x[:, 1:2]*(1 - x[:, 0:1] + x[:, 0:1]**2)
        eq = du_t-du_xx-W
        return eq

    def func1(x):
        return x[:, 0:1] * x[:, 1:2]*(1 - x[:, 0:1]) * (1 - x[:, 1:2])

    def func2(x):
        return x[:, 0:1] * x[:, 1:2]*(1 - x[:, 0:1]) * (1 - x[:, 1:2])

    def func(x):
        return np.hstack((x[:, 0:1] * x[:, 1:2]*(1 - x[:, 0:1]) * (1 - x[:, 1:2]), x[:, 0:1] * x[:, 1:2]*(1 - x[:, 0:1]) * (1 - x[:, 1:2])))

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    def boundary_l(x, on_boundary):
        return on_boundary and np.isclose(x[0], 0)

    def boundary_r(x, on_boundary):
        return on_boundary and np.isclose(x[0], 1)

    bc1 = dde.DirichletBC(geomtime, func1, boundary_l, component=0)
    bc2 = dde.DirichletBC(geomtime, func1, boundary_r, component=0)
    ic = dde.IC(geomtime, func1, lambda _, on_initial: on_initial, component=0)

    x = np.linspace(0, 1, num=20)
    t = np.linspace(0, 1, num=20)
    X, T = np.meshgrid(x, t)
    observe_x = np.vstack((X.flatten(), T.flatten())).T
    ob_y = func1(observe_x)
    observe_y = dde.PointSetBC(observe_x, ob_y)

    data = dde.data.PDE(
        geomtime,
        pde,
        [bc1, bc2, ic, observe_y],  #
        # [observe_y],  #
        num_domain=500,
        num_boundary=200,
        anchors=observe_x,
        solution=func,
        # num_test=1000,
    )

    net = dde.maps.FNN([2] + [150] * 3 + [2], "tanh", "Glorot uniform")

    def modify_output(X, y):
        x, t = X[:, 0:1], X[:, 1:2]
        x_t = tf.concat([x, t], axis=1)
        # u, K = y[:, 0:1], y[:, 1:2]
        # u #
        u_layer_size = 60
        u = tf.layers.dense(x_t, u_layer_size, tf.nn.tanh)
        u = tf.layers.dense(u, u_layer_size, tf.nn.tanh)
        u = tf.layers.dense(u, u_layer_size, tf.nn.tanh)
        u = tf.layers.dense(u, u_layer_size, tf.nn.tanh)
        u = tf.layers.dense(u, u_layer_size, tf.nn.tanh)
        u = tf.layers.dense(u, u_layer_size, tf.nn.tanh)
        u_output = tf.layers.dense(u, 1, None)
        u_concat = tf.concat([x_t, u_output], axis=1)

        # K #
        K_layer_size = 60
        K = tf.layers.dense(u_concat, K_layer_size, tf.nn.tanh)
        K = tf.layers.dense(K, K_layer_size, tf.nn.tanh)
        K = tf.layers.dense(K, K_layer_size, tf.nn.tanh)
        K = tf.layers.dense(K, K_layer_size, tf.nn.tanh)
        K = tf.layers.dense(K, K_layer_size, tf.nn.tanh)
        K = tf.layers.dense(K, K_layer_size, tf.nn.tanh)
        K_output = tf.layers.dense(K, 1, None)
        final_output = tf.concat([u_output, K_output], axis=1)
        return final_output

    net.apply_output_transform(modify_output)

    model = dde.Model(data, net)

    model.compile("adam", lr=1e-3, metrics=["l2 relative error"])
    losshistory, train_state = model.train(epochs=20000)
    model.compile("L-BFGS-B", metrics=["l2 relative error"])
    losshistory, train_state = model.train()
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)

if __name__ == "__main__":
    main()
lululxvi commented 3 years ago

What do you mean by "not worked"?

AlirezaPa commented 2 years ago

Thank you @lululxvi for introducing this page to me. I have a short inverse code based on mixed formulation of the 1D-wave equation. Currently, wave velocity C is a function of (x,t), and the code technically works, but when I added the "def modify_output(x, y)" (###New Part###) to convert C(x,t) to C(x) similar the procedure that @979736316 and @CarlArblue did for their codes, I got an error message that I could not fix it. Below is my code and the error message that I relieved after adding "def modify_output(x, y)". I appreciate your help.

import deepxde as dde import numpy as np from deepxde.backend import tf

L_domain = 1

def pde(x, y): u, sigma, C = y[:,0:1], y[:,1:2], y[:,2:3] dsigma_x = dde.grad.jacobian(y, x, i=1, j=0) du_tt = dde.grad.hessian(y, x, component=0, i=1, j=1) du_x = dde.grad.jacobian(y, x, i=0, j=0) return [ dsigma_x - du_tt, sigma - (C * 2)du_x ]

def boundary_l(x, on_boundary): return on_boundary and np.isclose(x[0], 0)

geom = dde.geometry.Interval(0, L_domain) timedomain = dde.geometry.TimeDomain(0, 5) geomtime = dde.geometry.GeometryXTime(geom, timedomain)

bc = dde.DirichletBC(geomtime, lambda x: -np.sin(x[:, 1:]), boundary_l, component=1) ic1 = dde.IC(geomtime, lambda x: 0, lambda , on_initial: on_initial, component=0) ic2 = dde.OperatorBC(geomtime, lambda x, y, : dde.grad.jacobian(y, x, i=0, j=1), lambda x, _: np.isclose(x[1], 0))

observe_x = np.vstack((np.full((5001), 0.00), np.linspace(0, 5, num=5001))).T data_exact = np.loadtxt('/content/gdrive/MyDrive/Colab Notebooks/Thesis-Examples/sint-c1-5-5s-20m.txt') u_ex = data_exact[:,1] u_ex1 = u_ex.reshape(5001,1) observe_y = dde.icbc.PointSetBC(observe_x, u_ex1, component=0)

data = dde.data.TimePDE( geomtime, pde, [bc, ic_1, ic_2, observe_y], num_domain=1000, num_boundary=300, num_initial=300, anchors=observe_x,
num_test=10000, )

net = dde.nn.FNN([2] + [10, 10, 10] * 3 + [3], "tanh", "Glorot normal")

####################### New Part ################################### def modify_output(x, y):
C_output = tf.layers.dense(x[:,0:1], 10, tf.nn.tanh) C_output = tf.layers.dense(C_output, 10, tf.nn.tanh) C_output = tf.layers.dense(C_output, 10, tf.nn.tanh) C_output = tf.layers.dense(C_output, 1, None) final_output = tf.concat([y[:, 2:3], C_output], axis=1) return final_output net.apply_output_transform(modify_output) ###################################################################

model = dde.Model(data, net) model.compile("adam", lr=0.001) losshistory, train_state = model.train(epochs=10000) dde.saveplot(losshistory, train_state, issave=True, isplot=True)

image image

forxltk commented 2 years ago

Could you try final_output = tf.concat([y[:, 0:1], y[:, 1:2], C_output], axis=1)

AlirezaPa commented 2 years ago

Thank you very much @forxltk now C is just a function of x. But the values of C is not good. However the results of u is great. I used these conditions: (the beginning of the code is as above in the previous comment)

data = dde.data.TimePDE( geomtime, pde, [bc_2, ic_1, ic_2, observe_y], num_domain=2000, num_boundary=1000, num_initial=500, anchors=observe_x,
num_test=10000, )

net = dde.nn.FNN([2] + [30, 30, 30] * 3 + [3], "tanh", "Glorot normal")

def modify_output(x, y):
C_output = tf.layers.dense(x[:,0:1], 30, tf.nn.tanh) C_output = tf.layers.dense(C_output, 30, tf.nn.tanh) C_output = tf.layers.dense(C_output, 30, tf.nn.tanh) C_output = tf.layers.dense(C_output, 1, None) final_output = tf.concat([y[:, 0:1], y[:, 1:2], C_output], axis=1) return final_output net.apply_output_transform(modify_output) model = dde.Model(data, net)

model.compile("L-BFGS")

losshistory, train_state = model.train() dde.saveplot(losshistory, train_state, issave=True, isplot=True)

Th following figure is the result of u (y[:, 0:1]) which is great. But all the values of C(x) is between 8 and 8.4, but the true solutions are C(x)=1 for 0<x<0.4 and C(x)=5 for 0.4<x<1 (P.N: The code works well and can identify C(x) when the problem is homogeneous with the true solutions C(x)=1 for 0<x<1)

image

Do you have any suggestions to improve C(x)?

forxltk commented 2 years ago

The identification of your C(x) could be tricky. What about the performance of the code that C(x)=1 for 0<x<0.4 and C(x)=2 for 0.4<x<1. The predicted value 8~8.4 deviates too much from the true value that the network is not work well.

AlirezaPa commented 2 years ago

Thank you @forxltk . When I used C(x)=1 for all 0<x<1 the code works well and the predicted values are 0.998 to1.001. But when I used two layers C(x)=1 and C(x)=2 in the domain, the predicted values are too bad such as 8 to 8.4 or 7 to 7.3. Do you think PINN is a good way to predict C as vector, or is it good to just predict scalar values? Any suggestions to modify my current code?

forxltk commented 2 years ago

I haven't had a similar problem yet. Maybe you should have a look at the following example https://github.com/lululxvi/deepxde/blob/master/examples/pinn_inverse/diffusion_reaction_rate.py

AlirezaPa commented 2 years ago

@forxltk I got the following result for C(x)=1 and C(x)=2 with these conditions. I increased the layers of C_output network from 3 to 5 and changed the initialization from Glorot normal to Glorot uniform.

data = dde.data.TimePDE( geomtime, pde, [bc_2, ic_1, ic_2, observe_y], num_domain=2000, num_boundary=500, num_initial=200, anchors=observe_x,
num_test=10000, )

net = dde.nn.FNN([2] + [10, 10, 10] * 2 + [3], "tanh", "Glorot uniform")

def modify_output(x, y):
C_output = tf.layers.dense(x[:,0:1], 20, tf.nn.tanh) C_output = tf.layers.dense(C_output, 20, tf.nn.tanh) C_output = tf.layers.dense(C_output, 20, tf.nn.tanh) C_output = tf.layers.dense(C_output, 20, tf.nn.tanh) C_output = tf.layers.dense(C_output, 20, tf.nn.tanh) C_output = tf.layers.dense(C_output, 1, None) final_output = tf.concat([y[:, 0:1], y[:, 1:2], C_output], axis=1) return final_output

net.apply_output_transform(modify_output)

model = dde.Model(data, net) model.compile("L-BFGS")

image

Do you have any idea for other hyper-parameters to improve the results?

forxltk commented 2 years ago

I am afraid I haven't. As I said before, identifying non-continuous functions is somewhat difficult.

lululxvi commented 2 years ago

@AlirezaPa I would suggest first trying a continuous C to test the code.

AlirezaPa commented 2 years ago

Thank you @lululxvi and @forxltk . Actually, I changed my benchmark problems and backed again to simple Inverse wave with Dirichlete boundary condition. I got some good results with continuous C then I just tried to use hard initial and boundary conditions for the following inverse problem:

image

import deepxde as dde import numpy as np from deepxde.backend import tf

def pde(x, y): u, C = y[:, 0:1], y[:, 1:2] du_tt = dde.grad.hessian(u, x, i=1, j=1) du_x = dde.grad.jacobian(u, x, j=0) du_xx = dde.grad.jacobian((C * 2)du_x, x, j=0)
return du_tt - du_xx

geom = dde.geometry.Interval(0, 1) timedomain = dde.geometry.TimeDomain(0, 1) geomtime = dde.geometry.GeometryXTime(geom, timedomain)

observe_x = np.vstack((np.full((1001), 0.5), np.linspace(0, 1, num=1001))).T

data_exact = np.loadtxt('/content/gdrive/MyDrive/Colab Notebooks/Thesis-Examples/bothends-sinpit-smoothly-1s-x05.txt') u_ex = data_exact[:,1] u_ex1 = u_ex.reshape(1001,1)

observe_u = dde.icbc.PointSetBC(observe_x, u_ex1, component=0)

data = dde.data.TimePDE(geomtime, pde, [observe_u], num_domain=1000, anchors=observe_x, num_test=10000)

net = dde.nn.PFNN([2] + [20, 20] * 2 + [2], "tanh", "Glorot uniform")

def modify_output(x, y):
C_output = tf.layers.dense(x[:,0:1], 20, tf.nn.tanh) C_output = tf.layers.dense(C_output, 20, tf.nn.tanh)
C_output = tf.layers.dense(C_output, 1, None) u_output = x[:, 1:2] x[:, 1:2] x[:, 0:1] (1 - x[:, 0:1]) y[:, 0:1] + tf.sin(np.pi * x[:, 0:1]) final_output = tf.concat([u_output, C_output], axis=1) return final_output

net.apply_output_transform(modify_output)

model = dde.Model(data, net)

model.compile("L-BFGS")

losshistory, train_state = model.train() dde.saveplot(losshistory, train_state, issave=True, isplot=False)

I got the good results with soft I.C and B.C, but using this modified code for hard I.C and B.C and the same network architecture and same number of training data the results are very bad. Do you think this code is correct? The hard condition is "u_output" in def modify_output(x, y)

forxltk commented 2 years ago
  1. There are double t in u_output.
  2. du_t is not your NN output, where a soft BC is still required.
AlirezaPa commented 2 years ago

Thanks @forxltk . I used double t in u_output to satisfy du_t(x,0)=0. Is it wrong?

forxltk commented 2 years ago

OK, I see. The hard BC seems good. Maybe the transform make your output not O(1)? What about using hard BC only for DirichletBC so that you can find where is the problem?

LanPeng-94 commented 2 years ago

OK, I see. The hard BC seems good. Maybe the transform make your output not O(1)? What about using hard BC only for DirichletBC so that you can find where is the problem?

Hi I applied a hard constraint to the inversion of the permeability coefficients of the Darcy flow, and the results are as follows. It can be seen that the hard constraint gives satisfactory results, however, the PINNs without the hard constraint applied to fail. image

AlirezaPa commented 2 years ago

OK, I see. The hard BC seems good. Maybe the transform make your output not O(1)? What about using hard BC only for DirichletBC so that you can find where is the problem?

Thank you. I got the good results with hard BC only as you suggested. I think hard conditions does not work for du_t(x,0)=0, however, Lagaris (1998) proposed the formula for that.

AlirezaPa commented 2 years ago

OK, I see. The hard BC seems good. Maybe the transform make your output not O(1)? What about using hard BC only for DirichletBC so that you can find where is the problem?

Hi I applied a hard constraint to the inversion of the permeability coefficients of the Darcy flow, and the results are as follows. It can be seen that the hard constraint gives satisfactory results, however, the PINNs without the hard constraint applied to fail. image

Nice. Is your hard constraint just for Dirichlet boundary conditions?

LanPeng-94 commented 2 years ago

Yes. In this study, the equation's boundary condition is Dirichlet's boundary condition.

AlirezaPa commented 2 years ago

Hi @forxltk . Is there any way to define initial condition such as u(x,0)=1 just for x=0.25, 0.5, 0.75 and u(x,0)=0 for the rest? If yes, can we also define these conditions as hard constraint? I could not find any similar example.

forxltk commented 2 years ago
  1. You can pass the reference solution u to IC like the Beltrami_flow example.
  2. Or you can sample the initial points manually as anchor points. Then use Dirichlet BC to define your conditions.
  3. I haven't had a good idea for your hard constraint.
AlirezaPa commented 2 years ago

Hello @forxltk and @979736316 . The following is some portion of "Lorenz system" code of Dr. Lu. It works well, and three variables of C are updated as it is shown:

callbacks for storing results

variable = dde.callbacks.VariableValue([C1, C2, C3], period=1000") model.train(epochs=10000, callbacks=[variable])

image

But the output file just shows the number of iterations and shows "nan" for the variables.

callbacks for storing results

fnamevar = "variables.dat" variable = dde.callbacks.VariableValue([C1, C2, C3], period=1000, filename=fnamevar)

image

I need the output file because the variables in the screen just show two numbers after decimal points such as 7.62e+00, but I need more numbers after decimal points. Or, if there is any way to show the variables with more digits in the screen, it also works for me. Do you know how to fix this issue? Thank you.

forxltk commented 2 years ago

@AlirezaPa #633

AlirezaPa commented 2 years ago

Thank you so much @forxltk

ghost commented 1 year ago

Dear Lu: Sorry to disturb you. I try to apply the same method in Pytorch, but the loss do not descrease, do I make some mistakes? The following is my code.

`data = dde.data.PDE( geometry=geomtime, pde=Primitive_Equations, bcs=[observe_temp, observe_sal, observe_w, observe_v1, observe_v2],

bcs=[observe_v1, observe_v2],

num_domain=500,

)

Neural Network setup

layer_size = [4] + [128] * 6 + [12] activation = "tanh" initializer = "Glorot uniform" net = dde.nn.pytorch.FNN(layer_size, activation, initializer)

def modify_output(x, y): r = x[:, 0:1] theta = x[:, 1:2] phi = x[:, 2:3] t = x[:, 3:4]

r_theta_phi_t = torch.cat([r, theta, phi, t], dim=1)
theta_phi = torch.cat([theta, phi], dim=1)

NN_argo = nn.Sequential(
    nn.Linear(4, 128),
    nn.Tanh(),
    nn.Linear(128, 128),
    nn.Tanh(),
    nn.Linear(128, 2)
)
NN_argo
NN_argo_output = NN_argo(r_theta_phi_t)

NN_cur = nn.Sequential(
    nn.Linear(4, 128),
    nn.Tanh(),
    nn.Linear(128, 128),
    nn.Tanh(),
    nn.Linear(128, 128),
    nn.Tanh(),
    nn.Linear(128, 3)
)
NN_cur_output = NN_cur(r_theta_phi_t)

NN_pres = nn.Sequential(
    nn.Linear(4, 128),
    nn.Tanh(),
    nn.Linear(128, 128),
    nn.Tanh(),
    nn.Linear(128, 1)
)
NN_pres_output = NN_pres(r_theta_phi_t)

NN_para = nn.Sequential(
    nn.Linear(2, 128),
    nn.Tanh(),
    nn.Linear(128, 6)
)
NN_para_output = NN_para(theta_phi)

# final_output = torch.cat([NN_argo_output, NN_cur_output, NN_pres_output, NN_para_output], dim=1)
final_output = torch.cat([NN_argo(r_theta_phi_t), NN_cur(r_theta_phi_t), NN_pres(r_theta_phi_t), NN_para(theta_phi)], dim=1)

return final_output

net.apply_output_transform(modify_output)

model = dde.Model(data, net)`

hannanmustajab commented 10 months ago

@LeroyXiong Hey, did you manage to make it work ?

hannanmustajab commented 9 months ago

@lululxvi @AlirezaPa

Hey guys, I have been working on the 2D wave equation (2 Spatial + 1 Time). To get started, I took the basic case of homogenous velocity model with velocity 1. Forward model works pretty well with hard bc and ic. However, when I tried to provide it some data from fem at three time steps, let's say at t=0, t=1 and t=2s. it does not seem to work well. I took 7000 data points, at these three time steps. I did this because when I was trying to estimate C, it was not anywhere near the value. I thought I should first try to just add this data to my model, and see how it performs. In theory, it should have given better results when I provide it with some labeled training data.

From some debugging and digging into it, I think it might be because of the scale. As an example, one value in my PointSetBC dataset would be like x=2,y=2,t=2 and the corresponding Y value would be something like 7e-33, some values would be in scale of 2e-1 as well. Do you think this might be the problem, and if yes, then can you please provide me with some feedback on how I can fix this. Because if I am not able to do this, performing inversion is out of question. Looking forward to your help. Best, Hannan