lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1

How to tune the hyperparameters to obtain the coefficients of the KdV equation #112

Closed · zxw4688 closed this issue 3 years ago

zxw4688 commented 3 years ago

Hello @lululxvi I revised the example diffusion_1D_inverse.py in DeepXDE so that it solves the inverse problem of the KdV equation. The KdV equation is:

[image]

My code runs, but it cannot recover the right value of B, while both A and C are recovered correctly. My code is:

```python
# -*- coding: utf-8 -*-
"""
Created on Mon Aug 24 00:12:44 2020

@author: Administrator
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    # Unknown PDE coefficients to be identified, with initial guesses
    A = tf.Variable(0.6)
    B = tf.Variable(0.2)
    C = tf.Variable(0.8)

    def pde(x, y):
        # x[:, 0:1] is space, x[:, 1:] is time
        dy_x = tf.gradients(y, x)[0]
        dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
        dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
        dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
        return (
            dy_t + A * y * dy_x + B * dy_xxx
            + C * (-0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
                   + 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
                   - tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]))
        )

    def func(x):
        # Exact solution u(x, t) = sin(x) sin(t)
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    bc = dde.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary)
    ic = dde.IC(geomtime, func, lambda _, on_initial: on_initial)

    # Observations of the solution at 21 points along t = 1
    observe_x = np.vstack((np.linspace(0, 1, num=21), np.full((21), 1))).T
    ptset = dde.bc.PointSet(observe_x)
    observe_y = dde.DirichletBC(
        geomtime, ptset.values_to_func(func(observe_x)), lambda x, _: ptset.inside(x)
    )

    data = dde.data.TimePDE(
        geomtime,
        pde,
        [bc, ic, observe_y],
        num_domain=400,
        num_boundary=20,
        num_initial=10,
        anchors=observe_x,
        solution=func,
        num_test=10000,
    )

    layer_size = [2] + [32] * 3 + [1]
    activation = "tanh"
    initializer = "Glorot uniform"
    net = dde.maps.FNN(layer_size, activation, initializer)

    model = dde.Model(data, net)

    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    variable = dde.callbacks.VariableValue([A, B, C], period=1000)
    losshistory, train_state = model.train(epochs=50000, callbacks=[variable])

    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()
```

zxw4688 commented 3 years ago

I have tried many times, tuning the initial values of A, B, and C as well as the numbers of hidden layers and neurons, but it did not work.

lululxvi commented 3 years ago

There are lots of possible reasons, e.g., is the network trained well? Are all loss terms small enough? Are there enough training points? Etc. See "Q: I failed to train the network or get the right solution, e.g., the training loss is large."

You may start from simple examples to gain some experience.
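
For reference, the per-term losses can be checked programmatically from the LossHistory object returned by model.train(); a minimal sketch reusing the names from the script above (loss_train is the list of per-step loss arrays recorded by DeepXDE, one component per loss term):

```python
import numpy as np

# Each entry of losshistory.loss_train holds one value per loss term
# (PDE residual, BC, IC, observations); inspect the last recorded step.
final_losses = np.array(losshistory.loss_train[-1])
print("final loss terms:", final_losses)
print("total training loss:", final_losses.sum())
```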

zxw4688 commented 3 years ago

Thank you very much! Judging from the results, the training loss and all the individual loss terms are small. The result is as follows: [image]

lululxvi commented 3 years ago

You can try using more training points. But it is hard to tell what is wrong without trying. Read the examples and learn from others in "Q: I failed to train the network or get the right solution, e.g., the training loss is large." in the FAQ.
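
As a concrete (illustrative, not prescribed) version of this suggestion, the sampling counts in the TimePDE definition above can simply be raised:

```python
# Denser sampling than the original 400/20/10; the exact counts are a guess
data = dde.data.TimePDE(
    geomtime,
    pde,
    [bc, ic, observe_y],
    num_domain=2000,
    num_boundary=100,
    num_initial=50,
    anchors=observe_x,
    solution=func,
    num_test=10000,
)
```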

zxw4688 commented 3 years ago

Thank you very much! I have tried adding more training points, but it still failed.

zxw4688 commented 3 years ago

Hello Dr. Lu! Thank you very much for your package; I really want to master it. When using it on the inverse problem of the KdV equation, I rewrote the KdV equation as:

[image]

That is, I now need DeepXDE to learn the coefficients of the KdV equation in vector form. I changed my earlier component-form code to this vector form, but it errors when run; could you please take a look? Also, after training I want to extract the learned equation coefficients, so I used C = model.get_VariableValue(), but that errors too. Dr. Lu, how should I change this? Thank you! My code is as follows:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    A0 = tf.Variable(0.01)
    # A1 = tf.Variable(0.8)
    # A2 = tf.Variable(0.01)
    # A3 = tf.Variable(0.8)
    A = tf.Variable([[0.01, 0.8, 0.01, 0.8]])

    def pde(x, y):
        dy_x = tf.gradients(y, x)[0]
        dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
        dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
        dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
        # This is the line that errors when run
        return (
            dy_t - tf.matmul(
                tf.constant([[1,
                              0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
                              - 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
                              + tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]),
                              dy_xxx,
                              y * dy_x]]),
                tf.transpose(A),
            )
        )

    def func(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])

    def funcy(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])  # + 0.01 * np.random.normal(0.0, 1.0, 1)

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    bc = dde.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary)
    ic = dde.IC(geomtime, func, lambda _, on_initial: on_initial)

    observe_x = np.vstack((np.linspace(0, 1, num=21), np.full((21), 1))).T
    ptset = dde.bc.PointSet(observe_x)
    observe_y = dde.DirichletBC(
        geomtime, ptset.values_to_func(funcy(observe_x)), lambda x, _: ptset.inside(x)
    )

    data = dde.data.TimePDE(
        geomtime,
        pde,
        [bc, ic, observe_y],
        num_domain=400,
        num_boundary=20,
        num_initial=10,
        anchors=observe_x,
        solution=func,
        num_test=10000,
    )

    layer_size = [2] + [32] * 3 + [1]
    activation = "tanh"
    initializer = "Glorot uniform"
    net = dde.maps.FNN(layer_size, activation, initializer)

    model = dde.Model(data, net)

    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    variable = dde.callbacks.VariableValue(A, period=1000)
    losshistory, train_state = model.train(epochs=5000, callbacks=[variable])
    C = model.get_VariableValue()  # this call also errors
    print(C)
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()
```
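
One plausible fix for both errors, sketched here as a suggestion rather than taken from the thread: tf.constant only accepts plain Python/NumPy values, so the row of PDE terms has to be assembled from the tensors themselves, e.g., with tf.concat; and the learned coefficients can be read from the VariableValue callback (its get_value method) rather than from the Model. This reuses A and model from the script above:

```python
def pde(x, y):
    dy_x = tf.gradients(y, x)[0]
    dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
    dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
    dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
    g = (0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
         - 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
         + tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]))
    # One row [1, g, u_xxx, u*u_x] per training point, shape (N, 4),
    # so the matmul with A^T yields one residual per point.
    terms = tf.concat([tf.ones_like(y), g, dy_xxx, y * dy_x], axis=1)
    return dy_t - tf.matmul(terms, tf.transpose(A))

# After training, the callback holds the latest coefficient values
variable = dde.callbacks.VariableValue(A, period=1000)
losshistory, train_state = model.train(epochs=5000, callbacks=[variable])
print(variable.get_value())
```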

lululxvi commented 3 years ago
zxw4688 commented 3 years ago

Hello @lululxvi Thank you very much for your reply. Another question: I would like to speed up the code with numba's jit, but when I add @jit in front of def main(), an error occurs. Can you tell me how to do this?

zxw4688 commented 3 years ago

The error is: KeyError: "Failed in object mode pipeline (step: inline calls to locally defined closures) ..."

lululxvi commented 3 years ago

The code uses TensorFlow, whose kernels are implemented in C++, so you don't need numba. If it is slow, try a GPU.

zxw4688 commented 3 years ago

Thank you very much! But I installed the CPU version of TensorFlow, not the GPU version. What should I do to speed it up? Thank you very much!
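
For context: with TensorFlow 1.x, which DeepXDE used at the time, the GPU build was shipped as the separate tensorflow-gpu pip package, and whether a GPU is actually visible can be checked with a one-liner:

```python
import tensorflow as tf

# True only if a CUDA-enabled TensorFlow build is installed and a usable
# GPU is present; with the CPU-only "tensorflow" package this prints False.
print(tf.test.is_gpu_available())
```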

zxw4688 commented 3 years ago

Hello @lululxvi I have used your package DeepXDE to learn the coefficients of the KdV PDE, and I have tuned all kinds of hyperparameters, but I cannot obtain a good result: the learned coefficient of the third derivative is very bad, nearly impossible to bring to -0.0025. I would like to know why, since other papers about PINNs get good results. What is happening here? My code is:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    A0 = tf.Variable(0.01)
    A1 = tf.Variable(-0.8)
    A2 = tf.Variable(-0.01)
    A3 = tf.Variable(-0.8)

    def pde(x, y):
        dy_x = tf.gradients(y, x)[0]
        dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
        dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
        dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
        return (
            dy_t - A0
            - A1 * (-0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
                    + 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
                    - tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]))
            - A2 * dy_xxx - A3 * y * dy_x
        )

    def func(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])

    def funcy(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])  # + 0.01 * np.random.normal(0.0, 1.0, 1)

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    bc = dde.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary)
    ic = dde.IC(geomtime, func, lambda _, on_initial: on_initial)

    observe_x = np.vstack((np.linspace(0, 1, num=21), np.full((21), 1))).T
    ptset = dde.bc.PointSet(observe_x)
    observe_y = dde.DirichletBC(
        geomtime, ptset.values_to_func(funcy(observe_x)), lambda x, _: ptset.inside(x)
    )

    data = dde.data.TimePDE(
        geomtime,
        pde,
        [bc, ic, observe_y],
        num_domain=1000,
        num_boundary=80,
        num_initial=30,
        anchors=observe_x,
        solution=func,
        num_test=10000,
    )

    layer_size = [2] + [50] * 6 + [1]
    activation = "tanh"
    initializer = "Glorot uniform"
    net = dde.maps.FNN(layer_size, activation, initializer)

    model = dde.Model(data, net)

    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    # variable = dde.callbacks.VariableValue([A0, A1, A2, A3], period=1000, filename="myvariables.dat")
    variable = dde.callbacks.VariableValue([A0, A1, A2, A3], period=1000)
    losshistory, train_state = model.train(epochs=50000, callbacks=[variable])
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()
```

lululxvi commented 3 years ago

Could you try fewer parameters first? Also read https://www.biorxiv.org/content/10.1101/865063v2 to gain some experience, e.g., setting up a range for each unknown variable.
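
One trick along those lines, sketched here as an assumption rather than the thread's prescription: keep the trainable variable at order one and fold the expected scale of the coefficient into the PDE, so gradient descent does not have to resolve a value as small as -0.0025 directly:

```python
# Hedged sketch: the 1e-3 factor is an assumed prior on the coefficient's
# magnitude. A2_raw trains at O(1), while A2 (the value used in the PDE
# residual in place of the original A2) stays O(1e-3).
A2_raw = tf.Variable(-1.0)
A2 = 1e-3 * A2_raw
```

The VariableValue callback would then track A2_raw, and its reported value is multiplied by 1e-3 to recover the physical coefficient.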

zxw4688 commented 3 years ago

Hello @lululxvi While using DeepXDE, I have found that the number of iterations does not always need to be large; 50,000 is not always necessary, and 10,000 may well be enough. Is it possible to decide the number of iterations from a given error tolerance (the size of the loss, or the error between the fitted function and the true values)? How do I set that up?

lululxvi commented 3 years ago

The number of iterations is a hyperparameter you need to select before training. Of course, you can do early stopping; see dde.callbacks.EarlyStopping in Burgers_RAR.py.
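
A minimal sketch of that, reusing the training call from the scripts above (the tolerance and patience values below are illustrative, not recommendations):

```python
# Stop training early if the training loss has not improved by at least
# min_delta within `patience` epochs.
early = dde.callbacks.EarlyStopping(min_delta=1e-6, patience=5000)
losshistory, train_state = model.train(epochs=50000, callbacks=[variable, early])
```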

zxw4688 commented 3 years ago

Hello @lululxvi In MATLAB's feedforwardnet, the Levenberg-Marquardt algorithm is much better than the other algorithms. So I would like to know why you do not use that algorithm, and how to add other algorithms, such as Levenberg-Marquardt, to your package. Thank you very much!

lululxvi commented 3 years ago

A method that works for traditional problems may not work well for neural networks. Gradient descent is almost the only optimization method used in deep learning. Of course, you can choose other methods, e.g., "L-BFGS-B" as used in Burgers.py. You can use any of the methods in https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize
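
The Burgers.py pattern referred to here is to train with Adam first and then recompile the same model with L-BFGS-B to refine; a minimal sketch:

```python
# Adam for the initial phase, then L-BFGS-B to polish (as in Burgers.py);
# with L-BFGS-B, train() takes no epochs argument and runs to convergence.
model.compile("adam", lr=0.001)
model.train(epochs=15000)
model.compile("L-BFGS-B")
losshistory, train_state = model.train()
```

The usual rationale is that Adam moves quickly away from a poor initialization, while the quasi-Newton L-BFGS-B converges faster once near a minimum.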

zxw4688 commented 3 years ago

What I meant is: in MATLAB, the LM algorithm applied to a fully connected feedforwardnet works very well; the relative L2 error is tiny, basically reaching 10^(-7). Why does your FNN, whether fitting a function or solving a PDE, reach a relative L2 error of at most about 10^(-3)? Since LM really is that good in MATLAB, I think the reason your package DeepXDE fails to learn the KdV coefficients is that the relative L2 error is too large. How do you explain this phenomenon, given MATLAB's LM algorithm? Also, I looked at "L-BFGS-B" used in Burgers.py, but it uses two algorithms, Adam and L-BFGS-B; how should I understand that? Thanks! [image]

zxw4688 commented 3 years ago

I am using MATLAB 2014. With the functions feedforwardnet, train, and sim I can basically do the data fitting, and the error between the fitted NN and the true values basically reaches 10^(-6), using only one hidden layer with 10 neurons and fewer than a thousand training iterations. Your package trains for 50,000 iterations and reaches at most about 10^(-3). Why is that?

lululxvi commented 3 years ago

Here are many things: