lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1

How to tune the hyperparameters to obtain the coefficients of the KdV equation #112

Closed · zxw4688 closed this issue 3 years ago

zxw4688 commented 3 years ago

Hello @lululxvi I revised the example diffusion_1D_inverse.py in DeepXDE so that it solves the inverse problem of the KdV equation. The KdV equation is:

[image]

My code runs, but it cannot recover the right value of B, while both A and C are recovered correctly. My code is:

```python
# -*- coding: utf-8 -*-
"""
Created on Mon Aug 24 00:12:44 2020

@author: Administrator
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    # Unknown PDE coefficients to be identified, with initial guesses
    A = tf.Variable(0.6)
    B = tf.Variable(0.2)
    C = tf.Variable(0.8)

    def pde(x, y):
        # x[:, 0:1] is space, x[:, 1:] is time
        dy_x = tf.gradients(y, x)[0]
        dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
        dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
        dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
        return (
            dy_t + A * y * dy_x + B * dy_xxx
            + C * (-0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
                   + 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
                   - tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]))
        )

    def func(x):
        # Exact solution u(x, t) = sin(x) sin(t)
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    bc = dde.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary)
    ic = dde.IC(geomtime, func, lambda _, on_initial: on_initial)

    # Observations of the solution at 21 points along t = 1
    observe_x = np.vstack((np.linspace(0, 1, num=21), np.full((21), 1))).T
    ptset = dde.bc.PointSet(observe_x)
    observe_y = dde.DirichletBC(
        geomtime, ptset.values_to_func(func(observe_x)), lambda x, _: ptset.inside(x)
    )

    data = dde.data.TimePDE(
        geomtime,
        pde,
        [bc, ic, observe_y],
        num_domain=400,
        num_boundary=20,
        num_initial=10,
        anchors=observe_x,
        solution=func,
        num_test=10000,
    )

    layer_size = [2] + [32] * 3 + [1]
    activation = "tanh"
    initializer = "Glorot uniform"
    net = dde.maps.FNN(layer_size, activation, initializer)

    model = dde.Model(data, net)

    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    variable = dde.callbacks.VariableValue([A, B, C], period=1000)
    losshistory, train_state = model.train(epochs=50000, callbacks=[variable])

    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()
```

zxw4688 commented 3 years ago

I have tried many times, tuning the initial values of A, B, and C as well as the numbers of hidden layers and neurons, but it did not work.

lululxvi commented 3 years ago

There are lots of possible reasons, e.g., is the network trained well? Are all loss terms small enough? Are there enough training points? Etc. See "Q: I failed to train the network or get the right solution, e.g., the training loss is large."

You may start from simple examples to gain some experience.
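
For reference, the per-term losses can be checked programmatically from the LossHistory object returned by model.train(); a minimal sketch reusing the names from the script above (loss_train is the list of per-step loss arrays recorded by DeepXDE, one component per loss term):

```python
import numpy as np

# Each entry of losshistory.loss_train holds one value per loss term
# (PDE residual, BC, IC, observations); inspect the last recorded step.
final_losses = np.array(losshistory.loss_train[-1])
print("final loss terms:", final_losses)
print("total training loss:", final_losses.sum())
```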

zxw4688 commented 3 years ago

Thank you very much! Judging from the results, the training loss and all the individual loss terms are small. The result is as follows: [image]

lululxvi commented 3 years ago

You can try using more training points. But it is hard to tell what is wrong without trying. Read the examples and learn from others in "Q: I failed to train the network or get the right solution, e.g., the training loss is large." in the FAQ.
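
As a concrete (illustrative, not prescribed) version of this suggestion, the sampling counts in the TimePDE definition above can simply be raised:

```python
# Denser sampling than the original 400/20/10; the exact counts are a guess
data = dde.data.TimePDE(
    geomtime,
    pde,
    [bc, ic, observe_y],
    num_domain=2000,
    num_boundary=100,
    num_initial=50,
    anchors=observe_x,
    solution=func,
    num_test=10000,
)
```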

zxw4688 commented 3 years ago

Thank you very much! I have tried adding more training points, but it still failed.

zxw4688 commented 3 years ago

Hello Dr. Lu! Thank you very much for your package; I really want to master it. When using it on the inverse problem of the KdV equation, I rewrote the KdV equation as:

[image]

That is, I now need DeepXDE to learn the coefficients of the KdV equation in vector form. I changed my earlier component-form code to this vector form, but it errors when run; could you please take a look? Also, after training I want to extract the learned equation coefficients, so I used C = model.get_VariableValue(), but that errors too. Dr. Lu, how should I change this? Thank you! My code is as follows:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    A0 = tf.Variable(0.01)
    # A1 = tf.Variable(0.8)
    # A2 = tf.Variable(0.01)
    # A3 = tf.Variable(0.8)
    A = tf.Variable([[0.01, 0.8, 0.01, 0.8]])

    def pde(x, y):
        dy_x = tf.gradients(y, x)[0]
        dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
        dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
        dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
        # This is the line that errors when run
        return (
            dy_t - tf.matmul(
                tf.constant([[1,
                              0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
                              - 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
                              + tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]),
                              dy_xxx,
                              y * dy_x]]),
                tf.transpose(A),
            )
        )

    def func(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])

    def funcy(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])  # + 0.01 * np.random.normal(0.0, 1.0, 1)

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    bc = dde.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary)
    ic = dde.IC(geomtime, func, lambda _, on_initial: on_initial)

    observe_x = np.vstack((np.linspace(0, 1, num=21), np.full((21), 1))).T
    ptset = dde.bc.PointSet(observe_x)
    observe_y = dde.DirichletBC(
        geomtime, ptset.values_to_func(funcy(observe_x)), lambda x, _: ptset.inside(x)
    )

    data = dde.data.TimePDE(
        geomtime,
        pde,
        [bc, ic, observe_y],
        num_domain=400,
        num_boundary=20,
        num_initial=10,
        anchors=observe_x,
        solution=func,
        num_test=10000,
    )

    layer_size = [2] + [32] * 3 + [1]
    activation = "tanh"
    initializer = "Glorot uniform"
    net = dde.maps.FNN(layer_size, activation, initializer)

    model = dde.Model(data, net)

    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    variable = dde.callbacks.VariableValue(A, period=1000)
    losshistory, train_state = model.train(epochs=5000, callbacks=[variable])
    C = model.get_VariableValue()  # this call also errors
    print(C)
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()
```
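
One plausible fix for both errors, sketched here as a suggestion rather than taken from the thread: tf.constant only accepts plain Python/NumPy values, so the row of PDE terms has to be assembled from the tensors themselves, e.g., with tf.concat; and the learned coefficients can be read from the VariableValue callback (its get_value method) rather than from the Model. This reuses A and model from the script above:

```python
def pde(x, y):
    dy_x = tf.gradients(y, x)[0]
    dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
    dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
    dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
    g = (0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
         - 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
         + tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]))
    # One row [1, g, u_xxx, u*u_x] per training point, shape (N, 4),
    # so the matmul with A^T yields one residual per point.
    terms = tf.concat([tf.ones_like(y), g, dy_xxx, y * dy_x], axis=1)
    return dy_t - tf.matmul(terms, tf.transpose(A))

# After training, the callback holds the latest coefficient values
variable = dde.callbacks.VariableValue(A, period=1000)
losshistory, train_state = model.train(epochs=5000, callbacks=[variable])
print(variable.get_value())
```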

lululxvi commented 3 years ago
zxw4688 commented 3 years ago

Hello @lululxvi Thank you very much for your reply. Another question: I would like to speed up the code with numba's jit, but when I add @jit in front of def main(), an error occurs. Can you tell me how to do this?

zxw4688 commented 3 years ago

The error is: KeyError: "Failed in object mode pipeline (step: inline calls to locally defined closures) ..."

lululxvi commented 3 years ago

The code uses TensorFlow, whose kernels are implemented in C++, so you don't need numba. If it is slow, try a GPU.

zxw4688 commented 3 years ago

Thank you very much! But I installed the CPU version of TensorFlow, not the GPU version. What should I do to speed it up? Thank you very much!
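
For context: with TensorFlow 1.x, which DeepXDE used at the time, the GPU build was shipped as the separate tensorflow-gpu pip package, and whether a GPU is actually visible can be checked with a one-liner:

```python
import tensorflow as tf

# True only if a CUDA-enabled TensorFlow build is installed and a usable
# GPU is present; with the CPU-only "tensorflow" package this prints False.
print(tf.test.is_gpu_available())
```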

zxw4688 commented 3 years ago

Hello @lululxvi I have used your package DeepXDE to learn the coefficients of the KdV PDE, and I have tuned all kinds of hyperparameters, but I cannot obtain a good result: the learned coefficient of the third derivative is very bad, nearly impossible to bring to -0.0025. I would like to know why, since other papers about PINNs get good results. What is happening here? My code is:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    A0 = tf.Variable(0.01)
    A1 = tf.Variable(-0.8)
    A2 = tf.Variable(-0.01)
    A3 = tf.Variable(-0.8)

    def pde(x, y):
        dy_x = tf.gradients(y, x)[0]
        dy_x, dy_t = dy_x[:, 0:1], dy_x[:, 1:]
        dy_xx = tf.gradients(dy_x, x)[0][:, 0:1]
        dy_xxx = tf.gradients(dy_xx, x)[0][:, 0:1]
        return (
            dy_t - A0
            - A1 * (-0.5 * tf.sin(2 * x[:, 0:1]) * tf.sin(x[:, 1:]) ** 2
                    + 0.0025 * tf.cos(x[:, 0:1]) * tf.sin(x[:, 1:])
                    - tf.sin(x[:, 0:1]) * tf.cos(x[:, 1:]))
            - A2 * dy_xxx - A3 * y * dy_x
        )

    def func(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])

    def funcy(x):
        return np.sin(x[:, 0:1]) * np.sin(x[:, 1:])  # + 0.01 * np.random.normal(0.0, 1.0, 1)

    geom = dde.geometry.Interval(0, 1)
    timedomain = dde.geometry.TimeDomain(0, 1)
    geomtime = dde.geometry.GeometryXTime(geom, timedomain)

    bc = dde.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary)
    ic = dde.IC(geomtime, func, lambda _, on_initial: on_initial)

    observe_x = np.vstack((np.linspace(0, 1, num=21), np.full((21), 1))).T
    ptset = dde.bc.PointSet(observe_x)
    observe_y = dde.DirichletBC(
        geomtime, ptset.values_to_func(funcy(observe_x)), lambda x, _: ptset.inside(x)
    )

    data = dde.data.TimePDE(
        geomtime,
        pde,
        [bc, ic, observe_y],
        num_domain=1000,
        num_boundary=80,
        num_initial=30,
        anchors=observe_x,
        solution=func,
        num_test=10000,
    )

    layer_size = [2] + [50] * 6 + [1]
    activation = "tanh"
    initializer = "Glorot uniform"
    net = dde.maps.FNN(layer_size, activation, initializer)

    model = dde.Model(data, net)

    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    # variable = dde.callbacks.VariableValue([A0, A1, A2, A3], period=1000, filename="myvariables.dat")
    variable = dde.callbacks.VariableValue([A0, A1, A2, A3], period=1000)
    losshistory, train_state = model.train(epochs=50000, callbacks=[variable])
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()
```

lululxvi commented 3 years ago

Could you try fewer parameters first? Also read https://www.biorxiv.org/content/10.1101/865063v2 to gain some experience, e.g., setting up a range for each unknown variable.
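
One trick along those lines, sketched here as an assumption rather than the thread's prescription: keep the trainable variable at order one and fold the expected scale of the coefficient into the PDE, so gradient descent does not have to resolve a value as small as -0.0025 directly:

```python
# Hedged sketch: the 1e-3 factor is an assumed prior on the coefficient's
# magnitude. A2_raw trains at O(1), while A2 (the value used in the PDE
# residual in place of the original A2) stays O(1e-3).
A2_raw = tf.Variable(-1.0)
A2 = 1e-3 * A2_raw
```

The VariableValue callback would then track A2_raw, and its reported value is multiplied by 1e-3 to recover the physical coefficient.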

zxw4688 commented 3 years ago

Hello @lululxvi While using DeepXDE, I have found that the number of iterations does not always need to be large; 50,000 is not always necessary, and 10,000 may well be enough. Is it possible to decide the number of iterations from a given error tolerance (the size of the loss, or the error between the fitted function and the true values)? How do I set that up?

lululxvi commented 3 years ago

The number of iterations is a hyperparameter you need to select before training. Of course, you can do early stopping; see dde.callbacks.EarlyStopping in Burgers_RAR.py.
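
A minimal sketch of that, reusing the training call from the scripts above (the tolerance and patience values below are illustrative, not recommendations):

```python
# Stop training early if the training loss has not improved by at least
# min_delta within `patience` epochs.
early = dde.callbacks.EarlyStopping(min_delta=1e-6, patience=5000)
losshistory, train_state = model.train(epochs=50000, callbacks=[variable, early])
```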

zxw4688 commented 3 years ago

Hello @lululxvi In MATLAB's feedforwardnet, the Levenberg-Marquardt algorithm is much better than the other algorithms. So I would like to know why you do not use that algorithm, and how to add other algorithms, such as Levenberg-Marquardt, to your package. Thank you very much!

lululxvi commented 3 years ago

A method that works for traditional problems may not work well for neural networks. Gradient descent is almost the only optimization method used in deep learning. Of course, you can choose other methods, e.g., "L-BFGS-B" as used in Burgers.py. You can use any of the methods in https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize
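
The Burgers.py pattern referred to here is to train with Adam first and then recompile the same model with L-BFGS-B to refine; a minimal sketch:

```python
# Adam for the initial phase, then L-BFGS-B to polish (as in Burgers.py);
# with L-BFGS-B, train() takes no epochs argument and runs to convergence.
model.compile("adam", lr=0.001)
model.train(epochs=15000)
model.compile("L-BFGS-B")
losshistory, train_state = model.train()
```

The usual rationale is that Adam moves quickly away from a poor initialization, while the quasi-Newton L-BFGS-B converges faster once near a minimum.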

zxw4688 commented 3 years ago

What I meant is: in MATLAB, the LM algorithm applied to a fully connected feedforwardnet works very well; the relative L2 error is tiny, basically reaching 10^(-7). Why does your FNN, whether fitting a function or solving a PDE, reach a relative L2 error of at most about 10^(-3)? Since LM really is that good in MATLAB, I think the reason your package DeepXDE fails to learn the KdV coefficients is that the relative L2 error is too large. How do you explain this phenomenon, given MATLAB's LM algorithm? Also, I looked at "L-BFGS-B" used in Burgers.py, but it uses two algorithms, Adam and L-BFGS-B; how should I understand that? Thanks! [image]

zxw4688 commented 3 years ago

I am using MATLAB 2014. With the functions feedforwardnet, train, and sim I can basically do the data fitting, and the error between the fitted NN and the true values basically reaches 10^(-6), using only one hidden layer with 10 neurons and fewer than a thousand training iterations. Your package trains for 50,000 iterations and reaches at most about 10^(-3). Why is that?

lululxvi commented 3 years ago

Here are many things: