Forward problem for lorenz ode

xizhongchen commented 4 years ago

Hi, Lu, thanks for the great work. I try to learn the examples and the code. To start, I modified the lorenz_inverse.py to be forward prolem. However, I cannot get the correct predictions for this simple ode system. I searched your answers to the issues here and tried serveral things, i.e. scale the solution, modified the loss_weight, increase the epochs. Still it is not working. Could you please have a look and tell me how to solve? Many thanks.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import deepxde as dde

def gen_traindata():  #get the exp data on t and y1,y2,y3 xzchen
    data = np.load("dataset/Lorenz.npz")#because inverse problem xzchen
    return data["t"], data["y"]

def main():
    C1 = 10. #constant to be estimated 
    C2 = 15.
    C3 = 8./3.

    def Lorenz_system(x, y): #pde system xzchen
        """Lorenz system.
        dy1/dx = 10 * (y2 - y1)
        dy2/dx = y1 * (15- y3) - y2
        dy3/dx = y1 * y2 - 8/3 * y3
        """
        y1, y2, y3 = y[:, 0:1], y[:, 1:2], y[:, 2:]
        dy1_x = tf.gradients(y1, x)[0]
        dy2_x = tf.gradients(y2, x)[0]
        dy3_x = tf.gradients(y3, x)[0]
        return [
            dy1_x - C1 * (y2 - y1),
            dy2_x - y1 * (C2 - y3) + y2,
            dy3_x - y1 * y2 + C3 * y3,
        ]

    def boundary(_, on_initial):
        return on_initial

    geom = dde.geometry.TimeDomain(0, 3) #time start from 0 to 3 

    # Initial conditions
    ic1 = dde.IC(geom, lambda X: -8 * np.ones(X.shape), boundary, component=0)
    ic2 = dde.IC(geom, lambda X: 7 * np.ones(X.shape), boundary, component=1)
    ic3 = dde.IC(geom, lambda X: 27 * np.ones(X.shape), boundary, component=2)

    # Get the train data
    observe_t, ob_y = gen_traindata()

    data = dde.data.PDE(
        geom,
        Lorenz_system,
        [ic1, ic2, ic3],
        num_domain=400,
        num_boundary=10,
        num_test=100,
    )

    net = dde.maps.FNN([1] + [40] * 3 + [3], "tanh", "Glorot uniform")
    #net.outputs_modify(lambda x, y: y * 10) #scale y  

    model = dde.Model(data, net)
    model.compile("adam", lr=0.001)
    #model.compile("adam", lr=0.001,  loss_weights=[1e-1,1e-2,1e-1,1, 1, 1e-1])

    losshistory, train_state = model.train(epochs=60000)

    X = geom.uniform_points(100)
    y_pred = model.predict(X)
    #y_dim = y_pred.shape[1]
    S_pred, I_pred, R_pred = y_pred[:, 0:1], y_pred[:, 1:2], y_pred[:, 2:]
    S_ob, I_ob, R_ob = ob_y[:, 0:1], ob_y[:, 1:2], ob_y[:, 2:] 
    plt.figure()
    plt.plot(X, S_pred)
    plt.plot(observe_t, S_ob, "o")    
    plt.plot(X, I_pred)
    plt.plot(X, R_pred)        
    plt.show()
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)

if __name__ == "__main__":
    main()

lululxvi commented 4 years ago

Could you try a smaller time, e.g., from 0 to 1?

xizhongchen commented 4 years ago

It works for smaller time, (0 to 1) ! Why and how to make it work for 0~3?

lululxvi commented 4 years ago

The training sometimes could be tricky. When the time domain is small, it is easy to train. But the training becomes hard for long time prediction, and hyperparameters have to be carefully tuned.

On the other hand, for inverse problems, because we have extra points for the solution, which will make the training easier, although we have unknown coefficients. That is why in this case the inverse problem is easier to solve than the forward problem.

xizhongchen commented 4 years ago

Thanks for the reply. Do you have any suggestions for tuning the hyperparameters. In fact, I also fail in predicting another ode system and thus turn to see how it works for the simple lorenz system. Is there a method that make time dimensionless and scale back later? Or seperate it into serveral small time interverals, I mean predict later time based on the previous time like the time advance Euler scheme?

lululxvi commented 4 years ago

I don't have a rule for the hyperparameter tuning. Each case is different. Besides the common hyperparameters, other things you can consider:

Loss weight of each loss term. If there are four losses in total (3 PDE losses, one IC loss), you can use the following code to specify that the total loss is a weighted summation of each loss, and the weight is [1, 1, 1, 100]
```
model.compile(..., loss_weights=[1, 1, 1, 100])
```
Training points distribution. Try the residual-based adaptive refinement in the paper.

xizhongchen commented 4 years ago

I did try to adjust the loss weight as you can see in the above code. As you mentioned in one of the issues thread that good to check the train loss at step 0. It is good to scale the loss to be at the same order? Any reason why you give 100 for IC here? I will try the residual-based adaptive refinemen

lululxvi commented 4 years ago

I just use 100 as an example. Usually the IC/BC requires a larger weight than PDE.

JuntaoHuang commented 3 years ago

I just use 100 as an example. Usually the IC/BC requires a larger weight than PDE.

I also notice this phenomenon numerically. Is there any theoretical explanation for this phenomenon?

lululxvi commented 3 years ago

See https://arxiv.org/abs/2001.04536

g-w1 commented 1 year ago

@xizhongchen you could try to create a causal structure to the NN as shown in https://arxiv.org/pdf/2203.07404.pdf. In that paper they solve the lorenz attractor.

Or even just solving the interval from [0,1] and then expanding the domain gradually may work also.

lululxvi / deepxde

Forward problem for lorenz ode #41