`STMsFNN` performs poorly on the 2D wave equation

Saransh-cpp commented 2 years ago

Greetings,

I have been trying to use the Spatio Temporal Multi-scale Fourier feature networks on the 2-dimensional wave equation, but the network performs poorly and is not able to pick up high-frequency details.

Plots

Losses and metric

Plot 1 (y, u, and t)

Plot 2 (x, u, and t)

Code

import numpy as np
import pandas as pd
import deepxde as dde
from deepxde.backend import tf
import matplotlib.pyplot as plt

c = 10  # wave equation constant
C = 4 / np.pi  # Fourier constant
n = 1
m = 1

# dimensions
a = 1
b = 1

def pde(x, u):
    u_tt = dde.grad.hessian(u, x, i=2, j=2)
    u_xx = dde.grad.hessian(u, x, i=0, j=0)
    u_yy = dde.grad.hessian(u, x, i=1, j=1)
    return u_tt - ((c ** 2) * (u_xx + u_yy))

def sol(x):
    return (
        C
        * np.sin(m * np.pi * x[:, 0:1] / a)
        * np.sin(n * np.pi * x[:, 1:2] / b)
        * np.cos(np.pi * c * x[:, 2:3])  # (m^2/a^2  +  n^2/b^2) = 1
    )

def boundary_init(x, _):
    return np.isclose(x[-1], 0)

def get_initial_loss(model):
    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    losshistory, train_state = model.train(0)
    return losshistory.loss_train[0]

spatial_domain = dde.geometry.Rectangle(xmin=[0, 0], xmax=[a, b])
temporal_domain = dde.geometry.TimeDomain(0, 1)
spatio_temporal_domain = dde.geometry.GeometryXTime(spatial_domain, temporal_domain)

d_bc = dde.DirichletBC(
    spatio_temporal_domain, lambda x: 0, lambda _, on_boundary: on_boundary
)
ic = dde.IC(
    spatio_temporal_domain,
    lambda x: np.sin(n * np.pi * x[:, 0:1] / a),
    lambda _, on_initial: on_initial,
)
ic_2 = dde.OperatorBC(
    spatio_temporal_domain,
    lambda x, u, _: dde.grad.jacobian(u, x, i=0, j=1),
    boundary_init,
)

data = dde.data.TimePDE(
    spatio_temporal_domain,
    pde,
    [d_bc, ic, ic_2],
    num_domain=360,
    num_boundary=360,
    num_initial=360,
    num_test=10000,
    solution=sol,
)

net = dde.nn.STMsFFN(
    [3] + [100] * 2 + [300] * 2 + [100] * 2 + [1],
    "tanh",
    "Glorot uniform",
    sigmas_x=[50, 100, 200],
    sigmas_t=[50, 100, 200],
)

net.apply_feature_transform(lambda x: x * 10)

model = dde.Model(data, net)
initial_losses = get_initial_loss(model)
loss_weights = 1 / (2 * initial_losses)
losshistory, train_state = model.train(0)
model.compile(
    "adam",
    lr=0.001,
    metrics=["l2 relative error"],
    decay=("inverse time", 2000, 0.9),
    loss_weights=loss_weights,
)
pde_residual_resampler = dde.callbacks.PDEResidualResampler(period=1)
losshistory, train_state = model.train(epochs=10000, callbacks=[pde_residual_resampler])

dde.saveplot(
    losshistory,
    train_state,
    issave=True,
    isplot=True,
    test_fname="../dat_data/wave_2D.dat",
)

Output

Initializing variables...
Training model...

Step      Train loss                                  Test loss                                   Test metric   
0         [1.56e+16, 1.66e-02, 5.13e-01, 1.94e+04]    [1.69e+16, 1.66e-02, 5.13e-01, 1.94e+04]    [1.04e+00]    

Best model at step 0:
  train loss: 1.56e+16
  test loss: 1.69e+16
  test metric: [1.04e+00]

'train' took 2.520825 s

Initializing variables...
Training model...

Step      Train loss                                  Test loss                                   Test metric   
0         [1.03e+16, 1.67e-02, 5.13e-01, 2.11e+04]    [1.15e+16, 1.67e-02, 5.13e-01, 2.11e+04]    [1.03e+00]    

Best model at step 0:
  train loss: 1.03e+16
  test loss: 1.15e+16
  test metric: [1.03e+00]

'train' took 0.701051 s

Compiling model...
'compile' took 4.639123 s

Initializing variables...
Training model...

Step      Train loss                                  Test loss                                   Test metric   
0         [5.22e-01, 4.74e-01, 4.94e-01, 7.63e-01]    [5.13e-01, 4.74e-01, 4.94e-01, 7.63e-01]    [1.04e+00]
1000      [1.60e-03, 3.35e-06, 8.20e-06, 8.25e-06]    [4.65e-03, 3.35e-06, 8.20e-06, 8.25e-06]    [9.72e-01]
2000      [5.19e-04, 1.03e-05, 2.95e-05, 3.05e-04]    [2.97e-03, 1.03e-05, 2.95e-05, 3.05e-04]    [9.71e-01]
3000      [2.87e-04, 2.07e-05, 6.73e-06, 1.10e-04]    [2.47e-03, 2.07e-05, 6.73e-06, 1.10e-04]    [9.71e-01]
4000      [1.86e-04, 8.60e-06, 4.09e-06, 5.88e-05]    [2.21e-03, 8.60e-06, 4.09e-06, 5.88e-05]    [9.70e-01]
5000      [1.38e-04, 2.99e-06, 2.24e-06, 5.13e-05]    [2.04e-03, 2.99e-06, 2.24e-06, 5.13e-05]    [9.69e-01]
6000      [1.00e-04, 6.62e-06, 7.22e-06, 1.32e-04]    [1.87e-03, 6.62e-06, 7.22e-06, 1.32e-04]    [9.69e-01]
7000      [7.77e-05, 1.98e-06, 2.22e-06, 5.13e-05]    [1.74e-03, 1.98e-06, 2.22e-06, 5.13e-05]    [9.68e-01]
8000      [6.52e-05, 8.01e-07, 4.67e-07, 9.45e-06]    [1.62e-03, 8.01e-07, 4.67e-07, 9.45e-06]    [9.68e-01]
9000      [4.76e-05, 1.40e-06, 4.03e-06, 5.21e-05]    [1.54e-03, 1.40e-06, 4.03e-06, 5.21e-05]    [9.68e-01]
10000     [3.88e-05, 1.34e-06, 1.10e-06, 2.17e-05]    [1.45e-03, 1.34e-06, 1.10e-06, 2.17e-05]    [9.67e-01]

Best model at step 10000:
  train loss: 6.29e-05
  test loss: 1.47e-03
  test metric: [9.67e-01]

I have tried varying every hyperparameter but I can't get the network to learn the solution accurately. Could someone please help me with this?

Thank you!

lululxvi commented 2 years ago

The initial loss is 1.03e+16, so it seems the scaling has an issue.

Saransh-cpp commented 2 years ago

Thank you for the suggestions, @lululxvi! I made some changes to my code and now it seems to do a bit better. But I still am not able to get the desired output -

import numpy as np
import pandas as pd
import deepxde as dde
from deepxde.backend import tf
import matplotlib.pyplot as plt

c = 10  # wave equation constant
C = 1  # Fourier constant
n = 1
m = 1

# dimensions
a = 1
b = 1

def pde(x, u):
    u_tt = dde.grad.hessian(u, x, i=2, j=2)
    u_xx = dde.grad.hessian(u, x, i=0, j=0)
    u_yy = dde.grad.hessian(u, x, i=1, j=1)
    return u_tt - ((c ** 2) * (u_xx + u_yy))

def sol(x):
    return (
        C
        * np.sin(m * np.pi * x[:, 0:1] / a)
        * np.sin(n * np.pi * x[:, 1:2] / b)
        * np.cos(np.sqrt(2) * np.pi * c * x[:, 2:3])  # (m^2/a^2  +  n^2/b^2) = 2
    )

def boundary_init(x, _):
    return np.isclose(x[-1], 0)

def get_initial_loss(model):
    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    losshistory, _ = model.train(0)
    return losshistory.loss_train[0]

spatial_domain = dde.geometry.Rectangle(xmin=[0, 0], xmax=[a, b])
temporal_domain = dde.geometry.TimeDomain(0, 1)
spatio_temporal_domain = dde.geometry.GeometryXTime(spatial_domain, temporal_domain)

d_bc = dde.DirichletBC(
    spatio_temporal_domain, lambda x: 0, lambda _, on_boundary: on_boundary
)
ic = dde.IC(
    spatio_temporal_domain,
    lambda x: np.sin(np.pi * x[:, 0:1]) * np.sin(np.pi * x[:, 1:2]),
    lambda _, on_initial: on_initial,
)
ic_2 = dde.OperatorBC(
    spatio_temporal_domain,
    lambda x, u, _: dde.grad.jacobian(u, x, i=0, j=1),
    boundary_init,
)

data = dde.data.TimePDE(
    spatio_temporal_domain,
    pde,
    [d_bc, ic, ic_2],
    num_domain=360,
    num_boundary=360,
    num_initial=360,
    num_test=10000,
    solution=sol,
)

net = dde.nn.STMsFFN(
    [3] + [100] * 3
    # + [300] * 2 + [100] * 2
    + [1],
    "tanh",
    "Glorot uniform",
    # sigmas_x=[100, 250, 500, 1000, 1250],
    # sigmas_t=[100, 250, 500, 1000, 1250, 1500, 2500],
    sigmas_x=[100, 150, 200, 250, 500,],
    sigmas_t=[100, 150, 200, 250, 500, 750, 1000],
)

net.apply_feature_transform(lambda x: x / 250)

model = dde.Model(data, net)
initial_losses = get_initial_loss(model)
loss_weights = 1 / (2 * initial_losses)
model.compile(
    "adam",
    lr=0.001,
    metrics=["l2 relative error"],
    # decay=("inverse time", 5000, 0.9),
    decay=("inverse time", 10000, 0.9),
    loss_weights=loss_weights,
)
pde_residual_resampler = dde.callbacks.PDEResidualResampler(period=1)
losshistory, train_state = model.train(epochs=20000, callbacks=[pde_residual_resampler], display_every=500)

dde.saveplot(
    losshistory,
    train_state,
    issave=True,
    isplot=True,
    test_fname="../dat_data/wave_2D.dat",
)

Initializing variables...
Training model...

Step      Train loss                                  Test loss                                   Test metric   
0         [4.71e+04, 5.06e-02, 3.57e-01, 1.34e-01]    [4.28e+04, 5.06e-02, 3.57e-01, 1.34e-01]    [1.12e+00]    

Best model at step 0:
  train loss: 4.71e+04
  test loss: 4.28e+04
  test metric: [1.12e+00]

'train' took 16.711520 s

Compiling model...
'compile' took 20.653095 s

Initializing variables...
Training model...

Step      Train loss                                  Test loss                                   Test metric   
0         [5.07e-01, 4.95e-01, 5.28e-01, 2.63e-01]    [3.97e-01, 4.95e-01, 5.28e-01, 2.63e-01]    [1.16e+00]    
500       [9.27e-03, 1.40e-02, 9.33e-02, 9.06e-04]    [1.34e-02, 1.40e-02, 9.33e-02, 9.06e-04]    [9.56e-01]    
1000      [8.77e-03, 6.90e-03, 6.58e-02, 1.55e-03]    [2.05e-02, 6.90e-03, 6.58e-02, 1.55e-03]    [9.50e-01]    
1500      [6.40e-03, 5.24e-03, 6.50e-02, 1.23e-03]    [1.53e-02, 5.24e-03, 6.50e-02, 1.23e-03]    [9.45e-01]    
2000      [5.25e-03, 4.68e-03, 6.53e-02, 8.69e-04]    [1.34e-02, 4.68e-03, 6.53e-02, 8.69e-04]    [9.41e-01]    
2500      [3.96e-03, 4.14e-03, 6.54e-02, 9.01e-04]    [1.14e-02, 4.14e-03, 6.54e-02, 9.01e-04]    [9.37e-01]    
3000      [3.17e-03, 4.16e-03, 6.54e-02, 7.84e-04]    [9.79e-03, 4.16e-03, 6.54e-02, 7.84e-04]    [9.35e-01]    
3500      [3.47e-03, 4.01e-03, 6.54e-02, 5.43e-04]    [8.94e-03, 4.01e-03, 6.54e-02, 5.43e-04]    [9.31e-01]    
4000      [3.30e-03, 3.79e-03, 6.46e-02, 1.22e-03]    [9.00e-03, 3.79e-03, 6.46e-02, 1.22e-03]    [9.29e-01]    
4500      [2.34e-03, 3.58e-03, 6.48e-02, 9.94e-04]    [8.36e-03, 3.58e-03, 6.48e-02, 9.94e-04]    [9.28e-01]    
5000      [2.71e-03, 3.59e-03, 6.50e-02, 8.52e-04]    [7.89e-03, 3.59e-03, 6.50e-02, 8.52e-04]    [9.27e-01]    
5500      [2.80e-03, 4.23e-03, 6.55e-02, 4.60e-04]    [7.78e-03, 4.23e-03, 6.55e-02, 4.60e-04]    [9.26e-01]    
6000      [2.95e-03, 3.33e-03, 6.46e-02, 1.12e-03]    [7.54e-03, 3.33e-03, 6.46e-02, 1.12e-03]    [9.25e-01]    
6500      [1.96e-03, 3.34e-03, 6.52e-02, 4.48e-04]    [6.78e-03, 3.34e-03, 6.52e-02, 4.48e-04]    [9.24e-01]    
7000      [1.92e-03, 2.93e-03, 6.49e-02, 8.08e-04]    [6.95e-03, 2.93e-03, 6.49e-02, 8.08e-04]    [9.22e-01]    
7500      [2.02e-03, 2.87e-03, 6.49e-02, 6.66e-04]    [6.71e-03, 2.87e-03, 6.49e-02, 6.66e-04]    [9.22e-01]    
8000      [1.83e-03, 2.66e-03, 6.51e-02, 4.94e-04]    [6.46e-03, 2.66e-03, 6.51e-02, 4.94e-04]    [9.21e-01]    
8500      [1.85e-03, 2.48e-03, 6.51e-02, 6.28e-04]    [5.92e-03, 2.48e-03, 6.51e-02, 6.28e-04]    [9.21e-01]    
9000      [1.81e-03, 2.50e-03, 6.52e-02, 5.48e-04]    [6.03e-03, 2.50e-03, 6.52e-02, 5.48e-04]    [9.19e-01]    
9500      [1.74e-03, 2.47e-03, 6.48e-02, 7.12e-04]    [5.59e-03, 2.47e-03, 6.48e-02, 7.12e-04]    [9.19e-01]    
10000     [1.59e-03, 2.25e-03, 6.51e-02, 5.53e-04]    [5.56e-03, 2.25e-03, 6.51e-02, 5.53e-04]    [9.19e-01]    
10500     [1.89e-03, 2.08e-03, 6.53e-02, 4.49e-04]    [5.96e-03, 2.08e-03, 6.53e-02, 4.49e-04]    [9.17e-01]    
11000     [1.43e-03, 2.08e-03, 6.54e-02, 3.63e-04]    [4.98e-03, 2.08e-03, 6.54e-02, 3.63e-04]    [9.16e-01]    
11500     [1.69e-03, 2.08e-03, 6.51e-02, 4.85e-04]    [4.96e-03, 2.08e-03, 6.51e-02, 4.85e-04]    [9.14e-01]    
12000     [1.78e-03, 2.32e-03, 6.51e-02, 5.95e-04]    [4.97e-03, 2.32e-03, 6.51e-02, 5.95e-04]    [9.13e-01]    
12500     [1.47e-03, 1.92e-03, 6.52e-02, 4.94e-04]    [4.56e-03, 1.92e-03, 6.52e-02, 4.94e-04]    [9.11e-01]    
13000     [1.37e-03, 2.04e-03, 6.43e-02, 1.53e-03]    [4.47e-03, 2.04e-03, 6.43e-02, 1.53e-03]    [9.10e-01]    
13500     [1.37e-03, 1.76e-03, 6.50e-02, 5.60e-04]    [4.51e-03, 1.76e-03, 6.50e-02, 5.60e-04]    [9.09e-01]    
14000     [1.07e-03, 1.87e-03, 6.50e-02, 6.77e-04]    [3.89e-03, 1.87e-03, 6.50e-02, 6.77e-04]    [9.07e-01]    
14500     [1.56e-03, 1.77e-03, 6.48e-02, 7.65e-04]    [3.78e-03, 1.77e-03, 6.48e-02, 7.65e-04]    [9.06e-01]    
15000     [1.33e-03, 1.89e-03, 6.49e-02, 7.50e-04]    [3.65e-03, 1.89e-03, 6.49e-02, 7.50e-04]    [9.05e-01]    
15500     [1.00e-03, 1.64e-03, 6.49e-02, 6.28e-04]    [3.59e-03, 1.64e-03, 6.49e-02, 6.28e-04]    [9.04e-01]    
16000     [1.12e-03, 1.57e-03, 6.48e-02, 7.62e-04]    [3.46e-03, 1.57e-03, 6.48e-02, 7.62e-04]    [9.04e-01]    
16500     [8.43e-04, 1.49e-03, 6.46e-02, 9.35e-04]    [3.89e-03, 1.49e-03, 6.46e-02, 9.35e-04]    [9.02e-01]
17000     [1.27e-03, 1.63e-03, 6.49e-02, 6.92e-04]    [3.28e-03, 1.63e-03, 6.49e-02, 6.92e-04]    [9.01e-01]
17500     [1.22e-03, 1.50e-03, 6.47e-02, 9.15e-04]    [3.20e-03, 1.50e-03, 6.47e-02, 9.15e-04]    [9.00e-01]
18000     [8.37e-04, 1.42e-03, 6.50e-02, 5.74e-04]    [3.12e-03, 1.42e-03, 6.50e-02, 5.74e-04]    [9.00e-01]
18500     [1.25e-03, 1.36e-03, 6.51e-02, 4.74e-04]    [3.25e-03, 1.36e-03, 6.51e-02, 4.74e-04]    [8.99e-01]
19000     [1.08e-03, 1.33e-03, 6.48e-02, 7.79e-04]    [3.04e-03, 1.33e-03, 6.48e-02, 7.79e-04]    [8.98e-01]
19500     [9.68e-04, 1.37e-03, 6.50e-02, 5.11e-04]    [2.81e-03, 1.37e-03, 6.50e-02, 5.11e-04]    [8.98e-01]
20000     [9.88e-04, 1.29e-03, 6.50e-02, 5.75e-04]    [2.71e-03, 1.29e-03, 6.50e-02, 5.75e-04]    [8.98e-01]

Best model at step 18000:
  train loss: 6.78e-02
  test loss: 7.01e-02
  test metric: [9.00e-01]

lululxvi commented 2 years ago

The loss seems still large. You have only trained for 20000 steps.

Have you tried the similar PDE but for a simpler solution (smoother)?

engsbk commented 2 years ago

Is it possible to use STMsFFN without an exact solution?

lululxvi commented 2 years ago

Is it possible to use STMsFFN without an exact solution?

Yes, the same usage as other networks.

hannanmustajab commented 9 months ago

@Saransh-cpp Hey, just wanted to add here. Incase any else faces this issue! In your code above, you didn't achieve desired results because of this line:

ic_2 = dde.OperatorBC(
    spatio_temporal_domain,
    lambda x, u, _: dde.grad.jacobian(u, x, i=0, j=1),
    boundary_init,
)

It should be lambda x, u, _: dde.grad.jacobian(u, x, i=0, j=2),

I was using operatorBC as used above, and spent almost 2 days and 100 experiments, to get stupid results. Since, this is a 2D case, your input vector would be [X,Y,T].

lululxvi / deepxde