lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1

Adaptive weight #787

Open oneikunikun opened 2 years ago

oneikunikun commented 2 years ago

Hello, everyone. First of all, thanks to Lulu for providing such a great library! Can DeepXDE implement the adaptive weights described in this paper? -- https://www.sciencedirect.com/science/article/pii/S002199912100663X

In another article by the same authors -- https://www.sciencedirect.com/science/article/abs/pii/S0045782521002759 -- Section 4.3 treats a wave propagation problem that seems to be a tutorial case of DeepXDE, but the adaptive weights are not used there.

lululxvi commented 2 years ago

This is not a priority of DeepXDE, and I don't have enough time to do this. You are welcome to submit a PR.

oneikunikun commented 2 years ago

@lululxvi Why is there such a big difference between the plotted predictions on the training and test data in the wave_1d example?

np.savetxt("wave_1d1.txt", np.hstack((train_state.X_train, train_state.y_pred_train)))
np.savetxt("wave_1d2.txt", np.hstack((train_state.X_test, train_state.y_pred_test)))

The plot of wave_1d1.txt is completely inconsistent with the correct result, while the plot of wave_1d2.txt shows the correct result. [two figures attached in the original issue]

lululxvi commented 2 years ago

I guess it is due to your plotting method.
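For example, the points in train_state.X_train are scattered (randomly sampled), not on a regular grid, so reshaping them for imshow/contourf garbles the picture. A minimal layout-agnostic sketch (not from the example; the file name and column order follow the np.savetxt calls above):

import matplotlib.pyplot as plt
import numpy as np

# Columns of wave_1d1.txt: x, t, predicted u (see the savetxt call above).
# A scatter plot colored by the prediction works for any point layout.
data = np.loadtxt("wave_1d1.txt")
plt.scatter(data[:, 0], data[:, 1], c=data[:, 2], s=5, cmap="viridis")
plt.xlabel("x")
plt.ylabel("t")
plt.colorbar(label="u")
plt.show()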

oneikunikun commented 2 years ago

@lululxvi Hi, Lulu, is there a walkthrough of the wave_1d code anywhere? I want to ask what the following code snippets mean:

  1. Line 63. I have read the problem setup in the original paper, but I still don't understand the role of this step:
    net.apply_feature_transform(lambda x: (x - 0.5) * 2 * np.sqrt(3))
  2. Line 67. Unlike the other examples, this one first computes the initial losses; what is the purpose of this statement?
    loss_weights = 5 / initial_losses

    Looking forward to your reply, thank you! The complete code is attached below:

    
    """Backend supported: tensorflow.compat.v1

Implementation of the wave propagation example in paper https://arxiv.org/abs/2012.10047. References: https://github.com/PredictiveIntelligenceLab/MultiscalePINNs. """ import deepxde as dde import numpy as np

A = 2 C = 10

def get_initial_loss(model): model.compile("adam", lr=0.001, metrics=["l2 relative error"]) losshistory, train_state = model.train(0) return losshistory.loss_train[0]

def pde(x, y): dy_tt = dde.grad.hessian(y, x, i=1, j=1) dy_xx = dde.grad.hessian(y, x, i=0, j=0) return dy_tt - C * 2 dy_xx

def func(x): x, t = np.split(x, 2, axis=1) return np.sin(np.pi x) np.cos(C np.pi t) + np.sin(A np.pi x) np.cos( A C np.pi t )

geom = dde.geometry.Interval(0, 1) timedomain = dde.geometry.TimeDomain(0, 1) geomtime = dde.geometry.GeometryXTime(geom, timedomain)

bc = dde.icbc.DirichletBC(geomtime, func, lambda _, on_boundary: on_boundary) ic1 = dde.icbc.IC(geomtime, func, lambda , on_initial: on_initial)

do not use dde.NeumannBC here, since normal_derivative does not work with temporal coordinate.

ic2 = dde.icbc.OperatorBC( geomtime, lambda x, y, : dde.grad.jacobian(y, x, i=0, j=1), lambda x, _: np.isclose(x[1], 0), ) data = dde.data.TimePDE( geomtime, pde, [bc, ic_1, ic_2], num_domain=360, num_boundary=360, num_initial=360, solution=func, num_test=10000, )

layer_size = [2] + [100] 3 + [1] activation = "tanh" initializer = "Glorot uniform" net = dde.nn.STMsFFN( layer_size, activation, initializer, sigmas_x=[1], sigmas_t=[1, 10] ) net.apply_feature_transform(lambda x: (x - 0.5) 2 * np.sqrt(3))

model = dde.Model(data, net) initial_losses = get_initial_loss(model) loss_weights = 5 / initial_losses model.compile( "adam", lr=0.001, metrics=["l2 relative error"], loss_weights=loss_weights, decay=("inverse time", 2000, 0.9), ) pde_residual_resampler = dde.callbacks.PDEResidualResampler(period=1) losshistory, train_state = model.train( epochs=10000, callbacks=[pde_residual_resampler], display_every=500 )

dde.saveplot(losshistory, train_state, issave=True, isplot=True)

lululxvi commented 2 years ago

This example was contributed by @smao-astro. He may be able to answer your questions.

smao-astro commented 2 years ago

Hi @oneikunikun ,

It's been a year since I implemented this example, and I have not been following deepxde's updates since then, so I might be unable to help you with any compatibility issues.

However, I can give you some hints on your questions:

Line 63. I have read the problem setup in the original paper, but I still don't understand the role of this step:

net.apply_feature_transform(lambda x: (x - 0.5) * 2 * np.sqrt(3))

The line above normalizes the inputs to the network via a feature transform layer: it maps the domain [0, 1] (for both x and t) to [-sqrt(3), sqrt(3)], so that uniformly sampled inputs have zero mean and unit variance. It should have the same effect as the lines below in https://github.com/PredictiveIntelligenceLab/MultiscalePINNs:

https://github.com/PredictiveIntelligenceLab/MultiscalePINNs/blob/ba7d6bb8af6cabe348def80bed72110f5f0e3621/wave1D/wave_models_tf.py#L949

https://github.com/PredictiveIntelligenceLab/MultiscalePINNs/blob/ba7d6bb8af6cabe348def80bed72110f5f0e3621/wave1D/wave_models_tf.py#L1036

https://github.com/PredictiveIntelligenceLab/MultiscalePINNs/blob/ba7d6bb8af6cabe348def80bed72110f5f0e3621/wave1D/wave_models_tf.py#L1043
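A quick way to see the effect (a minimal sketch, not part of the original example): a uniform distribution on [-sqrt(3), sqrt(3)] has variance (2*sqrt(3))^2 / 12 = 1.

import numpy as np

# Sample (x, t) uniformly in [0, 1]^2, as in the wave_1d domain,
# and apply the feature transform from the example.
x = np.random.rand(100000, 2)
z = (x - 0.5) * 2 * np.sqrt(3)
print(z.mean(axis=0))  # ~ [0, 0]: zero mean
print(z.var(axis=0))   # ~ [1, 1]: unit variance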

Line 67. Unlike the other examples, this one first computes the initial losses; what is the purpose of this statement?

This line of code balances the gradients originating from the various loss terms: the weights are proportional to the inverses of the initial loss values, so every weighted term starts at the same magnitude. It can serve as a replacement for the NTK-based weighting method from the works you are referring to.
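As a toy illustration (with hypothetical loss values, not taken from the example), every weighted term starts at the same value 5, so no single term dominates the early gradients:

import numpy as np

# Hypothetical initial losses for the PDE residual, BC, IC1, IC2 terms,
# differing by orders of magnitude.
initial_losses = np.array([2.0e1, 3.0e-2, 5.0e-1, 8.0e-3])
loss_weights = 5 / initial_losses
print(loss_weights * initial_losses)  # [5. 5. 5. 5.]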

oneikunikun commented 2 years ago

@smao-astro, thank you for your timely reply; it is very helpful to me. So the approach of the original wave_1d case is not fully reproduced in DeepXDE? The two implementations differ in some places, but the effects are similar. If I want to understand it in depth, do I still need to look at the code of the original article?

smao-astro commented 2 years ago

@smao-astro, thank you for your timely reply; it is very helpful to me. So the approach of the original wave_1d case is not fully reproduced in DeepXDE? The two implementations differ in some places, but the effects are similar. If I want to understand it in depth, do I still need to look at the code of the original article?

Kind of... As far as I can remember, I only implemented the multi-scale Fourier feature networks in DeepXDE; I did not contribute an implementation of NTK weighting. There might also be some differences in hyperparameters between my implementation and theirs.

If you are interested in the NTK-weighting method, you might want to look at their code, which is public.

chenyv118 commented 1 year ago

Hello, everyone. First of all, thanks to Lulu for providing such a great library! Can DeepXDE implement the adaptive weights described in this paper? -- https://www.sciencedirect.com/science/article/pii/S002199912100663X

In another article by the same authors -- https://www.sciencedirect.com/science/article/abs/pii/S0045782521002759 -- Section 4.3 treats a wave propagation problem that seems to be a tutorial case of DeepXDE, but the adaptive weights are not used there.

Hello, like-minded friend. Did the methods in this paper help you in your work? My topic is solving an inverse problem with PINNs to estimate unknown parameters in ODEs, and weight adjustment is also very important in my problem: different weight choices significantly affect the final result, yet after trying many methods I still can't find the right combination of weights. Is it possible to select a good combination of weights efficiently using the NTK method? I don't understand the paper you shared or its code very well, so I won't continue to spend time on it if it doesn't work well.