isds-neu / PhyCRNet

Physics-informed convolutional-recurrent neural networks for solving spatiotemporal PDEs
MIT License

the role of the residual connection #2

Open dialuser opened 2 years ago

dialuser commented 2 years ago

Hi Paul,

I have a question regarding the role of the residual connection. In PhyCRNet, the temporal derivative is already incorporated in the loss term, so why do you still need the residual connection?

I guess each encoder-ConvLSTM-decoder block may be considered a U-Net. In that sense, the residual connection may function as a skip connection, right? In my case, the variable u has two components, say u1 and u2, and the PDE has the form $\partial u_1 / \partial t = \alpha \, \nabla \cdot (\nabla u_2) + \dots$

If I do `u = ut + dt*u`, it does not seem to work.

Many thanks.

paulpuren commented 2 years ago


Hello,

Thank you for your interest. First of all, the residual connection is inspired by the forward Euler scheme, in which the residual is easier for a neural network to learn. It also helps avoid vanishing gradients, as in deep learning problems in general (see Section 3.3 of [our paper]). The residual connection here helps the network learn the input-output mapping (i.e., u_t -> u_{t+1}). The temporal derivative used in the loss term, by contrast, belongs to the physics loss that constrains the network.
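
To make the forward-Euler analogy concrete, here is a minimal sketch (not the actual PhyCRNet code; the `F` block below is just a placeholder standing in for the encoder-ConvLSTM-decoder stack):

```python
import torch
import torch.nn as nn

class ResidualStep(nn.Module):
    """Sketch of the forward-Euler-style residual connection:
    u_{t+1} = u_t + dt * F(u_t), where F is the learned network."""

    def __init__(self, channels: int, dt: float):
        super().__init__()
        self.dt = dt
        # placeholder for the encoder -> ConvLSTM -> decoder pipeline
        self.F = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, u_t: torch.Tensor) -> torch.Tensor:
        # the network only has to learn the small residual dt * F(u_t),
        # and the identity path gives gradients a direct route backward
        return u_t + self.dt * self.F(u_t)
```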

Secondly, the residual connection in our paper acts as a skip connection, but it is not exactly the same as the U-Net structure. A U-Net has multi-scale skip connections with concatenation, and therefore more network parameters. Some works have investigated the potential of U-Net for scientific computing (e.g., the one listed below); however, we found it hard to train with scarce data and without any labeled data.

Towards physics-informed deep learning for turbulent flow prediction [Link]
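
For concreteness, here is a minimal sketch of the two kinds of skip connection (illustrative only; the function names are mine, not from either codebase):

```python
import torch

def residual_skip(x: torch.Tensor, fx: torch.Tensor) -> torch.Tensor:
    # PhyCRNet-style residual: element-wise addition at the same
    # resolution, adding no extra parameters
    return x + fx

def unet_skip(enc_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
    # U-Net-style skip: concatenate encoder features with decoder
    # features along channels; the following conv layer must then handle
    # twice the channels, which is where the extra parameters come from
    return torch.cat([enc_feat, dec_feat], dim=1)
```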

Thirdly, for your PDE case, if you still do `u = ut + dt*u`, that is fine, because you are just doing a forward propagation to learn the input-output mapping. You can still construct your own loss function to constrain the network.
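
As an illustration of such a physics loss, here is a sketch for a PDE of the form above, using a 5-point finite-difference Laplacian (the function name and the discretization choices are my assumptions for illustration, not the PhyCRNet implementation):

```python
import torch
import torch.nn.functional as F

# Hypothetical physics residual for a PDE of the form
#   du1/dt = alpha * div(grad(u2)) + ...
# u_t, u_next: tensors of shape (batch, 2, H, W) holding (u1, u2)
# at consecutive time steps; dt, dx, alpha are problem constants.
def physics_loss(u_t, u_next, dt, dx, alpha):
    u1_t, u2_t = u_t[:, 0:1], u_t[:, 1:2]
    u1_next = u_next[:, 0:1]

    # forward-difference temporal derivative of u1
    du1_dt = (u1_next - u1_t) / dt

    # 5-point Laplacian stencil approximating div(grad(u2))
    lap_kernel = torch.tensor(
        [[[[0.0, 1.0, 0.0],
           [1.0, -4.0, 1.0],
           [0.0, 1.0, 0.0]]]], device=u_t.device) / dx**2
    lap_u2 = F.conv2d(u2_t, lap_kernel, padding=1)

    # PDE residual; add the remaining "+ ..." terms of your equation here
    residual = du1_dt - alpha * lap_u2
    return (residual ** 2).mean()
```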

Overall, the loss function dominates the training, while the residual connection simply makes the training easier.