boschresearch / torchphysics

https://boschresearch.github.io/torchphysics/
Apache License 2.0

DeepONet support of multiple outputs #28

Open dialuser opened 1 year ago

dialuser commented 1 year ago

Hi,

In my use case, I'd like to produce outputs at multiple sensor locations, and I wonder whether this is supported by TorchPhysics' DeepONet implementation. The output variable would have dimensions [T, L], where T corresponds to the times defined in the Trunknet coordinates and L is the number of sensors. Thank you.

A.

TomF98 commented 1 year ago

Hi,

this is possible, and there are at least two options to implement it. The first is to simply extend the dimension of the output space. In the example I uploaded, the output space is one-dimensional, but now we want a higher-dimensional space:

import torchphysics as tp

L = 10  # number of sensors (set this to your value)

# old code:
# F = tp.spaces.R1('f') # one-dimensional output variable
# higher dimensional output:
F = tp.spaces.Rn('f', n=L) # output space of dimension L, named 'f'

This should create exactly what you want. But depending on L, the above approach leads to rather large output layers in the Trunk- and Branchnet. Another option would be to add a space variable as an input to the Trunknet: you still get a 1D output at the end, but the output depends on the location, so the different sensors are included indirectly. For that, just change the input space:

# old code:
# T = tp.spaces.R1('t') # input variable: time only
# higher dimensional input:
T = tp.spaces.R1('t') * tp.spaces.R2('x') # input variables: time and space
# (choose the spatial dimension, here R2, to fit your problem)

Which option will work better for your problem, I can't say. Of course, in both cases you have to supply the training data in a fitting shape, but that should be straightforward from the example.
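To make the shapes concrete, here is a minimal pure-PyTorch sketch of the two options (this is not the TorchPhysics API; all class and variable names are illustrative). It follows the standard DeepONet construction, where the output is the dot product of P branch coefficients and P trunk basis values:

import torch
import torch.nn as nn

P = 32  # number of latent basis functions (illustrative)

class Branch(nn.Module):
    # maps the sampled input function u to P coefficients
    def __init__(self, n_sensors_in):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_sensors_in, 64), nn.Tanh(), nn.Linear(64, P))
    def forward(self, u):   # u: [batch, n_sensors_in]
        return self.net(u)  # -> [batch, P]

# Option 1: vector-valued output. The trunk produces P basis values for
# each of the L output sensors, so its output layer has P * L neurons.
class TrunkVector(nn.Module):
    def __init__(self, L):
        super().__init__()
        self.L = L
        self.net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, P * L))
    def forward(self, t):                       # t: [n_times, 1]
        return self.net(t).view(-1, P, self.L)  # -> [n_times, P, L]

# Option 2: scalar output, but the trunk also receives the sensor
# location x, so one small network covers all (t, x) query points.
class TrunkCoord(nn.Module):
    def __init__(self, space_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1 + space_dim, 64), nn.Tanh(), nn.Linear(64, P))
    def forward(self, tx):   # tx: [n_points, 1 + space_dim]
        return self.net(tx)  # -> [n_points, P]

b = Branch(n_sensors_in=100)(torch.rand(8, 100))  # 8 input functions -> [8, P]

out1 = torch.einsum('bp,tpl->btl', b, TrunkVector(L=5)(torch.rand(20, 1)))
print(out1.shape)  # [8, 20, 5]: 20 times, 5 sensors -> the [T, L] output

out2 = torch.einsum('bp,np->bn', b, TrunkCoord()(torch.rand(40, 3)))
print(out2.shape)  # [8, 40]: one scalar per (t, x) query point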

dialuser commented 1 year ago

@TomF98,

I tested your first option, and it worked. But as you said, it has a scalability issue when the number of output locations is large. For the second option, I assume training is still in single-output mode, but the spatial coordinate embedding would inform the Trunknet of the sensor location, right?

Thanks, A

TomF98 commented 1 year ago

I assume training is still in single-output mode, but the spatial coordinate embedding would inform the Trunknet of the sensor location, right?

Correct, just extend the input space of the Trunknet to include the space variable (the sensor location) and it should work; see the second code snippet in my previous answer. In the best case this approach could even lead to some kind of interpolation between your sensor locations, but this depends on the type of functions you want to learn and the number of available sensors.
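Continuing the illustrative PyTorch sketch from my previous answer (same hypothetical TrunkCoord class and branch output b, both assumed trained), such interpolation is just an evaluation of the networks at coordinates never seen during training:

# query one unseen sensor location x_new over 50 time points
trunk_coord = TrunkCoord()                          # assumed trained
t_query = torch.linspace(0., 1., 50).unsqueeze(1)   # [50, 1] query times
x_new = torch.tensor([[0.3, 0.7]]).expand(50, 2)    # [50, 2] unseen location
tx = torch.cat([t_query, x_new], dim=1)             # [50, 3]
pred = torch.einsum('bp,np->bn', b, trunk_coord(tx))  # [8, 50] time series at x_new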

dialuser commented 1 year ago

Hi @TomF98

Your last comment is really intriguing, because that's what I ultimately would like to do, i.e., interpolate to ungauged locations. But the question is how to model the inter-sensor relations using a DeepONet. Is it possible to do some sort of graph convolutional network within the TorchPhysics framework? Thanks.


TomF98 commented 1 year ago

Your last comment is really intriguing, because that's what I ultimately would like to do, i.e., interpolate to ungauged locations. But the question is how to model the inter-sensor relations using a DeepONet. Is it possible to do some sort of graph convolutional network within the TorchPhysics framework? Thanks.

Generally, arbitrary network structures can be used in TorchPhysics; you would just have to create a subclass of the Trunk- or Branchnet class. In particular, all neural networks that are possible in PyTorch should be easy to implement. But what works best in your case heavily depends on your dataset and functions.

I would first keep it simple, try out the method mentioned above (using the space variable as an input to the Trunknet) and see how the DeepONet behaves. If a lot of sensor locations are available, this may already lead to a good interpolation. If your underlying differential equation is known, I would, as a next step, include the physics in the training. The physics would be trained on the whole domain, not only at the sensors. This should propagate the knowledge from the sensor locations and, in the best case, lead to the desired interpolation.
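As a generic illustration of such a physics loss (plain PyTorch autograd, not the TorchPhysics conditions API; the 1D heat equation u_t = u_xx is only a stand-in for whatever PDE applies), the residual is evaluated at collocation points sampled from the whole domain and added to the data loss:

import torch

def physics_loss(model, t, x):
    # PDE residual of the stand-in equation u_t = u_xx, evaluated at
    # collocation points (t, x) sampled from the whole domain
    t = t.clone().requires_grad_(True)
    x = x.clone().requires_grad_(True)
    u = model(torch.cat([t, x], dim=1))  # model: any network taking (t, x)
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return ((u_t - u_xx) ** 2).mean()

# total loss = misfit at the known sensors + weighted PDE residual:
# loss = data_loss + lambda_pde * physics_loss(model, t_col, x_col)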

dialuser commented 1 year ago

Hi @TomF98,

I made the space+time approach work, but in the course of doing so I discovered that the trained DeepONet had a severe overfitting problem. Although this is a common issue with feed-forward neural networks, I wonder if you have any specific advice in the context of DeepONet training.

A.

TomF98 commented 1 year ago

Hi @dialuser,

great to hear that the space+time approach works in general. I assume that you already know about the general techniques to prevent overfitting (like dropout, early stopping, data augmentation, ...). But if I understand your problem description correctly, your known data locations may just be too sparse to get a good global representation, even if one prevents the overfitting. In this case, one could try to add a physics loss, if your PDE is known, as I mentioned in a previous answer. The physics could then be used inside the whole domain and should help to extrapolate the known sensor-location data. For PINNs we have this example, where this is applied to a rather simple case.

Similarly, this can be implemented for DeepONets. But the current PIDeepONetCondition only works for a single DeepONet and needs to be slightly modified to work like the above PINN example. For this, one needs to create a new class that inherits from PIDeepONetCondition, takes two networks (for the solution and the unknown data function) as input, and evaluates both in the forward method. If you need support for this implementation, just let me know.
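A hedged skeleton of such a subclass might look as follows (the base-class constructor and forward signatures are assumptions here and should be checked against the TorchPhysics source; only the class name PIDeepONetCondition comes from the library):

import torchphysics as tp

class TwoNetworkPICondition(tp.conditions.PIDeepONetCondition):
    # holds a second network for the unknown data function alongside the
    # solution DeepONet; all base-class arguments are passed through
    # unchanged, since the exact signature should be taken from the source
    def __init__(self, solution_net, data_net, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.solution_net = solution_net
        self.data_net = data_net

    def forward(self, *args, **kwargs):
        # evaluate both networks here and combine them in the residual,
        # analogous to the PINN example mentioned above
        raise NotImplementedError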

dialuser commented 1 year ago

Hi @TomF98,

I discovered a data-processing error that caused a misalignment in my data. Now everything looks much better. Thank you.

A [Edit]: Well, I celebrated too early. I have two test datasets: test set 1 includes the same sensor locations as training but different times, while test set 2 includes different sensor locations. When I wrote "much better" above, that was on test set 1. I found out later that test set 2 performed pretty badly. This is as you suggested: my sensor locations are too sparse. Unfortunately, I don't have a PDE for this problem.