aai-institute / continuiti

Learning function operators with neural networks.
GNU Lesser General Public License v3.0

Neural Operator Shapes Mismatch #96

Closed JakobEliasWagner closed 5 months ago

JakobEliasWagner commented 5 months ago

Description:

I've encountered a problem with the NeuralOperator architecture where it fails to produce correctly shaped outputs in cases where $dim(U) \neq dim(V)$ or $dim(Y) \neq dim(V)$. This issue seems to persist across various configurations and has been consistently reproducible in my tests.

Environment:

Expected Behavior:

The expected behavior is that the output shapes of the NeuralOperator architecture adjust appropriately to accommodate differences in dimension sizes among $dim(Y)$, $dim(V)$, and $dim(U)$, ensuring that the output dimensions are correct based on the input and internal architecture specifications with DatasetShapes.

Actual Behavior:

When $dim(U) \neq dim(V)$ or $dim(Y) \neq dim(V)$, the output shape produced by the NeuralOperator does not match the expected dimension $dim(V)$. Specifically, an assertion error is raised, failing the test. Disabling this assertion does not yield the expected results either.
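To make the failure concrete, here is a runnable toy stand-in for the problem (a sketch with plain NumPy; `ToyOperator` and its strict shape check are hypothetical illustrations, not the continuiti API). Every layer is built from the same dataset shapes, so the second layer still expects $dim(U) = 1$ even though the first layer already produced $dim(V) = 2$ channels:

```python
import numpy as np

class ToyOperator:
    """Hypothetical stand-in for an operator layer: a pointwise linear map
    from u_dim channels to v_dim channels, with the shapes fixed at
    construction time the way dataset.shapes fixes them."""

    def __init__(self, u_dim, v_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.u_dim = u_dim
        self.kernel = rng.normal(size=(u_dim, v_dim))

    def __call__(self, u):
        # Mimic the strict shape assertion that fails in NeuralOperator.
        assert u.shape[-1] == self.u_dim, (
            f"expected last dim {self.u_dim}, got {u.shape[-1]}"
        )
        return u @ self.kernel  # (batch, points, v_dim)

# All three layers built from the same dataset shapes: dim(U)=1, dim(V)=2.
layers = [ToyOperator(u_dim=1, v_dim=2) for _ in range(3)]

u = np.zeros((1, 10, 1))
v = layers[0](u)            # ok: output has shape (1, 10, 2)
failed = False
try:
    v = layers[1](v)        # fails: layer 2 still expects last dim 1
except AssertionError as exc:
    failed = True
    print("shape mismatch:", exc)
```

The first call succeeds and changes the channel dimension from 1 to 2; the second call then trips the shape check, which is the same chaining problem described above.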

@MLuchmann I appreciate any guidance or insights on this issue. Thank you for your support!

MLuchmann commented 5 months ago

Hi @JakobEliasWagner,

thanks for pointing out this issue. I had a quick look, and it seems to be a general problem with how we construct the currently implemented operators; I believe it affects all of them. I quickly replaced the IntegralKernel operator with a DeepONet operator, and the same problem appears.

Issue

Each individual operator in the NeuralOperator class is given the argument shapes, which is extracted from the dataset via dataset.shapes. However, this shape parameter has to be different for the inner operators compared to the first and last operators. An example (pseudocode):


```python
x, y, u, v = dataset[:]
x.shape = (1, 10, 1)
y.shape = (1, 10, 1)
u.shape = (1, 10, 1)
v.shape = (1, 10, 2)

shapes1 = shapes2 = shapes3 = dataset.shapes
operator1 = IntegralKernel(shapes1)
operator2 = IntegralKernel(shapes2)
operator3 = IntegralKernel(shapes3)

v1 = operator1(x=x, u=u, y=x)   # -> shape = (1, 10, 2)
v2 = operator2(x=x, u=v1, y=x)  # fails: expects u.shape (1, 10, 1) but receives (1, 10, 2)
v = operator3(x=x, u=v2, y=y)
```
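One way the chaining could work is to give only the first layer the dataset's $dim(U)$ and only the last layer the dataset's $dim(V)$, with inner layers mapping a hidden channel width onto itself. A minimal sketch of that idea (again with a hypothetical `ToyOperator`, not the continuiti classes; `width` is an assumed hyperparameter):

```python
import numpy as np

class ToyOperator:
    """Hypothetical pointwise linear layer mapping in_dim to out_dim channels."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.in_dim = in_dim
        self.kernel = rng.normal(size=(in_dim, out_dim))

    def __call__(self, u):
        assert u.shape[-1] == self.in_dim
        return u @ self.kernel

u_dim, v_dim, width = 1, 2, 8  # dim(U), dim(V), hidden channel width

layers = (
    [ToyOperator(u_dim, width)]    # first layer:  dim(U) -> width
    + [ToyOperator(width, width)]  # inner layer:  width  -> width
    + [ToyOperator(width, v_dim)]  # last layer:   width  -> dim(V)
)

v = np.zeros((1, 10, u_dim))
for layer in layers:
    v = layer(v)

print(v.shape)  # (1, 10, 2) == (batch, points, dim(V))
```

With per-layer shapes the chain composes without any assertion failure, and the final output has the expected last dimension $dim(V)$.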

Proposed solutions:

@JakobEliasWagner, @samuelburbulla What do you think about this? I hope it is clear what I mean.

JakobEliasWagner commented 5 months ago

@MLuchmann thank you for having a look. Sorry that I tagged you in this; I only now noticed that you did not change the NeuralOperator architecture. I agree that binding all operators to the shape of a dataset is too restrictive and misses the point, especially in contexts where different operators are combined to define a new one. The only thing that really matters for defining an operator is the shape of its inputs and outputs, which may not be directly linked to the dataset itself.
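To sketch what that decoupling could look like (names like `OperatorShapes` and `layer_shapes` are illustrative assumptions here, not continuiti's actual classes): each operator carries its own input/output shape spec, and only the outermost spec is derived from the dataset.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OperatorShapes:
    """Hypothetical per-operator shape spec, decoupled from the dataset."""
    x_dim: int  # dim(X): input coordinates
    u_dim: int  # dim(U): input function values
    y_dim: int  # dim(Y): evaluation coordinates
    v_dim: int  # dim(V): output function values

def layer_shapes(ds: OperatorShapes, width: int, depth: int):
    """Derive per-layer shapes from the dataset-level spec: only the first
    layer consumes dim(U) and only the last layer produces dim(V) at dim(Y);
    inner layers map the hidden width onto itself at the input coordinates."""
    first = OperatorShapes(ds.x_dim, ds.u_dim, ds.x_dim, width)
    inner = OperatorShapes(ds.x_dim, width, ds.x_dim, width)
    last = OperatorShapes(ds.x_dim, width, ds.y_dim, ds.v_dim)
    return [first] + [inner] * (depth - 2) + [last]

# Dataset-level shapes from the example above: dim(U)=1, dim(V)=2.
shapes = layer_shapes(OperatorShapes(1, 1, 1, 2), width=8, depth=4)
print([(s.u_dim, s.v_dim) for s in shapes])  # [(1, 8), (8, 8), (8, 8), (8, 2)]
```

Each operator would then be constructed from its own entry of this list instead of from dataset.shapes directly.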