lu-group / deeponet-fno

DeepONet & FNO (with practical extensions)

Unable to run Burgers equation code with PyTorch #4

Open ShikharShivraj opened 1 year ago

ShikharShivraj commented 1 year ago

Hello Prof. @lululxvi

Thank you for putting together DeepXDE. I was trying to run the Burgers code with the PyTorch backend but got a variety of errors. The first one was in the following lines:

def periodic(x):
    x *= 2 * np.pi
    return tf.concat(
        [tf.math.cos(x), tf.math.sin(x), tf.math.cos(2 * x), tf.math.sin(2 * x)], 1
    )

tf was not being imported properly, so I used torch.cat instead. I then got a "leaf Variable was used in an in-place operation" error for the line x *= 2 * np.pi, so I fixed that as well.
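For readers hitting the same two errors, here is a minimal sketch of how the function above might look with PyTorch. It assumes x is a tensor of shape (N, 1); the name periodic_torch is hypothetical and does not come from the repository. The out-of-place multiplication avoids modifying a leaf tensor that requires grad.

```python
import math

import torch


def periodic_torch(x):
    # Out-of-place scaling: `x *= ...` would modify a leaf tensor that
    # requires grad in place, which PyTorch rejects.
    x = x * (2 * math.pi)
    # Concatenate the four periodic features along the feature axis.
    return torch.cat(
        [torch.cos(x), torch.sin(x), torch.cos(2 * x), torch.sin(2 * x)], dim=1
    )


# Stand-in for the network input (assumed shape: 128 points, 1 coordinate).
x = torch.rand(128, 1, requires_grad=True)
features = periodic_torch(x)
print(features.shape)
```

Note that the transform turns one input coordinate into four features, which matters for the size of the first layer of the network that receives them.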

Then I got a datatype mismatch error while training the network, so I changed every float32 in the code to float64 and also set the default float to float64:

dde.config.set_default_float("float64")

I then got:

mat1 and mat2 shapes cannot be multiplied (128x4 and 1x128)

So a transpose is likely missing in some layer, but I cannot figure out where exactly. Could you please help me with this so I can run the code with the PyTorch backend?

Additionally, here is my version info:

(screenshot of version info, not reproduced here)

lululxvi commented 1 year ago

https://github.com/lu-group/deeponet-fno/blob/2bd6b79af99b2178e9044a4aaf30b376ac76c2bd/src/burgers/deeponet.py#L49

You need to modify [1, 128, 128, 128] to [4, 128, 128, 128] for torch.
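The likely reason, judging from the reported shapes (128x4 and 1x128), is that the periodic transform expands the single input coordinate into four features, so with PyTorch the first linear layer has to be declared with 4 inputs. A small sketch with plain torch.nn.Linear layers (an illustration only, not the DeepXDE network itself):

```python
import torch

# Stand-in for the output of the periodic transform: 4 features per point.
features = torch.rand(128, 4)

layer_wrong = torch.nn.Linear(1, 128)  # sized for the raw 1-D input
layer_right = torch.nn.Linear(4, 128)  # sized for the 4 transformed features

try:
    layer_wrong(features)
    failed = False
except RuntimeError as err:
    # Shape mismatch between the 4-feature input and the 1-input layer.
    failed = True
    print(err)

out = layer_right(features)
print(out.shape)
```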

ShikharShivraj commented 1 year ago

Thanks for your response Prof.!

I did as instructed and got a different error:

Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

I tried changing the code in the source file _tensor.py but got the same error. I also tried running the code inside a

with torch.no_grad() block and got another error: element 0 of tensors does not require grad and does not have a grad_fn

lululxvi commented 1 year ago

Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

You are mixing up NumPy arrays and PyTorch tensors somewhere.
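As a generic illustration of both error messages quoted in this thread (plain PyTorch, not the code from the repository): a tensor that is part of the autograd graph has to be detached before it is converted to NumPy, and wrapping the computation in torch.no_grad() is not a fix, because the result then has no graph to backpropagate through.

```python
import torch

w = torch.ones(3, requires_grad=True)
y = w * 2.0  # y is tracked by autograd

try:
    y.numpy()  # direct conversion of a tracked tensor is rejected
    raised = False
except RuntimeError:
    raised = True

# Detach from the graph (and move to CPU) before converting to NumPy.
y_np = y.detach().cpu().numpy()

# Under no_grad the result is not tracked, so calling backward() on a loss
# built from it would fail with "does not require grad".
with torch.no_grad():
    z = w * 2.0

print(raised, y_np, z.requires_grad)
```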

rmojgani commented 10 months ago

@ShikharShivraj have you figured it out? I think the deepxde.backend import is pulling in the wrong functions, e.g.,

from deepxde.backend import tf

If torch is used as the backend, tf.math.sin cannot be called, so I have changed those calls to their torch equivalents (e.g. torch.cos). Still, after applying https://github.com/lu-group/deeponet-fno/issues/4#issuecomment-1436034650, I get

Traceback (most recent call last):
  File "/home/exouser/deeponet-fno/src/burgers/deeponet.py", line 75, in <module>
    main()
  File "/home/exouser/deeponet-fno/src/burgers/deeponet.py", line 71, in main
    train(model, lr, epochs)
  File "/home/exouser/deeponet-fno/src/burgers/deeponet.py", line 44, in train
    losshistory, train_state = model.train(epochs=epochs, batch_size=None)
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/utils/internal.py", line 22, in wrapper
    result = f(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/model.py", line 631, in train
    self._test()
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/model.py", line 820, in _test
    ) = self._outputs_losses(
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/model.py", line 541, in _outputs_losses
    outs = outputs_losses(inputs, targets, auxiliary_vars)
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/model.py", line 316, in outputs_losses_train
    return outputs_losses(
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/model.py", line 300, in outputs_losses
    outputs_ = self.net(inputs)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/nn/pytorch/deeponet.py", line 115, in forward
    x_func = self.branch(x_func)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/deepxde/nn/pytorch/fnn.py", line 43, in forward
    else self.activation(linear(x))
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/exouser/.local/lib/python3.9/site-packages/torch/utils/_device.py", line 77, in __torch_function__
    return func(*args, **kwargs)
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Double

lululxvi commented 9 months ago

@rmojgani Make sure all your data is float32.
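A minimal sketch of that advice, assuming the data starts out as NumPy arrays (which default to float64) and the network parameters use PyTorch's default float32. The array name x_train is a placeholder, not a variable from the repository.

```python
import numpy as np
import torch

x_train = np.random.rand(16, 4)   # NumPy creates float64 by default
layer = torch.nn.Linear(4, 8)     # parameters are float32 by default

try:
    layer(torch.from_numpy(x_train))  # Double input, Float weights
    mismatch = False
except RuntimeError:
    mismatch = True

# Cast the data once, up front, so every tensor matches the parameters.
x32 = x_train.astype(np.float32)
out = layer(torch.from_numpy(x32))
print(mismatch, out.dtype)
```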