dotnet / TorchSharp

A .NET library that provides access to the library that powers PyTorch.

MultivariateNormal.log_prob() exception in TorchSharp but works in PyTorch #1334

Closed. Kermalis closed this issue 3 months ago.

Kermalis commented 3 months ago

Hello, I have ported over this code from PyTorch: https://github.com/nikhilbarhate99/PPO-PyTorch/blob/master/PPO.py

It's just a PPO implementation. However, this line is where I'm seeing a difference in behavior: https://github.com/nikhilbarhate99/PPO-PyTorch/blob/master/PPO.py#L97

I have tested this PyTorch code locally with numpy and it works flawlessly. However, the equivalent line in C# throws an exception: System.Runtime.InteropServices.ExternalException: 'shape '[1, 3, 1, 1, 3]' is invalid for input of size 3'

The C# equivalent looks like: (Ignore the lack of disposes for now)

public void Act(torch.Tensor inputs, out torch.Tensor actionD, out torch.Tensor actionLogProbD, out torch.Tensor stateValD)
{
    torch.Tensor actionMean = _actor.forward(inputs);
    torch.Tensor covMat = torch.diag(_actionVar).unsqueeze(0);
    var dist = new MultivariateNormal(actionMean, covariance_matrix: covMat);

    torch.Tensor action = dist.sample();
    torch.Tensor actionLogProb = dist.log_prob(action); // Exception occurs when trying to call this
    torch.Tensor stateVal = _critic.forward(inputs);

    actionD = action.detach();
    actionLogProbD = actionLogProb.detach();
    stateValD = stateVal.detach();
}

I have checked each tensor individually and verified that the sizes are identical in C# and python. Below I have printed the actionMean, _actionVar, covMat and action tensors:

Python:

tensor([ 0.2025, -0.0714,  0.1417])
tensor([0.3600, 0.3600, 0.3600])
tensor([[[0.3600, 0.0000, 0.0000],
         [0.0000, 0.3600, 0.0000],
         [0.0000, 0.0000, 0.3600]]])
tensor([[ 0.4416, -0.7422,  0.0073]])

C#:

[-0.019071, -0.026232, -0.074312]
[0.36, 0.36, 0.36]
[[[0.36, 0, 0]
  [0, 0.36, 0]
  [0, 0, 0.36]]]
[[-0.065244, 0.017988, -0.77489]]

Simply put, they are [3], [3], [1x3x3], and [1x3] respectively. I'm not sure why the error happens in TorchSharp and not PyTorch unless it's just a bug, and I don't understand the nonsensical dimensions the exception is reporting. I went over this all day and even talked with some LLMs, and they're convinced the code is ported properly (and at this point I am too, since I have verified the tensor sizes myself). Unfortunately I can only talk to LLMs, since I don't know anyone in this field and I have no formal education. Any help is appreciated; I'd like to get this up and running in C#.

NiklasGustafsson commented 3 months ago

Huh!

That's an exception coming from deep inside libtorch, and those are often "nonsensical," since they report an error in the context of the function that is called in the implementation, not at the highest-level API. I'll take a look at MultivariateNormal, but it would help if your code could be pared down to a small test with just the relevant tensors. Since this is a shape-related exception, what's inside the tensors probably doesn't matter, so random values are fine.

I'll try to do that if I don't find anything obvious, but if you have time to narrow it down and post the smaller code, that would speed things up.

NiklasGustafsson commented 3 months ago

In the meantime, you may be able to work around it by using another continuous distribution. It'll produce different results, no doubt, but I don't think it will invalidate the model completely.
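
For example (a rough sketch only, not something tested against this issue): the covariance in the PPO code above is diagonal, so independent per-dimension Normal distributions with scale = sqrt(variance) describe the same density, and the joint log-prob is just the sum of the per-dimension log-probs. This assumes TorchSharp's Normal distribution is constructed the same way the MultivariateNormal above is.

using TorchSharp;
using TorchSharp.Modules;

// Diagonal-covariance workaround sketch; values taken from the tensors printed above.
torch.Tensor actionMean = torch.tensor(new double[] { 0.2025, -0.0714, 0.1417 });
torch.Tensor actionVar = torch.tensor(new double[] { 0.36, 0.36, 0.36 });

// Independent Normals with scale = sqrt(variance) match a MultivariateNormal
// whose covariance is diag(actionVar).
var dist = new Normal(actionMean, actionVar.sqrt());

torch.Tensor action = dist.sample();                       // shape [3]
torch.Tensor actionLogProb = dist.log_prob(action).sum();  // joint log-prob for a single, un-batched action
// For batched actions, sum over the last dimension instead of over everything.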

yueyinqiu commented 3 months ago

I've written one, but I get a different exception (System.Runtime.InteropServices.ExternalException: 'shape '[3, 3]' is invalid for input of size 3'):

using TorchSharp.Modules;
using TorchSharp;

var actionMean = torch.tensor((double[])[0.2025, -0.0714, 0.1417]);
var covMat = torch.tensor(new double[,]
{
    { 0.36, 0, 0 },
    { 0, 0.36, 0 },
    { 0, 0, 0.36 },
});
var dist = new MultivariateNormal(actionMean, covariance_matrix: covMat);
torch.Tensor action = dist.sample();
torch.Tensor actionLogProb = dist.log_prob(action);

I'll also try this in PyTorch later.


Yes, it works in PyTorch:

import torch

actionMean = torch.tensor([-0.019071, -0.026232, -0.074312])
covMat = torch.tensor([[0.36, 0, 0], [0, 0.36, 0], [0, 0, 0.36]])
dist = torch.distributions.MultivariateNormal(actionMean, covariance_matrix=covMat)
action = dist.sample()
actionLogProb = dist.log_prob(action)

print(actionLogProb)  # tensor(-1.5809)

NiklasGustafsson commented 3 months ago

@yueyinqiu, your covMat is missing a 1-sized dimension, I believe.
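
That is, to match the [1, 3, 3] covMat that the original Act method builds with torch.diag(_actionVar).unsqueeze(0), the repro would need something like this (illustrative only):

var covMatBatched = covMat.unsqueeze(0);  // shape [1, 3, 3] instead of [3, 3]
var dist = new MultivariateNormal(actionMean, covariance_matrix: covMatBatched);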

yueyinqiu commented 3 months ago

It's line 221:

[screenshot of the TorchSharp code at line 221]

(In PyTorch it's: [screenshot of the corresponding PyTorch code])

(TakeAllBut works like bx.shape[:-outer_batch_dims], so there is "an extra -" in TorchSharp compared to PyTorch's bx.shape[:outer_batch_dims].)

yueyinqiu commented 3 months ago

@yueyinqiu, your covMat is missing a 1-sized dimension, I believe.

Ah... yes, and it could also cause the problem.

NiklasGustafsson commented 3 months ago

That caused the different exception, but there's still a bug somewhere in BatchMahalanobis.

If anyone would like to help me stare at the code for a while, you are more than welcome to! :-)

yueyinqiu commented 3 months ago

@NiklasGustafsson

I have found the problem, but I don't have much time to fix and test it right now... https://github.com/dotnet/TorchSharp/issues/1334#issuecomment-2181206978

NiklasGustafsson commented 3 months ago

Line #221:

var bx_new_shape = TakeAllBut(bx.shape, outer_batch_dims).ToList();

should be:

var bx_new_shape = bx.shape.Take(outer_batch_dims).ToList();
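
To illustrate the difference with made-up numbers (a toy example, not the actual shapes from the repro): Take keeps the first outer_batch_dims entries of the shape, while TakeAllBut drops the last outer_batch_dims entries, matching the Python shape[:-outer_batch_dims] behavior noted above.

using System;
using System.Linq;

long[] bxShape = { 2, 5, 3 };   // hypothetical bx.shape
int outer_batch_dims = 2;

// Buggy: drop the last outer_batch_dims entries, like Python's bx.shape[:-outer_batch_dims]
var takeAllBut = bxShape.Take(bxShape.Length - outer_batch_dims).ToArray();
Console.WriteLine(string.Join(", ", takeAllBut));   // 2

// Fixed: keep the first outer_batch_dims entries, like Python's bx.shape[:outer_batch_dims]
var take = bxShape.Take(outer_batch_dims).ToArray();
Console.WriteLine(string.Join(", ", take));         // 2, 5
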
NiklasGustafsson commented 3 months ago

It's probably worth making a new release for this.

NiklasGustafsson commented 3 months ago

@Kermalis, I don't know if you have the time and energy to copy the MultivariateNormal distribution to your own project (place it in a different namespace) and apply the fix there to verify it in the context of PPO. If not, I'll make a release based on the fix working with the small repro @yueyinqiu posted.

Kermalis commented 3 months ago

Hello, I'm really grateful you guys took the time to look into this and got to the bottom of it so quickly :)

I'm fine with either solution honestly, but a new NuGet release would of course be better. It looks like it might be worth it, since I think there are quite a few pending changes on GitHub compared to NuGet.

NiklasGustafsson commented 3 months ago

Yes, a new release will happen. I would prefer to figure out whether this fix is enough, or whether there are further problems downstream from it. Since you have the ported PPO code, it would be valuable if you could test it in that context. If you don't have the time, I understand and will release without that deeper validation.

Kermalis commented 3 months ago

I should be able to try it out soon; I'll let you know how it goes.

Kermalis commented 3 months ago

I just verified that the fix works on my end, and I'm now able to train :) Thanks a bunch. I'll be looking forward to the next NuGet release.

NiklasGustafsson commented 3 months ago

Okay. I'll merge and then do a release this afternoon.