Prediction and target not the same shape

sam-j-hall commented 7 months ago

Hi,

I am trying to set up an ML model with SchNetPack to predict X-ray spectra, and I have set up an initial basic model to get things working before improving it. However, I am having trouble with the prediction and target tensors not being the same shape. The shape of the prediction tensor seems to change, and I can't figure out what is going on.

The error message I get is: "Predictions and targets are expected to have the same shape, but got torch.Size([1, 200]) and torch.Size([200])".

What I don't understand is why the prediction shape is [1,200], because the data that I have loaded in the dataset has a shape [200] (as seen when I print the dataset keys under Spectrum).

_idx - torch.Size([1])
Spectrum - torch.Size([200])
_n_atoms - torch.Size([1])
_atomic_numbers - torch.Size([40])
_positions - torch.Size([40, 3])
_cell - torch.Size([1, 3, 3])
_pbc - torch.Size([3])

The problem occurs when calculating the validation error. This also confuses me, because during training I get a UserWarning about the same shape mismatch, yet the code does not stop there. I have attached my code so you can have a look and see whether I am doing something wrong that causes this, and what the fix would be.

Thanks

files.zip

stefaanhessmann commented 7 months ago

Hi @sam-j-hall, as the error message suggests, the shapes of the properties that your model predicts do not match the shapes of the properties in your database. Unsqueezing the property "Spectrum" in your database file from [200] to [1, 200] should solve the issue.
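A minimal sketch of the mismatch and the unsqueeze fix in plain PyTorch. The tensors are dummy data, and the use of `torch.nn.functional.mse_loss` as the training loss is an assumption, since the attached code is not shown here. Under that assumption it would also explain why training only warns: `mse_loss` broadcasts mismatched shapes and emits a UserWarning, while a stricter shape check in the validation metric raises an error.

```python
import warnings

import torch
import torch.nn.functional as F

# Dummy stand-ins (assumed shapes taken from the error message).
prediction = torch.zeros(1, 200)  # model output: (batch, n_points)
spectrum = torch.ones(200)        # stored target: (n_points,)

# With mismatched shapes, mse_loss broadcasts and only warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    loss_mismatched = F.mse_loss(prediction, spectrum)
for w in caught:
    print("warning:", w.message)

# Adding a leading dimension makes the target match the prediction.
target = spectrum.unsqueeze(0)    # shape (1, 200)
loss_matched = F.mse_loss(prediction, target)

print(prediction.shape, target.shape)
```

In practice the unsqueeze would be applied when the property is written to the database, so that every stored "Spectrum" entry already has shape [1, 200].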

Let me know if this helps to solve the issue.

Best, Stefaan

sam-j-hall commented 6 months ago

Hi @Stefaanhess,

Thank you for your help. I tried changing the dimension of the "Spectrum" property to [1,200], but then I run into the problem given below:

File ~/anaconda3/envs/pyg/lib/python3.9/site-packages/schnetpack/data/stats.py:73 in calculate_stats(dataloader, divide_by_atoms, atomref)
     [70] sample_m2 = torch.sum((sample_values - sample_mean[:, None]) ** 2, dim=1)
     [72] delta = sample_mean - mean
---> [73] mean += delta * batch_size / new_count
     [74] corr = batch_size * count / new_count
     [75] M2 += sample_m2 + delta**2 * corr

RuntimeError: output with shape [1] doesn't match the broadcast shape [1, 200]

stefaanhessmann commented 6 months ago

Hi @sam-j-hall,

It seems that removing the mean currently only works for properties of shape [1]. This is mainly because removing the mean is usually only used for scalar properties.

I suggest you just try to set remove_mean=False during training.
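The RuntimeError can be reproduced in plain PyTorch. This is only a sketch: the shapes of `mean` and `delta` are assumed from the error message, not taken from the SchNetPack source. An in-place update cannot enlarge its output tensor, so an accumulator of shape [1] cannot receive a [1, 200] update.

```python
import torch

mean = torch.zeros(1)        # accumulator sized for a scalar property (assumed)
delta = torch.rand(1, 200)   # update for a 200-point spectrum (assumed)

error_message = ""
try:
    # In-place add: the result must fit into mean's existing shape [1].
    mean += delta
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# An accumulator with the same shape as the update works.
mean_vec = torch.zeros(1, 200)
mean_vec += delta
```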

sam-j-hall commented 6 months ago

Hi @Stefaanhess,

I have managed to get it working now, as I did not need to use RemoveOffsets. I only had it because I copied it over from the tutorials. Your explanation of remove_mean made me look into this and find what I was doing wrong.

Thank you for your help

atomistic-machine-learning / schnetpack

Prediction and target not the same shape #624