invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces).

zeyayin commented 5 years ago

Hello, I ran into an issue when trying to train a new model on Windows with PyTorch 0.4.1.

I ran the command `!python scripts/train.py --noise_dim=0`.

I removed all the cuda() calls in order to run on the CPU, but I get this error:

Traceback (most recent call last):
  File "scripts/train.py", line 580, in <module>
    main(args)
  File "scripts/train.py", line 245, in main
    optimizer_d)
  File "scripts/train.py", line 371, in discriminator_step
    generator_out = generator(obs_traj, obs_traj_rel, seq_start_end)
  File "C:\Users\zeya\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\zeya\sgan-master\scripts\sgan\models.py", line 508, in forward
    final_encoder_h = self.encoder(obs_traj_rel)
  File "C:\Users\zeya\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\zeya\sgan-master\scripts\sgan\models.py", line 63, in forward
    obs_traj_embedding = self.spatial_embedding(obs_traj.view(-1, 2))
RuntimeError: invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at c:\new-builder_3\win-wheel\pytorch\aten\src\th\generic/THTensor.cpp:237

angeliand commented 5 years ago

Hi! Could you provide the script you have run, so that the issue can be recreated?

zeyayin commented 5 years ago

> Hi! Could you provide the script you have run, so that the issue can be recreated?

Hi, I have edited my issue, but I am not quite sure what you mean by providing the script. Is the issue now in the form you expected?

angeliand commented 5 years ago

This was exactly what I wanted, thank you! This is weird though, as it runs just fine for me. I would try different things if I were you:

* you could try editing the code in sgan\models.py line 63 to
  `obs_traj_embedding = self.spatial_embedding(obs_traj.contiguous().view(-1, 2))`
  or
* you could install the exact requirements, as mismatched package versions can cause problems too.

zeyayin commented 5 years ago

After I changed models.py line 63 and ran the command `!python scripts/train.py --noise_dim=`, it just keeps running, or may even be stuck, without giving any output. I will try it on a desktop PC later.

zeyayin commented 5 years ago

> This was exactly what I wanted, thank you! This is weird though, as it runs just fine for me. I would try different things if I were you:
>
> * you could try editing the code in [sgan\models.py line 63](https://github.com/agrimgupta92/sgan/blob/master/sgan/models.py#L63) to
>   `obs_traj_embedding = self.spatial_embedding(obs_traj.contiguous().view(-1, 2))`
>   or
> * you could install the exact [requirements](https://github.com/agrimgupta92/sgan/blob/master/requirements.txt), as this can cause problems too.

Hello, after I added contiguous() and ran it on the desktop, it started asking me to update the CUDA driver version, even though I have deleted all the cuda() calls. Is there any way I can run on the CPU instead of the GPU?

xieshuaix commented 5 years ago

The issue arises because seq_collate(data) in the data loader permutes the mini-batch without calling contiguous() afterwards. tensor.permute returns a non-contiguous tensor, and calling view on it raises this error. Training and testing on GPU are not subject to this issue because tensor.cuda() automatically makes the tensor contiguous. Using reshape() instead of view() may fix this problem as well. This can be seen in the debugger output below:

obs_traj.view(-1, 2).shape = {RuntimeError}invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at /pytorch/aten/src/TH/generic/THTensor.cpp:213
obs_traj.numpy().strides = {tuple} <class 'tuple'>: (4, 64, 32)
obs_traj.contiguous().view(-1, 2).shape = {Size} torch.Size([1304, 2])
obs_traj.contiguous().numpy().strides = {tuple} <class 'tuple'>: (1304, 8, 4)
obs_traj.cuda().cpu().numpy().shape = {tuple} <class 'tuple'>: (8, 163, 2)
obs_traj.cuda().cpu().numpy().strides = {tuple} <class 'tuple'>: (1304, 8, 4)
obs_traj.reshape(-1, 2).numpy().shape = {tuple} <class 'tuple'>: (1304, 2)
obs_traj.reshape(-1, 2).numpy().strides = {tuple} <class 'tuple'>: (8, 4)
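
The stride values in this output can be reproduced by hand, which may make the failure easier to see. The following is a minimal sketch in plain Python, not code from sgan. It assumes 4-byte float32 elements and assumes the batch was laid out as (163, 2, 8) before being permuted with the order (2, 0, 1); both are inferred from the strides shown above rather than stated in the thread. The helper names `contiguous_strides` and `permute` are made up for illustration.

```python
def contiguous_strides(shape, itemsize):
    """Byte strides of a row-major (C-contiguous) array with this shape."""
    strides = []
    step = itemsize
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def permute(values, order):
    """Reorder a tuple the way a permute call reorders dimensions."""
    return tuple(values[i] for i in order)


ITEMSIZE = 4              # assumption: float32
base_shape = (163, 2, 8)  # assumption: layout before the permute
order = (2, 0, 1)         # assumption: permutation applied to the batch

# A permute only reorders shape and strides; the data stays where it is.
shape = permute(base_shape, order)
strides = permute(contiguous_strides(base_shape, ITEMSIZE), order)

# The tensor is contiguous only if its strides match the row-major strides
# for its current shape (a simplified check that ignores size-1 dimensions).
expected = contiguous_strides(shape, ITEMSIZE)
is_contiguous = strides == expected

print(shape, strides)           # shape and strides after the permute
print(expected, is_contiguous)  # strides a contiguous copy would have
```

Under these assumptions the permuted strides come out as (4, 64, 32) and the contiguous ones as (1304, 8, 4), matching the two `strides` lines above. Because the permuted strides differ from the row-major ones, the first two dimensions cannot be merged into one without copying, which is what view(-1, 2) would need.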

davidglavas commented 5 years ago

In models.py:

* change line ~67 to: `obs_traj_embedding = self.spatial_embedding(obs_traj.contiguous().view(-1, 2))`
* change line ~221 to: `curr_hidden = h_states.contiguous().view(-1, self.h_dim)[start:end]`

This resolved the issue for me.
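
The pattern behind these changes can be reproduced outside the repository. The sketch below is an illustration only, assuming a reasonably recent PyTorch install running on CPU; the tensor shape and the name `batch_first` are hypothetical and not taken from sgan. It shows view() failing on a permuted tensor and both contiguous().view() and reshape() succeeding.

```python
import torch

# Hypothetical stand-in for a collated batch, laid out as (batch, coords, seq_len).
batch_first = torch.arange(163 * 2 * 8, dtype=torch.float32).reshape(163, 2, 8)

# permute() returns a view with reordered strides; no data is copied.
obs_traj = batch_first.permute(2, 0, 1)  # shape (8, 163, 2)
print("contiguous after permute:", obs_traj.is_contiguous())

# view() needs a layout it can reinterpret without copying, so it fails here.
try:
    obs_traj.view(-1, 2)
    view_failed = False
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

# Option 1: copy into row-major layout first, then view.
fixed_with_contiguous = obs_traj.contiguous().view(-1, 2)

# Option 2: reshape() returns a view when possible and copies otherwise.
fixed_with_reshape = obs_traj.reshape(-1, 2)

print(fixed_with_contiguous.shape)
print("same result:", torch.equal(fixed_with_contiguous, fixed_with_reshape))
```

Both options copy the data in this case, so the choice is mostly about readability: contiguous().view() makes the copy explicit, while reshape() hides whether one happened.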

agrimgupta92 / sgan

invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). #22