Training dimension error for supervised SIMCSE

ko120 commented 10 months ago

I think something is wrong with padding during training. I keep getting this error. Could you help me with this issue?

    outputs = model(*inputs)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 172, in forward
    return self.gather(outputs, self.output_device)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 184, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 86, in gather
    res = gather_map(outputs)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 77, in gather_map
    return type(out)((k, gather_map([d[k] for d in outputs]))
  File "<string>", line 7, in __init__
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/transformers/file_utils.py", line 1383, in __post_init__
    for element in iterator:
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 77, in <genexpr>
    return type(out)((k, gather_map([d[k] for d in outputs]))
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 71, in gather_map
    return Gather.apply(target_device, dim, outputs)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/_functions.py", line 75, in forward
    return comm.gather(inputs, ctx.dim, ctx.target_device)
  File "/home/ko120/anaconda3/envs/nlp_project/lib/python3.8/site-packages/torch/nn/parallel/comm.py", line 235, in gather
    return torch._C._gather(tensors, dim, destination)
RuntimeError: Input tensor at index 1 has invalid shape [72, 144], but expected [72, 146]

gaotianyu1350 commented 9 months ago

Hi,

Sorry for the late reply. Are you training on a single GPU or on multiple GPUs? Note that the two setups use different run scripts, and using the wrong one may cause this error.
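One reading that is consistent with the shapes in the traceback, offered as a sketch and not as a confirmed diagnosis: the frames go through `torch/nn/parallel/data_parallel.py`, which splits each batch across GPUs and then gathers the per-replica outputs along dimension 0. If the logits are an in-batch similarity matrix, their width depends on how many sentences landed on each replica, so an uneven split produces outputs that cannot be gathered. The pure-Python snippet below only does the shape arithmetic. The helper names, the total batch size of 145, the two replicas, and the assumption that each sentence contributes one positive column and one hard-negative column are all hypothetical, chosen to reproduce the numbers in the error.

```python
def split_batch(total, num_replicas):
    """Per-replica batch sizes when a batch is split along dim 0 into
    equal chunks with a smaller last chunk (as torch.chunk does)."""
    chunk = -(-total // num_replicas)  # ceiling division
    sizes = []
    remaining = total
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= sizes[-1]
    return sizes


def logits_shape(per_replica_batch, with_hard_negative=True):
    """Assumed shape of in-batch contrastive logits on one replica:
    one row per sentence, one column per positive (and per hard
    negative) located on the same replica."""
    columns = per_replica_batch * (2 if with_hard_negative else 1)
    return (per_replica_batch, columns)


def can_gather_dim0(shapes):
    """Gathering along dim 0 needs every other dimension to match."""
    return len({shape[1:] for shape in shapes}) == 1


# Hypothetical case: 145 sentences split over 2 replicas.
uneven = [logits_shape(n) for n in split_batch(145, 2)]
print(uneven, can_gather_dim0(uneven))

# An even split gives matching widths.
even = [logits_shape(n) for n in split_batch(144, 2)]
print(even, can_gather_dim0(even))
```

Under these assumptions the second replica produces 144 columns while the first produces 146, which is the mismatch the `RuntimeError` reports. If this is what is happening, it would come from the multi-GPU gather and not from padding.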

princeton-nlp / SimCSE

Training dimension error for supervised SIMCSE #251