zhanhl316 opened 3 years ago
I think this is a problem caused by a newer version of HuggingFace's transformers (especially the tokenizers). Could you try downgrading to transformers == 3.3? That is the version I released the code with and tested the scripts on.
@airsplay Great! I have resolved this problem by changing HuggingFace's transformers to version 3.3.0. I suggest revising the requirements file, where the transformers version is still 2.7.0 (transformers 2.7.0 --> 3.3.0). Best,
Thanks. I will change it.
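For anyone following along, the downgrade discussed above can be done with pip (the version numbers are the ones from this thread; this is environment setup, not code from the repo):

```shell
# Replace the installed transformers with the version the repo was tested on
pip uninstall -y transformers
pip install transformers==3.3.0

# Confirm the installed version
python -c "import transformers; print(transformers.__version__)"
```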
```
Training of Epoch 0: GPU 0 will process 591616 data in 2311 iterations.
  0%|          | 0/2311 [00:31<?, ?it/s]
Traceback (most recent call last):
  File "xmatching/main.py", line 313, in <module>
    main()
  File "xmatching/main.py", line 43, in main
    mp.spawn(train, nprocs=args.gpus, args=(args,))
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/home/zhanhaolan/codes/vokenization/xmatching/main.py", line 233, in train
    for i, (uid, lang_input, visn_input) in enumerate(tqdm.tqdm(train_loader, disable=(gpu!=0))):
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/tqdm/std.py", line 1167, in __iter__
    for obj in iterable:
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [14] at entry 0 and [12] at entry 1
```
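For context on the error above: `default_collate` ends in `torch.stack` over the per-sample token tensors, which fails when sentences in one batch have different token counts (14 vs 12 here) — typically because the tokenizer stopped padding to a fixed length. The usual remedy is to pad (or truncate) every sequence to a common length before stacking, e.g. in a custom `collate_fn`. A minimal pure-Python sketch of the padding idea (`pad_collate` and the pad value are hypothetical, not from this repo):

```python
def pad_collate(batch, pad_value=0):
    """Pad variable-length token-id lists to the longest in the batch,
    so they can later be stacked into one rectangular tensor."""
    max_len = max(len(seq) for seq in batch)
    return [seq + [pad_value] * (max_len - len(seq)) for seq in batch]

# Lengths 14 and 12, mirroring the sizes reported in the traceback
batch = [list(range(14)), list(range(12))]
padded = pad_collate(batch)
assert all(len(seq) == 14 for seq in padded)
```

In real code the same effect comes from the tokenizer's fixed-length padding (as in transformers 3.3.0) or from a `collate_fn` passed to `DataLoader`.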
Hi, do you have any idea what causes this issue?