I am trying to train the model with TSN features, but it only uses 2 GB of GPU memory, so I tried training with batch_size = 8. That produced errors like:
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [41,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [41,0,0], thread: [1,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [41,0,0], thread: [2,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
0%| | 0/2502 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 317, in <module>
train(opt)
File "train.py", line 181, in train
output, loss = model(dt, criterion, opt.transformer_input_type)
File "/home/anaconda3/envs/PDVC-main/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/media/axxddzh/dat/axxddzh/PDVC-main/models/pdvc.py", line 166, in forward
disable_iterative_refine)
File "/media/axxddzh/dat/axxddzh/PDVC-main/models/pdvc.py", line 299, in parallel_prediction_matched
others, self.opt.caption_decoder_type, indices)
File "/media/axxddzh/dat/axxddzh/PDVC-main/models/pdvc.py", line 387, in caption_prediction
cap_prob = cap_head(hs[:, feat_bigids], reference[:, feat_bigids], others, seq)
RuntimeError: CUDA error: device-side assert triggered
I ran into the same problem whenever batch_size is not 1.
Hi, in this code, the standard captioning module (PDVC) doesn't support batch size > 1, but PDVC_light does. I tried to train PDVC_light with a larger batch size but got a slight performance drop.
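For anyone who wants to confirm which index trips the assertion, here is a minimal sketch of the same bounds condition the CUDA kernel checks (`index >= -sizes[i] && index < sizes[i]`), written in plain Python so it can run on the CPU before the indexing call. The helper name `find_out_of_bounds` and the sample values are hypothetical and not part of the PDVC code. Running with the environment variable `CUDA_LAUNCH_BLOCKING=1` can also help, since it makes the traceback point at the kernel that actually failed.

```python
def find_out_of_bounds(indices, size):
    """Return (position, value) pairs that violate -size <= index < size.

    This mirrors the condition in the device-side assertion; an empty
    list means every index is valid for a dimension of length `size`.
    """
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not (-size <= idx < size)
    ]


# Hypothetical example: a dimension with 10 entries and one index past the end.
bad = find_out_of_bounds([0, 3, 9, 12, -1], size=10)
print(bad)  # [(3, 12)]
```

In the traceback above, the failing expression is `hs[:, feat_bigids]`, so a check along the lines of `find_out_of_bounds(feat_bigids.tolist(), hs.shape[1])` just before that line might show which entries are too large. This assumes `feat_bigids` is a 1-D integer tensor, which the thread does not confirm.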