I've run into a new problem: when I run training, it fails with a model input dimension error. I haven't seen anyone else report this error, and I can't tell which step is the problem. Any help would be much appreciated!
```
Read data: 0.0002810955047607422
/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/rnn.py:582: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters(). (Triggered internally at /pytorch/aten/src/ATen/native/cudnn/RNN.cpp:775.)
  self.dropout, self.training, self.bidirectional, self.batch_first)
Save ckpt on exception ...
model saved to ./log/model.pth
Save ckpt done.
Traceback (most recent call last):
  File "/data/data-2T/AI/ImageCaptioning/train.py", line 185, in train
    model_out = dp_lw_model(fc_feats, att_feats, labels, masks, att_masks, data['gts'], torch.arange(0, len(data['gts'])), sc_flag, struc_flag, drop_worst_flag)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/data-2T/AI/ImageCaptioning/captioning/modules/loss_wrapper.py", line 47, in forward
    loss = self.crit(self.model(fc_feats, att_feats, labels[..., :-1], att_masks), labels[..., 1:], masks[..., 1:], reduction=reduction)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/data-2T/AI/ImageCaptioning/captioning/models/CaptionModel.py", line 33, in forward
    return getattr(self, '_'+mode)(*args, **kwargs)
  File "/data/data-2T/AI/ImageCaptioning/captioning/models/ShowTellModel.py", line 81, in _forward
    output, state = self.core(xt.unsqueeze(0), state)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 579, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 530, in check_forward_args
    self.check_input(input, batch_sizes)
  File "/home/mm/anaconda3/envs/subgc/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 176, in check_input
    expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 4
```