Closed leijue222 closed 3 years ago
I also tried SIMS, but the result is the same: it failed. The problem may be in BERT, but I don't know exactly where it goes wrong.
100%|██████████████████████████████████████| 43/43 [00:00<00:00, 126.09it/s]
0%| | 0/43 [00:00<?, ?it/s]
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [66,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
0%| | 0/43 [00:00<?, ?it/s]
Traceback (most recent call last):
File "run.py", line 300, in <module>
worker()
File "run.py", line 254, in worker
run_normal(args)
File "run.py", line 174, in run_normal
test_results = run(args)
File "run.py", line 79, in run
atio.do_train(model, dataloader)
File "/media/yiwei/yiwei-01/project/Emotion/MMSA-master/trains/multiTask/SELF_MM.py", line 143, in do_train
outputs = model(text, (audio, audio_lengths), (vision, vision_lengths))
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/yiwei-01/project/Emotion/MMSA-master/models/AMIO.py", line 50, in forward
return self.Model(text_x, audio_x, video_x)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/yiwei-01/project/Emotion/MMSA-master/models/multiTask/SELF_MM.py", line 64, in forward
text = self.text_model(text)[:,0,:]
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/yiwei-01/project/Emotion/MMSA-master/models/subNets/BertTextEncoder.py", line 59, in forward
token_type_ids=segment_ids)[0] # Models outputs are now tuples
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/transformers/modeling_bert.py", line 734, in forward
encoder_attention_mask=encoder_extended_attention_mask,
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/transformers/modeling_bert.py", line 407, in forward
hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/transformers/modeling_bert.py", line 368, in forward
self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/transformers/modeling_bert.py", line 314, in forward
hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/transformers/modeling_bert.py", line 216, in forward
mixed_query_layer = self.query(hidden_states)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 87, in forward
return F.linear(input, self.weight, self.bias)
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/functional.py", line 1371, in linear
output = input.matmul(weight.t())
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCGeneral.cpp:216
If I try to use the CPU instead, the error occurs here: https://github.com/thuiar/MMSA/blob/b2e70bbd198ba8e8dc041f5e059c3baa2027b34a/models/multiTask/SELF_MM.py#L64
with the error:
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/functional.py", line 1467, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range: Tried to access index 1355 out of table with 1 rows. at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:237
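Both errors likely point the same way: some input token id is larger than the embedding table that was actually loaded (on CPU the out-of-range lookup fails with a clear message; on GPU it trips the `srcIndex < srcSelectDimSize` assertion, which can then surface as the later cublas error). A minimal pre-flight check could look like the sketch below; `check_token_ids` is a hypothetical helper name, and 30522 is the vocab size of `bert-base-uncased` (`bert-base-chinese` uses 21128):

```python
def check_token_ids(input_ids, vocab_size=30522):
    """Return every token id that falls outside [0, vocab_size).

    input_ids: nested list of ints, shape [batch, seq_len].
    An empty result means the ids are safe to feed into the
    embedding layer; any returned ids would crash the lookup.
    """
    return [i for row in input_ids for i in row if i < 0 or i >= vocab_size]

ids = [[101, 1355, 102]]                       # [CLS] ... [SEP]-style row
print(check_token_ids(ids))                    # [] -- fine for a full vocab
print(check_token_ids(ids, vocab_size=1))      # all three ids are out of range,
                                               # mirroring "index 1355 out of
                                               # table with 1 rows"
```

Running this on the real `input_ids` batch just before the model call would show whether the data, rather than BERT itself, contains the bad indices.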
I succeeded with your previous version. If I apply the old version that ran successfully to the current one, it still fails; conversely, applying the new version of MISA to the old one also fails. The error message is the same:
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCGeneral.cpp:216
The error is located in BertTextEncoder.py: https://github.com/thuiar/MMSA/blob/b2e70bbd198ba8e8dc041f5e059c3baa2027b34a/models/subNets/BertTextEncoder.py#L47-L65
text:torch.Size([64, 39, 768]) | input_ids:torch.Size([64, 768]) | input_mask:torch.Size([64, 768]) | segment_ids: torch.Size([64, 768])
The error is at line 62, and I still don't know how to solve this problem. @iyuge2 @Columbine21
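One thing worth noticing in those shapes: `input_ids` is `[64, 768]`, i.e. its sequence dimension equals BERT's hidden size (768) instead of a token count of at most 512. That is consistent with 768-d feature vectors being sliced where integer token ids were expected. A rough sanity-check heuristic (the helper name and thresholds are my assumptions, not part of MMSA):

```python
def looks_like_token_ids(shape, is_integer, max_seq_len=512, hidden_size=768):
    """Heuristic: BERT input_ids should be an integer tensor of shape
    [batch, seq_len] with seq_len <= 512. A trailing dimension equal to
    the hidden size (768) suggests hidden-state features were passed
    where token ids belong."""
    batch_seq = len(shape) == 2 and shape[1] <= max_seq_len
    return is_integer and batch_seq and shape[1] != hidden_size

print(looks_like_token_ids((64, 39), True))    # plausible token ids
print(looks_like_token_ids((64, 768), False))  # float [batch, 768]: features
```

If the `[64, 768]` tensor fails such a check, the bug would be in how the text tensor is unpacked before `BertTextEncoder`, not inside BERT.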
This issue has accumulated too much redundant information, so I have put the simplified content, after debugging, in issue #23.
Hi @iyuge2, thank you for your contribution to this project. I downloaded the data from the address you provided and ran it; it did achieve the same result as result-stat.
Then I downloaded the SIMS | MOSI | MOSEI raw data and used DataPre.py to generate features.pkl. But the processed data cannot be used with run.py.
Let's take MOSI as an example (I downloaded the raw data and used DataPre.py to process it). You provided aligned_50.pkl (367.3MB) and unaligned_50.pkl (554.2MB) as the *.pkl for the MOSI dataset. I got a features.pkl of 2.8G, which is really bigger than your *.pkl.
Then I used features.pkl to run, but it failed... :sob:
Failure situation: an error of 'list' object has no attribute 'astype'. But it didn't work; we then get a new error of:
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCGeneral.cpp:216
There must be a problem with the data DataPre.py generates, and I really don't know how to solve it. In DataPre.py I only modified the paths so the data could be found for processing; I have not changed anything else. @iyuge2 Please help me...
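The `'list' object has no attribute 'astype'` error usually means a plain Python list came out of the pickle where a NumPy array was expected. A small defensive conversion could bridge that; `to_float_array` is a hypothetical helper, and the list-in-pickle cause is an assumption about what DataPre.py produced here:

```python
import numpy as np

def to_float_array(feature):
    """Coerce a feature loaded from a pickle to a float32 ndarray.

    If the pickle stored a plain Python list, calling .astype on it
    raises AttributeError; np.asarray handles both lists and arrays
    and is a no-op copy-wise when the dtype already matches.
    """
    return np.asarray(feature, dtype=np.float32)

feat = [[0.1, 0.2], [0.3, 0.4]]   # plain list: has no .astype
arr = to_float_array(feat)
print(arr.dtype, arr.shape)       # float32 (2, 2)
```

Applying such a conversion when loading features.pkl would remove the astype crash, though the later cublas/index errors suggest the ids inside the pickle also need checking against the BERT vocabulary.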