Previously, `pad_embeds` was incorrectly constructed by repeating the `pad_embed` tensor along the wrong dimension, leading to a size mismatch when attempting to concatenate it with `inputs['inputs_embeds']`.
The error message is as follows:

```
Process Process-1:
Traceback (most recent call last):
  File "/ML-A100/team/mm/shuyu/anaconda3/envs/intern_clean/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/ML-A100/team/mm/shuyu/anaconda3/envs/intern_clean/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/ML-A100/team/mm/shuyu/workspace/projects/InternLM-XComposer/cap_train.py", line 159, in inferCaptionsAndSave
    inputs = torch.cat([pad_embeds, inputs['inputs_embeds']], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 57 but got size 1 for tensor number 1 in the list.
FINISHED!
```
Modification: specify dimension 1 (the sequence dimension) when repeating `pad_embed` to prepare `pad_embeds`.
This issue was not triggered by the official examples because the difference in token counts between their batches is only 1. I therefore increased the difference in token counts between the two examples to reproduce the bug.
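A minimal sketch of the shape mismatch and the fix, using made-up shapes (`batch`, `seq_len`, `hidden_dim`, and `n_pad` are illustrative values, not taken from the repository):

```python
import torch

# Hypothetical shapes (names follow the PR; values are made up):
# pad_embed:     a single padding-token embedding, shape (1, 1, hidden_dim)
# inputs_embeds: a batch of token embeddings,      shape (batch, seq_len, hidden_dim)
batch, seq_len, hidden_dim = 2, 5, 8
pad_embed = torch.zeros(1, 1, hidden_dim)
inputs_embeds = torch.randn(batch, seq_len, hidden_dim)

n_pad = 3  # padding positions needed to reach the target length

# Buggy version: repeating along dim 0 gives shape (n_pad, 1, hidden_dim);
# its size in every dimension except dim 1 must match inputs_embeds for
# torch.cat(..., dim=1), but dim 0 is n_pad vs. batch, so cat raises the
# RuntimeError quoted above.
bad_pad_embeds = pad_embed.repeat(n_pad, 1, 1)  # (3, 1, hidden_dim)

# Fixed version: repeat along dim 1 (the sequence dimension) and keep dim 0
# aligned with the batch size, so the concatenation along dim 1 succeeds.
pad_embeds = pad_embed.repeat(batch, n_pad, 1)          # (2, 3, hidden_dim)
padded = torch.cat([pad_embeds, inputs_embeds], dim=1)  # (2, 8, hidden_dim)
print(padded.shape)  # torch.Size([2, 8, 8])
```

With the difference in token counts fixed at 1, `n_pad` happens to be 1 and the buggy repeat produces the same shape as the correct one, which is why the official examples never hit this error.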