shidingz closed this issue 7 months ago.
Hi, this error occurs because images_aux is passed in as a list rather than a tensor. Please check it. If you want to pass an image sequence as input, please modify the implementation.
I later found out that it was because my batch size (bs) was set to 1. When bs is 1, images_aux in the dataset stays a list, because torch.stack is only called when bs is greater than 1.
Thanks for the report. We fixed this issue in the current version: torch.stack is now also applied when the batch size is 1.
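For anyone hitting this on an older checkout, here is a minimal sketch of the idea behind the fix, not the repository's actual code. The helper name collate_images_aux and the "image_aux" key are hypothetical, and it assumes each sample carries one auxiliary image tensor. The point is that the stack happens for any batch size, so the vision tower receives a tensor rather than a list.

```python
import torch


def collate_images_aux(instances):
    """Hypothetical collate helper: batch the auxiliary images.

    Assumes each instance is a dict with an "image_aux" tensor
    (the key name is an assumption made for this sketch).
    """
    images_aux = [inst["image_aux"] for inst in instances]
    # Stack whenever all tensors share a shape, including batch size 1,
    # so downstream code can call tensor methods such as .to() on the result.
    if all(x.shape == images_aux[0].shape for x in images_aux):
        return torch.stack(images_aux)
    # Differently shaped images cannot be stacked; the caller has to
    # handle the list case explicitly.
    return images_aux


# Batch size 1 still yields a tensor with a leading batch dimension.
batch = collate_images_aux([{"image_aux": torch.zeros(3, 8, 8)}])
```

A plain Python list has no .to method, which is what the AttributeError in the traceback is reporting. Stacking first avoids that.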
Traceback (most recent call last):
File "/checkpoint/binary/train_package/minigemini/train/train_mem.py", line 14, in <module>
train(attn_implementation="flash_attention_2")
File "/checkpoint/binary/train_package/minigemini/train/train.py", line 1262, in train
trainer.train()
File "/root/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1624, in train
return inner_training_loop(
File "/root/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/root/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2902, in training_step
loss = self.compute_loss(model, inputs)
File "/root/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2925, in compute_loss
outputs = model(**inputs)
File "/opt/conda/envs/python3.8.13/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.local/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/root/.local/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1833, in forward
loss = self.module(*inputs, **kwargs)
File "/opt/conda/envs/python3.8.13/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/checkpoint/binary/train_package/minigemini/model/language_model/mini_gemini_gemma.py", line 87, in forward
) = self.prepare_inputs_labels_for_multimodal(
File "/checkpoint/binary/train_package/minigemini/model/mini_gemini_arch.py", line 328, in prepare_inputs_labels_for_multimodal
image_features = self.encode_images(images, images_aux)
File "/checkpoint/binary/train_package/minigemini/model/mini_gemini_arch.py", line 255, in encode_images
image_aux_features_raw = self.get_model().get_vision_tower_aux()(images_aux).to(
AttributeError: 'list' object has no attribute 'to'