PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models
https://arxiv.org/abs/2401.15947
Apache License 2.0
1.9k stars 121 forks

[Usage] tokenizer.pad_token_id == None? #69

Open sjtu-cz opened 4 months ago

sjtu-cz commented 4 months ago

Describe the issue

Issue:

Environment:

GPU: 8×A100-80G or others?
Deepspeed version:
Torch version:
Transformers version:
Tokenizers version:

Command:

I ran the pretraining script: https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/main/scripts/v1/qwen/pretrain.sh

Log:

MoE-LLaVA/moellava/train/train.py", line 1076, in __call__
2024-04-18T04:05:18.004375414Z     input_ids = torch.nn.utils.rnn.pad_sequence(
2024-04-18T04:05:18.004377127Z   File "/opt/conda/envs/llava1.5/lib/python3.10/site-packages/torch/nn/utils/rnn.py", line 399, in pad_sequence
2024-04-18T04:05:18.004379772Z     return torch._C._nn.pad_sequence(sequences, batch_first, padding_value)
2024-04-18T04:05:18.004382177Z TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType

zhaozhipeng1997 commented 4 months ago

Following https://github.com/haotian-liu/LLaVA/issues/1167, set self.tokenizer.pad_token_id = -100.
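
The traceback shows that `pad_sequence` rejects a `padding_value` of `None`, which is what the collator passes when the tokenizer has no pad token. A minimal sketch of a guard that resolves a usable pad id before padding is shown below. It uses only the standard library so it runs anywhere: `resolve_pad_token_id` and `pad_batch` are hypothetical helpers (not part of MoE-LLaVA, transformers, or torch), the `SimpleNamespace` object stands in for a real tokenizer, and the ids are made up for illustration.

```python
from types import SimpleNamespace


def resolve_pad_token_id(tokenizer, fallback_id):
    """Return tokenizer.pad_token_id, or fallback_id when it is unset (None).

    Hypothetical helper, not part of MoE-LLaVA or transformers.
    The explicit `is None` check keeps a legitimate pad id of 0.
    """
    pad_id = getattr(tokenizer, "pad_token_id", None)
    if pad_id is None:
        pad_id = fallback_id
    return pad_id


def pad_batch(sequences, padding_value):
    """Pure-Python stand-in for right-padding a batch to equal length."""
    if padding_value is None:
        # Mirrors the failure in the log: padding needs a number, not None.
        raise TypeError("padding_value must be a number, not None")
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [padding_value] * (max_len - len(seq)) for seq in sequences]


# Stand-in for a tokenizer without a pad token (assumption for this demo).
tokenizer = SimpleNamespace(pad_token_id=None, eos_token_id=7)

pad_id = resolve_pad_token_id(tokenizer, fallback_id=tokenizer.eos_token_id)
batch = pad_batch([[1, 2, 3], [4]], padding_value=pad_id)
print(batch)
```

The fallback here is `eos_token_id` purely for illustration. The suggestion in this thread corresponds to passing `fallback_id=-100` instead; which value is appropriate depends on how the padded ids are consumed downstream.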