[Usage] tokenizer.pad_token_id == None?

Describe the issue

Issue:

Environment:

GPU: 8×A100-80G or others?
Deepspeed version:
Torch version:
Transformers version:
Tokenizers version:

Command:

I ran the pretraining script: https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/main/scripts/v1/qwen/pretrain.sh

Log:

MoE-LLaVA/moellava/train/train.py", line 1076, in __call__
2024-04-18T04:05:18.004375414Z     input_ids = torch.nn.utils.rnn.pad_sequence(
2024-04-18T04:05:18.004377127Z   File "/opt/conda/envs/llava1.5/lib/python3.10/site-packages/torch/nn/utils/rnn.py", line 399, in pad_sequence
2024-04-18T04:05:18.004379772Z     return torch._C._nn.pad_sequence(sequences, batch_first, padding_value)
2024-04-18T04:05:18.004382177Z TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType
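
The traceback shows `pad_sequence` receiving `None` as `padding_value`, which matches the title: the tokenizer has no `pad_token_id` set. One possible workaround is to resolve an integer pad id before the collator runs. The sketch below is only an illustration: `resolve_pad_token_id` is a hypothetical helper, not part of MoE-LLaVA or Transformers, and falling back to `eos_token_id` is an assumption that may not suit the Qwen tokenizer used by this script.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_pad_token_id(tokenizer: Any, fallback: Optional[int] = None) -> int:
    """Return an integer that can be passed as padding_value.

    Hypothetical helper (not from MoE-LLaVA). Order of preference:
    the tokenizer's own pad id, then its EOS id, then an explicit fallback.
    """
    pad_id = getattr(tokenizer, "pad_token_id", None)
    if pad_id is not None:  # compare with None so that a pad id of 0 is kept
        return int(pad_id)

    eos_id = getattr(tokenizer, "eos_token_id", None)
    if eos_id is not None:  # assumption: reusing EOS for padding is acceptable
        return int(eos_id)

    if fallback is not None:
        return int(fallback)

    raise ValueError(
        "tokenizer has neither pad_token_id nor eos_token_id; "
        "pass an explicit fallback id"
    )


# Stand-in object with the same attribute names as a Hugging Face tokenizer,
# used here so the example runs without downloading a model.
fake_tokenizer = SimpleNamespace(pad_token_id=None, eos_token_id=2)
pad_id = resolve_pad_token_id(fake_tokenizer)
print(pad_id)
```

With a real tokenizer, the returned value would be passed as `padding_value=` in the `torch.nn.utils.rnn.pad_sequence` call at `train.py` line 1076, so the call no longer sees `None`. If padding is done with the EOS id, the padded positions should still be masked out through the attention mask.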

PKU-YuanGroup / MoE-LLaVA

[Usage] tokenizer.pad_token_id == None? #69
