Non-consecutive added token '<extra_id_99>' found.

Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.

Apache License 2.0


Closed: heatingma closed this issue 7 months ago

heatingma commented 7 months ago

When execution reached the following code:

text_encoder = T5EncoderModel.from_pretrained(
    args.pretrained_model_path, subfolder="text_encoder", 
    torch_dtype=torch.float16
).to(device)

the following error occurred:

ValueError: 
Non-consecutive added token '<extra_id_99>' found.
Should have index 32100 but has index 32000 in saved vocabulary.

What causes this? Is the t2v_required_models/tokenizer/spiece.model file on Hugging Face outdated?

heatingma commented 7 months ago

Sorry, the code that actually raises the error is:

tokenizer = T5Tokenizer.from_pretrained(
    args.pretrained_model_path, 
    subfolder="tokenizer"
)

heatingma commented 7 months ago

I resolved this issue by upgrading the transformers library from 4.31.0 to 4.37.0.
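
For anyone hitting the same error, here is a minimal sketch for checking the installed transformers version before loading the tokenizer. The 4.37.0 threshold is only the version reported to work in this thread, not a confirmed minimum, and the helper names (parse_version, is_at_least, check_transformers) are made up for illustration.

```python
from importlib import metadata

# Assumption: 4.37.0 is simply the version the reporter upgraded to;
# the thread does not establish which release first fixed the problem.
MIN_VERSION = "4.37.0"


def parse_version(text):
    """Turn '4.37.0' or '4.37.0.dev0' into a tuple of at least three integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    # Pad short versions such as '4.37' so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_at_least(installed, required=MIN_VERSION):
    """Return True if the installed version is not older than the required one."""
    return parse_version(installed) >= parse_version(required)


def check_transformers():
    """Report whether the installed transformers package meets MIN_VERSION."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if is_at_least(installed):
        return f"transformers {installed} is at or above {MIN_VERSION}"
    return (
        f"transformers {installed} is older than {MIN_VERSION}; "
        "consider running: pip install -U transformers"
    )


if __name__ == "__main__":
    print(check_transformers())
```

If the check reports an older version, upgrading with pip install -U transformers and re-running the T5Tokenizer.from_pretrained call is the step that worked here.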