Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.
Apache License 2.0

Extra key in ucf101.pt #56

Open wang-muhan opened 3 months ago

wang-muhan commented 3 months ago

When I load the pretrained ucf101 checkpoint, the following error appears: RuntimeError: Error(s) in loading state_dict for Latte: Unexpected key(s) in state_dict: "y_embedder.embedding_table.weight". All the other checkpoints generate videos successfully. Can you check this? Thank you.

maxin-cn commented 3 months ago

Hi, can you confirm whether the extras parameter is set to 2 when you use the ucf101 pre-trained model? See: https://github.com/Vchitect/Latte/blob/9ededbe590a5439b6e7013d00fbe30e6c9b674b8/configs/ucf101/ucf101_sample.yaml#L14
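A minimal sketch of how this kind of error arises, assuming the model was constructed without the class-label embedder (for example, with a different extras value) while the checkpoint contains it. The helper name `diff_state_dict_keys` and every key name other than `y_embedder.embedding_table.weight` are hypothetical and used only for illustration.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, the two categories that
    strict state_dict loading reports as errors."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # in model, not in checkpoint
    unexpected = sorted(checkpoint_keys - model_keys)   # in checkpoint, not in model
    return missing, unexpected


# Hypothetical key names; only the y_embedder key is taken from the error message.
unconditional_model = ["x_embedder.proj.weight", "t_embedder.mlp.0.weight"]
ucf101_checkpoint = unconditional_model + ["y_embedder.embedding_table.weight"]

missing, unexpected = diff_state_dict_keys(unconditional_model, ucf101_checkpoint)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, the same comparison can be run on `model.state_dict().keys()` and the keys of the loaded checkpoint to see exactly which parameters the configuration did not create.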

wang-muhan commented 3 months ago

Do you have pretrained models for unconditional generation?

maxin-cn commented 3 months ago

All pre-trained models are for unconditional generation, except ucf101 and t2v.

wang-muhan commented 3 months ago

Thanks. How can I control which class is sampled? Is it "sample_names" in ucf101_sample.yaml? I couldn't find where this value is referenced in your code.

maxin-cn commented 3 months ago

You can change the class label here: https://github.com/Vchitect/Latte/blob/9ededbe590a5439b6e7013d00fbe30e6c9b674b8/sample/sample.py#L90
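A rough sketch of what choosing class labels could look like, assuming the sampler takes one integer class index per video and that UCF101 classes are indexed 0 to 100. The helper `build_class_labels` is hypothetical and is not taken from sample.py. The optional null label equal to the number of classes follows a common classifier-free guidance convention and is also an assumption here.

```python
UCF101_NUM_CLASSES = 101  # UCF101 contains 101 action classes


def build_class_labels(class_ids, num_classes=UCF101_NUM_CLASSES, with_null=False):
    """Validate class indices and optionally append one 'null' label per sample.

    The null label (index == num_classes) is an assumed convention for
    classifier-free guidance, not something confirmed by this thread.
    """
    for c in class_ids:
        if not 0 <= c < num_classes:
            raise ValueError(f"class id {c} is outside [0, {num_classes})")
    labels = list(class_ids)
    if with_null:
        labels += [num_classes] * len(class_ids)
    return labels


# One video per requested class index.
labels = build_class_labels([0, 5, 100])
print(labels)
```

Checking the range up front avoids an out-of-range lookup in the label embedding table later in sampling.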

wang-muhan commented 3 months ago

What is your training loss after convergence? I want to reproduce the results but ran into some problems.

maxin-cn commented 3 months ago

The training loss decreases quickly and then oscillates around a value.