PKU-YuanGroup / Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Apache License 2.0

Sample Error #77

Open luo3300612 opened 4 months ago

luo3300612 commented 4 months ago

Two days ago, I trained a DiT-XL with the following command:

torchrun --nproc_per_node=8 src/train.py \
  --model DiT-XL/122 \
  --vae ucf101_stride4x4x4 \
  --data-path ./UCF-101 --num-classes 101 \
  --sample-rate 2 --num-frames 8 --max-image-size 128 --clip-grad-norm 1 \
  --epochs 14000 --global-batch-size 64 --lr 1e-4 \
  --ckpt-every 4000 --log-every 1000 \
  --results-dir ./exp1

Today, I tried to sample a video with:

python opensora/sample/sample.py \
  --model DiT-XL/122 --ae ucf101_stride4x4x4 \
  --ckpt ./exp1/000-DiT-XL-122/checkpoints/0012000.pt --extras 1 \
  --fps 10 --num-frames 16 --image-size 256

However, I got:

    model.load_state_dict(state_dict)
  File "/root/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DiT:
        Unexpected key(s) in state_dict: "y_embedder.embedding_table.weight".

Thank you for taking the time to look into this issue. I look forward to your response.

LinB203 commented 4 months ago

Fixed that. Use --extras 1 to avoid it.

https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/main/opensora/models/diffusion/dit/dit.py#L239
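As a workaround for an already-saved checkpoint, the mismatched key can also be dropped from the state dict before loading. This is a minimal sketch with a toy nn.Linear standing in for the real DiT model; the key name is the one from the traceback above, everything else is illustrative:

```python
import torch
import torch.nn as nn

# Toy stand-in for the real DiT model (hypothetical simplification)
model = nn.Linear(4, 4)

# Simulate a checkpoint that carries an extra class-conditioning key,
# as in the RuntimeError above
state_dict = model.state_dict()
state_dict["y_embedder.embedding_table.weight"] = torch.zeros(101, 4)

# Dropping the unexpected key before loading avoids the RuntimeError
state_dict.pop("y_embedder.embedding_table.weight", None)
model.load_state_dict(state_dict)
```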

junwenxiong commented 4 months ago

It seems that sample.py ignores the attention_mask required by the DiT forward function.

  File "/mnt/workspace/Text-to-Video/Open-Sora-Plan/opensora/models/diffusion/diffusion/respace.py", line 130, in __call__
    return self.model(x, new_ts, **kwargs)
TypeError: forward() missing 1 required positional argument: 'attention_mask'

https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/f1542802351a3df5c9c66732db2d265d9e49c525/opensora/sample/sample.py#L80
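For reference, respace.py forwards whatever is in kwargs to the model, so the TypeError means attention_mask was never put into that dict. A minimal sketch of the failure mode and fix, with a toy function standing in for DiT.forward (hypothetical simplification):

```python
import torch

# Toy stand-in for DiT.forward, which takes attention_mask as a
# required argument (hypothetical simplification of the real model)
def forward(x, t, attention_mask):
    return x * attention_mask

x, t = torch.randn(1, 4), torch.zeros(1)

# respace.py calls self.model(x, new_ts, **kwargs); if sample.py never
# puts attention_mask into kwargs, that call raises the TypeError above.
# Supplying it (here an all-ones mask, i.e. attend everywhere) avoids it:
kwargs = {"attention_mask": torch.ones(1, 4)}
out = forward(x, t, **kwargs)
```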

LinB203 commented 4 months ago

It seems that sample.py ignores the attention_mask required by the DiT forward function.

  File "/mnt/workspace/Text-to-Video/Open-Sora-Plan/opensora/models/diffusion/diffusion/respace.py", line 130, in __call__
    return self.model(x, new_ts, **kwargs)
TypeError: forward() missing 1 required positional argument: 'attention_mask'

https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/f1542802351a3df5c9c66732db2d265d9e49c525/opensora/sample/sample.py#L80

Fixed that.

xinyuxiao commented 4 months ago

If you have inference results, how is their quality? Can you show some cases?

LinB203 commented 3 months ago

If you have inference results, how is their quality? Can you show some cases?

See https://github.com/PKU-YuanGroup/Open-Sora-Plan/tree/main?tab=readme-ov-file#sampling