mihirp1998 / VADER

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various reward models such as HPS, PickScore, VideoMAE, VJEPA, YOLO, Aesthetics etc.
https://vader-vid.github.io/

Incompatibility between compression reward model .pt and parameters #16

Closed CIntellifusion closed 4 days ago

CIntellifusion commented 5 days ago

The state dict of reward_model.pt contains the following entries:

    layers.0.weight  torch.Size([512, 768])
    layers.0.bias    torch.Size([512])
    layers.3.weight  torch.Size([128, 512])
    layers.3.bias    torch.Size([128])
    layers.6.weight  torch.Size([32, 128])
    layers.6.bias    torch.Size([32])
    layers.9.weight  torch.Size([1, 32])
    layers.9.bias    torch.Size([1])
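As a quick sanity check, the reported entries imply four Linear layers (at Sequential indices 0, 3, 6, 9) with the dimension chain 768 → 512 → 128 → 32 → 1. A minimal sketch, encoding the shapes above and deriving that chain:

```python
# Reported (out_features, in_features) shapes for each Linear layer,
# keyed by their indices in the nn.Sequential container.
reported = {
    "layers.0.weight": (512, 768), "layers.0.bias": (512,),
    "layers.3.weight": (128, 512), "layers.3.bias": (128,),
    "layers.6.weight": (32, 128),  "layers.6.bias": (32,),
    "layers.9.weight": (1, 32),    "layers.9.bias": (1,),
}

# Derive the layer-to-layer dimensions: input dim from the first weight's
# second axis, then each layer's output dim in order.
dims = [reported["layers.0.weight"][1]] + [
    reported[f"layers.{i}.weight"][0] for i in (0, 3, 6, 9)
]
print(dims)  # [768, 512, 128, 32, 1]
```

So the checkpoint contains a 4-Linear-layer head, not the 5-layer MLP defined in the repository code.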

However, compression_scorer.py defines a 5-layer MLP: https://github.com/VideoVerses/VideoTuna/blob/c12a04ea5d0b4f5e69b41f960df8267911c41b61/src/lvdm/models/rlhf_utils/compression_scorer.py#L39

I customized the model architecture as follows:

[image: customized model architecture]
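The attached image is not reproduced here, but a PyTorch module whose state dict matches the reported keys and shapes might look like the following sketch. The module names at indices 0, 3, 6, 9 suggest two non-parametric modules (e.g. activation plus dropout) between each Linear layer; that choice, and the dropout rate, are assumptions, not the authors' confirmed training-time architecture.

```python
import torch
import torch.nn as nn


class CompressionRewardMLP(nn.Module):
    """Sketch of a head matching the reported reward_model.pt shapes.

    The Linear layers land at Sequential indices 0, 3, 6, 9, so their
    state-dict keys (layers.0.weight, layers.3.weight, ...) line up with
    the checkpoint. The ReLU/Dropout modules between them hold no
    parameters, so they do not appear in the state dict.
    """

    def __init__(self, input_dim: int = 768, dropout: float = 0.2):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 512),              # layers.0
            nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(512, 128),                    # layers.3
            nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(128, 32),                     # layers.6
            nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(32, 1),                       # layers.9
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)


model = CompressionRewardMLP()
for name, param in model.state_dict().items():
    print(name, tuple(param.shape))
```

With matching keys and shapes, `model.load_state_dict(torch.load("reward_model.pt", map_location="cpu"))` should then succeed (the path is illustrative).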

With this I get rewards roughly in the range [20, 50]. Does this align with your results?

Or should I open a PR to fix this?

Thanks for your work.

QinOwen commented 4 days ago

Hi, thanks for pointing this out. Yes, this is consistent with our results. We modified and retrained the compression reward model several times with different architectures, which accidentally introduced an inconsistency between the released weights and this code. Please feel free to explore more suitable compression reward designs; in my experience, the performance of reward_model.pt is not very good.

CIntellifusion commented 4 days ago

Thanks for your reply. I have reproduced your results on VC2 with the aesthetic reward, which look fine.
