lucidrains / magvit2-pytorch

Implementation of MagViT2 Tokenizer in Pytorch
MIT License
541 stars, 35 forks

Why is magvitv2 different from the description in the paper? Am I understanding it wrong? #40

Open hefeicyp opened 4 months ago

hefeicyp commented 4 months ago

Why is magvitv2 different from the description in the paper? Am I understanding it wrong?

hefeicyp commented 3 months ago

Yes. I implemented it as described in the paper, as closely as possible. The results look normal.

sen-ye commented 3 months ago

Hello, I'm interested in how you implemented the group norm from the MAGVIT-V2 paper. Did you apply group norm directly to a video tensor?

hefeicyp commented 3 months ago

Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2

sen-ye commented 3 months ago

Thanks~

shinshiner commented 3 months ago

> Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2

This implementation still seems to differ considerably from the original paper, and it only uses a small model to train an image tokenizer. Since you said you implemented it yourself, what are the evaluation results, e.g. ImageNet/UCF-101 reconstruction? Is there any chance I could get in touch with you? @hefeicyp