lucidrains / magvit2-pytorch

Implementation of MagViT2 Tokenizer in Pytorch

Why is magvitv2 different from the description in the paper? Am I understanding it wrong? #40

Open hefeicyp opened 6 months ago

hefeicyp commented 6 months ago

Why is magvitv2 different from the description in the paper? Am I understanding it wrong?

hefeicyp commented 5 months ago

Yes. I implemented it as closely to the paper's description as possible, and the results look normal.

sen-ye commented 5 months ago

Hello, I'm interested in how you implemented the group norm described in the MAGVIT-v2 paper. Did you apply group norm directly to the video tensor?
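For readers following along, here is a minimal PyTorch sketch of what "directly applying group norm to a video tensor" could mean, next to a per-frame alternative. The tensor shape, channel count, and group count are made up for illustration, and neither option is claimed to be what the paper or any of the linked repositories actually does.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: batch 2, 8 channels, 4 frames, 16x16 pixels.
video = torch.randn(2, 8, 4, 16, 16)  # (B, C, T, H, W)

# Assumed configuration: 8 channels split into 4 groups.
norm = nn.GroupNorm(num_groups=4, num_channels=8)

# Option A: apply GroupNorm directly to the 5D tensor.
# nn.GroupNorm accepts input of shape (N, C, *), so the statistics are computed
# per sample over each channel group and over all of T, H, W jointly.
out_direct = norm(video)

# Option B: fold time into the batch dimension so that every frame is
# normalized independently, then restore the original layout.
b, c, t, h, w = video.shape
frames = video.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w)
out_per_frame = norm(frames).reshape(b, t, c, h, w).permute(0, 2, 1, 3, 4)

print(out_direct.shape, out_per_frame.shape)
```

Both calls run without error and return tensors of the same shape as the input. The difference is only in which elements share a mean and variance: the whole clip in option A, a single frame in option B.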

hefeicyp commented 5 months ago

Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2

sen-ye commented 5 months ago

Thanks~

shinshiner commented 4 months ago

> Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2

This implementation still seems to differ substantially from the original paper, and it only uses a small model to train an image tokenizer. Since you said you implemented it yourself, what are your evaluation results, e.g. ImageNet/UCF101 reconstruction? Is there any chance to get in touch with you? @hefeicyp

Jason3900 commented 4 days ago

> Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2
>
> This implementation still seems to differ substantially from the original paper, and it only uses a small model to train an image tokenizer. Since you said you implemented it yourself, what are your evaluation results, e.g. ImageNet/UCF101 reconstruction? Is there any chance to get in touch with you? @hefeicyp

Hey, we've implemented a version that's almost perfectly aligned with the original paper. You can check it for more details. https://github.com/cofe-ai/O2-MAGVIT2