DINet trains five models: Syncnet, Frame64, Frame128, Frame256 and Clip256. If I want to fine-tune starting from a set of base (pretrained) models, which of these models need to be fine-tuned?
Syncnet is relatively independent, so it can be fine-tuned directly from the base Syncnet model.
The Frame models use a coarse-to-fine training scheme. My first thought was that Frame64 could be fine-tuned directly from the Frame256 base model. Later I found that net_g is the same in the Frame models and in Clip256, so I suspect it may be better to fine-tune Frame64 from the Clip256 base model, so that more of the information in the base model is retained?
Clip256 needs to be trained from scratch, based on Syncnet and Frame256.
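Before relying on the "net_g is the same" observation, it could be worth checking that the two generators really have identical parameter names and shapes. Below is a minimal sketch of such a check, not anything taken from the DINet code. The helper `diff_shapes` and the example shapes are hypothetical. With PyTorch, I assume the real shape maps could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}` from each loaded net_g state dict.

```python
from typing import Dict, List, Tuple

# Maps a parameter name to its tensor shape.
Shapes = Dict[str, Tuple[int, ...]]


def diff_shapes(a: Shapes, b: Shapes) -> List[str]:
    """Return readable differences between two parameter-shape maps.

    An empty list means every parameter name and shape matches, so
    weights from one checkpoint could initialize the other model.
    """
    problems = []
    for name in sorted(set(a) | set(b)):
        if name not in a:
            problems.append(f"only in second: {name}")
        elif name not in b:
            problems.append(f"only in first: {name}")
        elif a[name] != b[name]:
            problems.append(f"shape mismatch: {name} {a[name]} vs {b[name]}")
    return problems


# Hypothetical shapes for illustration only; these are NOT read from
# real DINet checkpoints.
frame_g = {"conv.weight": (64, 3, 7, 7), "conv.bias": (64,)}
clip_g = {"conv.weight": (64, 3, 7, 7), "conv.bias": (64,)}
other_g = {"conv.weight": (32, 3, 7, 7)}

if __name__ == "__main__":
    print(diff_shapes(frame_g, clip_g))   # no differences expected
    print(diff_shapes(frame_g, other_g))  # lists each mismatch
```

If the list comes back empty for the Frame and Clip256 generators, initializing Frame64 from the Clip256 weights should at least load cleanly. Whether it actually trains better is a separate question.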
I'm not sure whether my understanding is correct, so any discussion is welcome.