Closed lemon-prog123 closed 1 year ago
To use pretrained weight from image model, we proposed dual-channel attention in our paper. It's a small modification in the transformer structure and can be applied in both autoregressive and non-autoregressive settings. You can directly try this in the classification task, and freeze one of the channels to a image classification model.
Thank you for your advice. I'll try it then.
Hi ! I've read your paper. It's really a interesting job. Now I'm interested in the method you use in using pretrained weight from image model. I also want to try this method in my task. But It seems that your architecture is designed for autoregressive task, but I want to use it in a video classification task.
I wonder that would you like to give me some advice in finding a proper way to use image model's pretrained weight in a video task of transformer architecture.