Open huofushuo opened 1 year ago
I am also confused. What are the experiment settings for Figure 3 in this paper?
I think it's because, although the computation graph is simplified, the method introduces more learnable parameters in the final task head. I also ran some experiments and found that it performs slightly worse than methods such as adapters, with a gap of about 6% from the best result in my experiments.
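To make the parameter argument concrete, here is a rough back-of-the-envelope sketch. It assumes (my reading, not something confirmed in this thread) that the VQT head is a single linear layer over the final CLS feature concatenated with the query-token outputs from every layer, while a VPT-style head is a linear layer over the CLS feature only. The function names and the example sizes (hidden size 768, 12 layers, 10 queries per layer, 100 classes) are hypothetical and chosen only for illustration.

```python
def linear_head_params(feature_dim: int, num_classes: int) -> int:
    """Weights plus biases of a single linear classification layer."""
    return feature_dim * num_classes + num_classes


def vpt_style_head_params(hidden_dim: int, num_classes: int) -> int:
    """Assumed VPT-style head: linear layer on the final CLS feature only."""
    return linear_head_params(hidden_dim, num_classes)


def vqt_style_head_params(hidden_dim: int, num_layers: int,
                          queries_per_layer: int, num_classes: int) -> int:
    """Assumed VQT-style head: linear layer on the CLS feature concatenated
    with the query-token outputs collected from every layer."""
    feature_dim = hidden_dim + num_layers * queries_per_layer * hidden_dim
    return linear_head_params(feature_dim, num_classes)


if __name__ == "__main__":
    # Hypothetical ViT-B-like sizes, for illustration only.
    vpt = vpt_style_head_params(hidden_dim=768, num_classes=100)
    vqt = vqt_style_head_params(hidden_dim=768, num_layers=12,
                                queries_per_layer=10, num_classes=100)
    print(f"VPT-style head params: {vpt}")
    print(f"VQT-style head params: {vqt}")
    print(f"ratio: {vqt / vpt:.1f}x")
```

Under these assumptions the head alone grows by roughly two orders of magnitude. This only counts head parameters; it does not by itself explain the GPU memory numbers, which also depend on activations and optimizer state.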
Hello, excellent work! I ran VQT. Why is the GPU memory usage not reduced compared to VPT? Thanks, and I hope for an answer!