Closed TitleZ99 closed 2 years ago
Hi, we think this observation suggests that prompt-tuning may be better suited to transformer-based pre-trained vision models on the 24 evaluated tasks, compared to the other parameter-efficient methods we included.
We also tried to find the nearest image patches to the learned prompt embeddings, but we couldn't find any semantically meaningful results.
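For reference, a nearest-patch lookup of this kind can be sketched as follows. This is a minimal illustration, not the code used in the paper: the shapes, the random data, and the names `prompts` and `patches` are all hypothetical stand-ins for the learned prompt embeddings and the image-patch embeddings.

```python
import numpy as np

# Hypothetical setup: P learned prompt embeddings and N image-patch
# embeddings, assumed to live in the same D-dimensional space.
rng = np.random.default_rng(0)
prompts = rng.standard_normal((5, 768))    # P=5 prompts, D=768
patches = rng.standard_normal((196, 768))  # N=196 patches (14x14 grid)

def nearest_patches(prompts, patches):
    """For each prompt, return the index of the most similar patch
    under cosine similarity."""
    p = prompts / np.linalg.norm(prompts, axis=1, keepdims=True)
    q = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    sims = p @ q.T            # (P, N) cosine similarities
    return sims.argmax(axis=1)

print(nearest_patches(prompts, patches))  # one patch index per prompt
```

With real embeddings one would then visualize the retrieved patches; on random data the indices are of course meaningless.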
Close the issue for now. Feel free to re-open if you have other questions!
Thanks for this wonderful work. The paper contains a lot of details, but I still want to know why the learned visual prompts can achieve such good performance, even outperforming Full fine-tuning. I am confused about this and wonder if you can help clarify it.