Hi,
Thanks for your great work! I have a question about Figure 4 in the paper.
The paper states, "For a prompt with shape L×D, we treat it as L prompts with dimension D." In Figure 4, each point represents a prompt vector of dimension D=768.
However, both the general and the task-specific prompts are inserted into multiple layers of the model. Which layer's prompts are visualized in Figure 4? (Judging by the number of points, it seems that only the prompts from a single layer are shown.) Is there a particular reason for choosing that layer?
Looking forward to your reply and thanks in advance!