Hi,
I went through your paper and found it very interesting. One question I have is whether you did any ablation studies on the role the prompt plays.
Your model currently uses a template: The sentence of "X" means [MASK]. — and you take the [MASK] token's representation as the input to the last layer.
The traditional way, however, is to use X's representation directly (e.g., the [CLS] vector) for downstream tasks.
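To make sure I understand the difference, here is a minimal sketch of the two ways of extracting a sentence embedding with Hugging Face transformers. The choice of bert-base-uncased and the variable names are just my assumptions for illustration, not your actual code; only the template wording is taken from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentence = "A quick brown fox jumps over the lazy dog."

# (1) Traditional way: encode the raw sentence and take the [CLS] vector.
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    cls_embedding = model(**inputs).last_hidden_state[:, 0]  # [CLS] is at position 0

# (2) Prompt-based way: wrap the sentence in the template and take the
#     last-layer hidden state at the [MASK] position.
template = f'The sentence of "{sentence}" means {tokenizer.mask_token}.'
inputs = tokenizer(template, return_tensors="pt")
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    mask_embedding = model(**inputs).last_hidden_state[0, mask_pos]

print(cls_embedding.shape, mask_embedding.shape)  # both are 768-dimensional
```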
I wonder if you have any insights into the benefits of reformulating the traditional approach into PromptBERT's prompt-based one?
Many thanks!