Open zhouchang123 opened 1 month ago
Q1: How are the scores obtained with GPT-Neo-2.7B? By calculating the task metric score for each concatenation of prompt + testing input; see Section 3.2.
Q2: At which step are the prompts labeled positive or negative: after the scores are computed, or after encoding and before scoring?
After getting the scores. Among all the scored prompts for a training example, we label the prompt with the highest score as the positive. For negative samples, we randomly sample B training demonstrations from the prompt pool; in addition, we label the B demonstrations with the lowest scores among the sampled prompts as hard negatives. Details are in Section 3.2.
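The labeling step described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the function name `label_prompts`, the `(prompt_id, score)` layout, and the choice to exclude the positive from both negative sets are assumptions made for the example.

```python
import random

def label_prompts(scored, pool, b, seed=0):
    """Label prompts for one training example.

    scored: list of (prompt_id, score) pairs already scored by the LM (hypothetical layout).
    pool:   list of all prompt ids in the prompt pool.
    b:      number of random negatives and of hard negatives.
    """
    # Rank the scored prompts from highest to lowest task metric score.
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    positive = ranked[0][0]

    # Hard negatives: the b lowest-scoring prompts among the scored ones.
    # Assumption: the positive is never reused as a negative.
    tail = ranked[max(len(ranked) - b, 0):]
    hard_negatives = [pid for pid, _ in tail if pid != positive]

    # Random negatives: b prompts sampled from the whole pool.
    rng = random.Random(seed)
    candidates = [pid for pid in pool if pid != positive]
    random_negatives = rng.sample(candidates, min(b, len(candidates)))

    return positive, random_negatives, hard_negatives


scored = [("p1", 0.9), ("p2", 0.1), ("p3", 0.5), ("p4", 0.2)]
pool = ["p1", "p2", "p3", "p4", "p5", "p6"]
positive, random_negatives, hard_negatives = label_prompts(scored, pool, b=2)
```

In this toy run `p1` becomes the positive and `p4`, `p2` become the hard negatives; the random negatives depend on the seed.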
What about the score from the prompt retriever? Is it the similarity between the two vectors produced by the encoder? Thanks very much.
You may refer to Section 3.4 to see how we get the score after tuning the prompt retriever.
Section 3.4 describes the inference part, right? Is it the same in the training pipeline?
Training is covered in Section 3.3; you may refer to the provided code as well.
Section 3.3 only introduces sim(x, p). Do you mean sim(x, p) is the score?
Yes, sim(x, p) is the score.
In the paper, the number of positive prompts is 1 and the number of negative prompts is 20, but the total number of prompts in one training epoch is not stated. What happens to prompts that are neither positive nor negative? InfoNCE does not seem to include them.
Yes, InfoNCE would not consider the prompts that are neither positive nor negative.
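To make this concrete, here is a minimal sketch of an InfoNCE-style loss over one positive and a list of negatives. It assumes sim(x, p) is a plain dot product of the encoded vectors and uses no temperature; both are assumptions for illustration, and the names `sim` and `info_nce` are hypothetical, not taken from the provided code.

```python
import math

def sim(x_vec, p_vec):
    # Assumed similarity: dot product of the two encoded vectors.
    return sum(a * b for a, b in zip(x_vec, p_vec))

def info_nce(x_vec, positive, negatives):
    """Negative log-probability of the positive among positive + negatives."""
    logits = [sim(x_vec, positive)] + [sim(x_vec, n) for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


x = [1.0, 0.0]
loss = info_nce(x, positive=[1.0, 0.0], negatives=[[0.0, 1.0]])
```

Only the vectors passed in as `positive` or `negatives` enter the logits, so a prompt that is in neither set contributes nothing to the loss for that example.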
I am confused about the training and inference pipelines. In the training pipeline, the input includes the task name and the query, and the metric takes the task into account. At inference, however, the input is only the query, without the task name. So could you add a module that classifies the query's task name, so that the prompts are first filtered by task name and then retrieved? @cdxeve
We do not input the task name during training; the task name in the image is only for ease of understanding. You may refer to the formula in Section 3.2 for details.
I looked at the file prompt_pool.json, and each dict is annotated with a task name. So is the task name only used to decide which metric is used for scoring? The natural approach when retrieving would be to retrieve from the prompts of similar tasks rather than from all the prompts.
Q1: Is the task name only used to decide which metric is used for scoring?
A1: We keep the task name in the metadata to support many potential uses, but we don’t include it as input during training.
Q2: The natural approach when retrieving would be to retrieve from the prompts of similar tasks rather than from all the prompts.
A2: You could try this for a quick test, but I think the diversity will be too constrained, since the number of tasks is much smaller than the number of demonstrations in the prompt pool.
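For anyone who wants to run that quick test, a rough sketch of retrieval with an optional task filter might look like this. The entry fields `id`, `task_name`, and `embedding` are hypothetical and may not match the actual layout of prompt_pool.json; the dot-product ranking is likewise an assumption.

```python
def retrieve(query_vec, pool, k, task_name=None):
    """Return the ids of the top-k pool entries, optionally restricted to one task."""
    # With task_name=None, every prompt in the pool is a candidate.
    candidates = [e for e in pool if task_name is None or e["task_name"] == task_name]

    def score(entry):
        # Assumed similarity: dot product of query and prompt embeddings.
        return sum(a * b for a, b in zip(query_vec, entry["embedding"]))

    ranked = sorted(candidates, key=score, reverse=True)
    return [e["id"] for e in ranked[:k]]


pool = [
    {"id": "a", "task_name": "nli", "embedding": [1.0, 0.0]},
    {"id": "b", "task_name": "qa", "embedding": [0.9, 0.1]},
    {"id": "c", "task_name": "nli", "embedding": [0.0, 1.0]},
]
all_tasks = retrieve([1.0, 0.0], pool, k=2)
same_task = retrieve([1.0, 0.0], pool, k=2, task_name="nli")
```

In this toy pool the unfiltered call returns `a` and `b`, while the filtered call is forced to return the weaker match `c` because `b` belongs to another task.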