Open prachigarg23 opened 1 year ago
For question number 1, I ran into the same issue. Did you solve it? I think the transfer model.prompt.prompt[cur_idx] = model.prompt.prompt[prev_idx] is not actually needed: during training, the model exclusively selects the prompts inside the prompt pool that belong to the current task number, and at inference it automatically selects the closest top-k prompts from the pool.
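In other words, prompt selection could look roughly like this (a minimal sketch, not the repo's exact code; `select_prompts`, `prompt_keys` and the shapes are my own naming):

```python
import torch
import torch.nn.functional as F

def select_prompts(query, prompt_keys, prompt_pool, top_k=5, task_idx=None):
    """Illustrative top-k prompt selection (not the repo's exact code).

    query:       [B, D]    frozen feature used as the query (e.g. pretrained [CLS])
    prompt_keys: [P, D]    learnable key per prompt in the pool
    prompt_pool: [P, L, D] learnable prompt tokens
    task_idx:    optional 1-D tensor of prompt indices assigned to the current
                 task during training; if None, fall back to key matching.
    """
    if task_idx is not None:
        # training with a task-exclusive slice of the pool
        idx = task_idx.expand(query.size(0), -1)                 # [B, top_k]
    else:
        # inference: pick the closest keys by cosine similarity
        sim = F.cosine_similarity(query.unsqueeze(1),
                                  prompt_keys.unsqueeze(0), dim=-1)  # [B, P]
        idx = sim.topk(top_k, dim=-1).indices                    # [B, top_k]
    batched_prompts = prompt_pool[idx]                           # [B, top_k, L, D]
    return batched_prompts.flatten(1, 2), idx                    # prepend to the token sequence
```

During training one would pass something like task_idx = torch.arange(task_id * top_k, (task_id + 1) * top_k); at inference task_idx stays None and the keys decide.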
Hi @JH-LEE-KR, thanks for this amazing PyTorch implementation of L2P. I have the following doubts about the code:
1. In engine.py > train_and_evaluate(), there is a step commented "Transfer previous learned prompt params to the new prompt", implemented as model.prompt.prompt[cur_idx] = model.prompt.prompt[prev_idx]. I am confused about this: the top_k prompts used for any task will be overlapping, since there aren't enough dedicated (mutually exclusive) prompts for each task. So why are we shifting the prompt weights from prev_idx to cur_idx?
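For reference, my understanding is that this block roughly follows the pattern below (a paraphrased sketch, not a verbatim copy of engine.py; the slice arithmetic and the prompt_key line are my assumptions):

```python
import torch

def warm_start_task_prompts(model, task_id, top_k=5):
    """Paraphrased sketch of the per-task prompt transfer (not verbatim engine.py).

    If each task trains an exclusive slice of the pool of size top_k,
    task t uses prompts [t*top_k : (t+1)*top_k]. Before training task t,
    that slice is initialized from the slice learned for task t-1.
    """
    if task_id == 0:
        return  # nothing to transfer for the first task

    prev_idx = slice((task_id - 1) * top_k, task_id * top_k)
    cur_idx = slice(task_id * top_k, (task_id + 1) * top_k)

    with torch.no_grad():
        # copy the previously learned prompt parameters into the new task's slice
        model.prompt.prompt[cur_idx] = model.prompt.prompt[prev_idx]
        # (assumption) the per-prompt keys could be warm-started the same way
        model.prompt.prompt_key[cur_idx] = model.prompt.prompt_key[prev_idx]
```

If that reading is right, the copy only makes sense when prompts are assigned per task rather than fully shared, which is exactly what I want to confirm.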
Based on my understanding, if the prompt pool size is 10, those 10 prompts are common/shared across all tasks, and in every training batch the top k (5) prompts chosen by the query function get updated. Kindly help me understand this.

2. Regarding the usage of train_mask and class_mask: does L2P not initialize its own classifier for every new task (covering the union of all classes seen up to that task)? Then why do we need to mask out certain classes just before the loss computation?
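For context, my mental model of what the masking step does is something like the sketch below (illustrative only; the function name and the way task_classes is passed are my assumptions, not the repo's API):

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, task_classes):
    """Illustrative masking of a shared classifier head (not the repo's exact code).

    logits:       [B, num_total_classes] from one head covering all classes
    targets:      [B] labels in the global class index space
    task_classes: list of class indices belonging to the current task
    """
    # mark classes NOT in the current task and push their logits to -inf,
    # so the softmax (and the gradient) only involves the current task's classes
    not_in_task = torch.ones(logits.size(-1), dtype=torch.bool, device=logits.device)
    not_in_task[task_classes] = False
    masked_logits = logits.masked_fill(not_in_task, float('-inf'))
    return F.cross_entropy(masked_logits, targets)
```

If that is what train_mask/class_mask are doing, then with a single head over all classes the mask restricts the loss to the current task's classes, and I'd like to confirm why that is preferred over a per-task classifier.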