Closed ArEnSc closed 2 years ago
Hi!
We have seen some success in using Prompt Tuning as a way to mitigate catastrophic forgetting. For example, in this paper we find that Prompt Tuning can help in cross-lingual zero-shot generation, where the model's tendency to output English (after training on English) even when the input is in a different language is a type of over-fitting.
I don't exactly know what you mean by predicting the next task. Generally you know what task is being done, so you can decide which prompt to swap in. For example, in the paper above we have "factored prompts", where a collection of language prompts and a collection of task prompts are combined for new (language, task) pairs. That isn't OSS yet, but we have a similar method in the extended section of the codebase where there are shared prompt parameters used for all tasks plus task-specific parameters. Which task-specific parameters to use is selected based on the current task, and they are trained jointly with the shared parameters.
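The shared-plus-task-specific setup described above can be sketched in PyTorch roughly as follows (this is a hypothetical illustration of the idea, not the codebase's actual implementation; all names and sizes are made up):

```python
import torch
import torch.nn as nn

class SharedTaskPrompt(nn.Module):
    """Sketch: shared prompt tokens plus per-task prompt tokens,
    concatenated and prepended to the input embeddings. Both the shared
    and task-specific parameters receive gradients, so they train jointly."""
    def __init__(self, num_tasks, shared_len=10, task_len=10, d_model=64):
        super().__init__()
        self.shared = nn.Parameter(torch.randn(shared_len, d_model) * 0.02)
        self.task = nn.Parameter(torch.randn(num_tasks, task_len, d_model) * 0.02)

    def forward(self, input_embeds, task_id):
        # input_embeds: (batch, seq, d_model); task_id selects the task prompt
        batch = input_embeds.size(0)
        shared = self.shared.unsqueeze(0).expand(batch, -1, -1)
        task = self.task[task_id].unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([shared, task, input_embeds], dim=1)

prompt = SharedTaskPrompt(num_tasks=3)
x = torch.randn(2, 5, 64)
out = prompt(x, task_id=1)
# out has shape (2, 10 + 10 + 5, 64)
```

The frozen model then consumes `out` in place of the original embeddings; only the prompt parameters are updated.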
If you don't want to make a hard, explicit distinction between the prompts used for different tasks, you could probably use an approach like this paper, where the prompt is a function of the input (although I think it would have made more sense if they had mixed an instance-specific prompt and a shared prompt together with a learned gate). You could also do something like have a huge number of prompt tokens (e.g. 1000) and then do some sort of attention between the input and them to reduce/select them down to a reasonable number that is then used as the prompt. Something like this should allow you to have prompts that combine shared and task-specific information (without you needing to know the task).
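One way to sketch that attention-over-a-prompt-pool idea (entirely hypothetical; the pool size, number of learned queries, and conditioning scheme are all illustrative choices, not anything from the papers above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePromptPool(nn.Module):
    """Sketch: a large pool of prompt tokens is reduced to a short prompt
    via attention, conditioned on the input, so the selected prompt can mix
    shared and task-specific information without an explicit task id."""
    def __init__(self, pool_size=1000, prompt_len=20, d_model=64):
        super().__init__()
        self.pool = nn.Parameter(torch.randn(pool_size, d_model) * 0.02)
        # learned queries that attend over the pool to produce prompt_len tokens
        self.queries = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        self.to_q = nn.Linear(d_model, d_model)

    def forward(self, input_embeds):
        # input_embeds: (batch, seq, d_model); condition queries on the input mean
        ctx = input_embeds.mean(dim=1, keepdim=True)             # (batch, 1, d)
        q = self.to_q(self.queries.unsqueeze(0) + ctx)           # (batch, prompt_len, d)
        scores = q @ self.pool.t() / self.pool.size(-1) ** 0.5   # (batch, prompt_len, pool)
        weights = F.softmax(scores, dim=-1)
        prompt = weights @ self.pool                             # (batch, prompt_len, d)
        return torch.cat([prompt, input_embeds], dim=1)

pool = AttentivePromptPool()
x = torch.randn(2, 7, 64)
out = pool(x)
# out has shape (2, 20 + 7, 64)
```

A soft attention mixture like this keeps everything differentiable; a hard top-k selection over the pool would be the alternative if you wanted discrete prompt reuse.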
If you are having trouble implementing anything in this codebase let me know and I can try to help point you to how it would be done.
@blester125
Thank you for the response and your and the team's amazing work. I am surprised there are not more frameworks implementing this as a staple of NLP!
To clarify: I am trying to develop an autonomous agent that can take user queries and apply the correct prompt for each query. Generally in this scenario there would be a higher-level classifier that guides the model to the correct prompt by predicting it from the query.
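That classifier-routing idea could look something like the following minimal sketch (the module, its sizes, and the mean-pooled query encoding are all hypothetical; in practice the classifier would be trained on labeled query/task pairs):

```python
import torch
import torch.nn as nn

class PromptRouter(nn.Module):
    """Sketch: a small classifier predicts a task id from a pooled query
    representation, then the matching soft prompt is selected for that query."""
    def __init__(self, num_tasks, prompt_len=10, d_model=64):
        super().__init__()
        self.classifier = nn.Linear(d_model, num_tasks)
        self.prompts = nn.Parameter(torch.randn(num_tasks, prompt_len, d_model) * 0.02)

    def forward(self, query_embeds):
        # query_embeds: (batch, seq, d_model) from some frozen encoder
        pooled = query_embeds.mean(dim=1)                   # (batch, d_model)
        task_id = self.classifier(pooled).argmax(dim=-1)    # (batch,)
        return self.prompts[task_id]                        # (batch, prompt_len, d_model)

router = PromptRouter(num_tasks=4)
q = torch.randn(3, 6, 64)
prompt = router(q)
# prompt has shape (3, 10, 64)
```

Note the hard `argmax` makes the routing decision non-differentiable; at training time you would either train the classifier separately or use a soft mixture over prompts instead.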
I am going to try this paper to see if it works; it seems like the kind of promising direction I was looking for:
https://homes.cs.washington.edu/~akari/papers/attempt_preprint.pdf
I will also take a look at that paper you referenced!
Thanks for sharing that reference! I'm glad someone finally did that; I've been thinking along the lines of that paper for a while but never had the time!
@blester125 Does soft prompt tuning imply that we can avoid the catastrophic forgetting that occurs in multitask settings, by using a classifier or a series of action codes to predict which task's prompt to use next? Let me know what you think and what you have seen. I am still working through a way to do soft prompt tuning with PyTorch and experiment with it.