chufanchen opened 7 months ago
Regularization-based: computes importance values for parameters (or for their gradients) on previous tasks, and adds a regularization term to the loss that restricts changes to the important parameters
Regularization-based methods have difficulty preventing catastrophic forgetting (CF)
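A minimal sketch of the regularization idea, assuming a quadratic penalty weighted by per-parameter importance (an EWC-style form chosen for illustration; the note does not name a specific method). The function name, `lam`, and the flat-list parameter layout are hypothetical.

```python
def regularized_loss(task_loss, params, old_params, importance, lam=1.0):
    """Add an importance-weighted quadratic penalty to the new-task loss.

    params      -- current parameter values (flat list of floats)
    old_params  -- values saved after training on previous tasks
    importance  -- per-parameter importance for previous tasks (assumed given)
    lam         -- penalty strength (hypothetical hyperparameter)
    """
    penalty = sum(
        w * (p - q) ** 2
        for p, q, w in zip(params, old_params, importance)
    )
    return task_loss + 0.5 * lam * penalty


# The first parameter is important but unchanged, so it adds no penalty;
# the second parameter moved by 2.0 and is penalized by its importance 0.5.
total = regularized_loss(1.0, [1.0, 2.0], [1.0, 0.0], [10.0, 0.5], lam=2.0)
```

Because the penalty only discourages changes rather than blocking them, important parameters can still drift over many tasks, which is one way to read the note that these methods struggle to prevent CF.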
Memory-based: keeps a small memory buffer of data from previous tasks and replays it while learning a new task to prevent CF
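A rough sketch of such a buffer, assuming reservoir sampling as the policy for deciding which examples to keep (an assumption made here for illustration; the note does not say how the buffer is filled). The class and its method names are hypothetical.

```python
import random


class ReplayBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep each offered example with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples to mix into a new-task batch."""
        return self.rng.sample(self.items, min(k, len(self.items)))


buf = ReplayBuffer(capacity=5)
for i in range(100):
    buf.add(("task_0", i))
replayed = buf.sample(3)
```

During training on a new task, each batch would be combined with `buf.sample(k)` so that the loss also covers old data.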
Parameter isolation: learns a mask that selects a sub-network for each task within a shared network, e.g. HAT, SupSup
These methods have poor knowledge transfer (KT)
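A simplified illustration of hard masking at the parameter level, assuming a boolean flag per parameter that marks it as claimed by an earlier task. This is not the actual HAT or SupSup procedure; the function and argument names are hypothetical.

```python
def hard_masked_update(params, grads, used_by_old_tasks, lr=0.1):
    """Apply one SGD step, skipping every parameter claimed by an old task."""
    return [
        p if used else p - lr * g
        for p, g, used in zip(params, grads, used_by_old_tasks)
    ]


# The first parameter belongs to an old task's sub-network and stays fixed;
# only the second one is free to learn the new task.
new_params = hard_masked_update([1.0, 1.0], [1.0, 1.0], [True, False], lr=0.5)
```

Freezing claimed parameters entirely protects old tasks, but it also means the new task cannot adapt them.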
SPG: the importance of a parameter to a task is computed based on its gradient
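A minimal sketch of gradient-based importance, assuming the scores are obtained by scaling absolute gradients into [0, 1] by the largest magnitude. That normalization, and the use of the scores to scale down later updates, are assumptions for illustration and may differ from what the linked paper does; all names are hypothetical.

```python
def gradient_importance(grads):
    """Map per-parameter gradients to importance scores in [0, 1]."""
    largest = max(abs(g) for g in grads)
    if largest == 0.0:
        return [0.0 for _ in grads]
    return [abs(g) / largest for g in grads]


def soft_masked_update(params, grads, importance, lr=0.1):
    """SGD step where each gradient is scaled by (1 - importance)."""
    return [
        p - lr * (1.0 - w) * g
        for p, g, w in zip(params, grads, importance)
    ]


# Importance from an old task's gradients, then one step on a new task.
old_task_grads = [2.0, -4.0, 0.0]
importance = gradient_importance(old_task_grads)
updated = soft_masked_update([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], importance, lr=0.5)
```

Unlike a hard mask, a partially important parameter here still receives a reduced update, so it remains shared between old and new tasks.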
https://arxiv.org/abs/2306.14775
https://github.com/UIC-Liu-Lab/SPG