JiazuoYu / MoE-Adapters4CL

Code for the paper "Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters" (CVPR 2024)

About CIL #9

Closed · boringKey closed this issue 1 month ago

boringKey commented 2 months ago

This is truly excellent work. However, I have a question. In the class-incremental setting, how does using only one router combined with two experts address catastrophic forgetting? It seems the experts can only retain knowledge of the current classes and cannot store knowledge of previous ones. I look forward to your explanation.

JiazuoYu commented 1 month ago

Hi, thanks for your attention to our work. For class-incremental learning (CIL) tasks, when only one router and two experts are used without a frozen activation strategy, repeatedly updating the same network parameters does indeed lead to forgetting. However, we were surprised to find that as the number of experts increases, the router tends to learn specific expert combinations for similar classes. We believe this reduces the parameter interference that arises when classes with very different distributions are trained jointly within the same group of experts, thereby alleviating forgetting. This finding also indirectly demonstrates that mixture-of-experts models have some inherent advantages in handling data from different distributions and in incremental learning tasks.
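
To illustrate the routing behavior described above, here is a minimal sketch of a router over a pool of adapter experts with top-k softmax gating. It is a generic example with hypothetical names (`MoEAdapter`, `bottleneck`, `top_k`), not the repository's actual implementation; it only shows how a router can learn per-sample expert combinations, so that similar classes reuse the same experts while dissimilar ones are handled by different experts.

```python
# Hypothetical sketch of a mixture-of-experts adapter; not the code from this repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEAdapter(nn.Module):
    def __init__(self, dim: int, num_experts: int = 4, bottleneck: int = 64, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small bottleneck adapter (down-project, nonlinearity, up-project).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(), nn.Linear(bottleneck, dim))
            for _ in range(num_experts)
        )
        # The router scores each expert from the input feature.
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim). Route each sample to its top-k experts.
        logits = self.router(x)                         # (batch, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # pick k experts per sample
        weights = F.softmax(weights, dim=-1)            # normalize mixing weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return x + out                                  # residual adapter output
```

With more experts, samples from similar classes tend to be routed to the same small subset, so updates for a new task concentrate on a few experts and disturb the others less.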