Hi @1259666087,

We first use the following function to wrap the teacher model into a multi-branch structure:
https://github.com/HaitaoWen/CLearning/blob/ce0789a40bda9e566a1e0432d3ac320937ca48f0/scheme/replay/mtd/mtd.py#L779

Then we find the appropriate multiple teachers during the class-balanced finetuning procedure:
https://github.com/HaitaoWen/CLearning/blob/ce0789a40bda9e566a1e0432d3ac320937ca48f0/scheme/replay/mtd/mtd.py#L979

Then we save this model in the `memory` variable:
https://github.com/HaitaoWen/CLearning/blob/ce0789a40bda9e566a1e0432d3ac320937ca48f0/scheme/replay/mtd/mtd.py#L112

Finally, we use the model stored in `memory` for distillation during the new task:
https://github.com/HaitaoWen/CLearning/blob/ce0789a40bda9e566a1e0432d3ac320937ca48f0/scheme/replay/mtd/mtd.py#L238
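In pseudocode, the overall flow is roughly the following. This is only a minimal sketch of the four steps linked above; the names `MultiBranchTeacher`, `select_teachers`, `distillation_loss`, and `teacher_ids`, as well as the choice to average the selected branches, are illustrative assumptions and not the actual identifiers or loss used in mtd.py.

```python
# Hypothetical sketch of the multi-teacher flow; names are placeholders,
# not the repository's actual API.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiBranchTeacher(nn.Module):
    """Wrap a trained teacher: several copies of its classification head form
    parallel branches on top of a shared feature extractor."""

    def __init__(self, features: nn.Module, head: nn.Module, num_branches: int = 3):
        super().__init__()
        self.features = features  # shared feature extractor (typically frozen)
        self.branches = nn.ModuleList(
            [copy.deepcopy(head) for _ in range(num_branches)]
        )

    def forward(self, x):
        z = self.features(x)
        # one prediction per branch -> multiple candidate teachers
        return [branch(z) for branch in self.branches]


def select_teachers(teacher, balanced_loader, keep=2):
    """Score each branch on class-balanced replay data during finetuning and
    keep the `keep` best-performing branches as teachers (hypothetical criterion)."""
    scores = None
    teacher.eval()
    with torch.no_grad():
        for x, y in balanced_loader:
            outputs = teacher(x)
            if scores is None:
                scores = [0.0] * len(outputs)
            for i, logits in enumerate(outputs):
                scores[i] += (logits.argmax(1) == y).float().mean().item()
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:keep]


def distillation_loss(student_logits, x, memory, T=2.0):
    """KL distillation against the average of the selected teacher branches,
    using the model previously stored in `memory` (illustrative loss)."""
    teacher = memory["teacher"]
    with torch.no_grad():
        branch_logits = teacher(x)
        target = torch.stack(
            [branch_logits[i] for i in memory["teacher_ids"]]
        ).mean(0)
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(target / T, dim=1),
        reduction="batchmean",
    ) * (T * T)


# After class-balanced finetuning on task t, keep the teacher for task t+1:
# memory = {"teacher": multi_branch_teacher,
#           "teacher_ids": select_teachers(multi_branch_teacher, balanced_loader)}
```

This only illustrates the order of the steps; the actual branch construction, teacher-selection criterion, and distillation loss are defined at the lines linked above in mtd.py.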
Thank you for your answer. I also want to ask whether you have done any LUCIR+MTD experiments; there is only mtd_lucir_imagenet1000.yaml in the code.
We provided the results of LUCIR+MTD in Section 11.1 of the supplementary material for comparison with DT-CIL [1], an existing dual-teacher distillation method.
[1] Choi, Yoojin, Mostafa El-Khamy, and Jungwon Lee. "Dual-teacher class-incremental learning with data-free generative replay." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
Hello, could you please explain how your multi-teacher method is applied in the experiments with PODNet and AFC? This does not seem to be clearly indicated in your code.