Bug in PLE

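When running PLE with the aliccp MTL example, I hit the following problem. In the forward pass, the comment says each tower should output one value per sample: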

        for ple_out, tower, predict_layer in zip(ple_outs, self.towers, self.predict_layers):
            tower_out = tower(ple_out)  #[batch_size, 1]

However, the actual shape of tower_out is [batch_size, 8].

So I wonder whether this part of the PLE model initialization

        self.towers = nn.ModuleList(
            MLP(expert_params["dims"][-1], output_layer=False, **tower_params_list[i]) for i in range(self.n_task))

should be changed to

        self.towers = nn.ModuleList(
            MLP(expert_params["dims"][-1], output_layer=True, **tower_params_list[i]) for i in range(self.n_task))
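
The shape difference can be reproduced in isolation with a minimal sketch. The `MLP` class below is a simplified stand-in written for illustration, not the torch-rechub implementation, and the tower setting `dims=[8]` is an assumption inferred from the reported `[batch_size, 8]` shape; `batch_size` and `expert_dim` are arbitrary values.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    """Simplified stand-in for the library's MLP (illustrative assumption only)."""

    def __init__(self, input_dim, output_layer=True, dims=None):
        super().__init__()
        layers = []
        for dim in dims or []:
            layers += [nn.Linear(input_dim, dim), nn.ReLU()]
            input_dim = dim
        if output_layer:
            # Final projection to a single value per sample.
            layers.append(nn.Linear(input_dim, 1))
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):
        return self.mlp(x)


batch_size, expert_dim = 4, 16          # arbitrary sizes for the demo
ple_out = torch.randn(batch_size, expert_dim)
tower_params = {"dims": [8]}            # assumed tower config, not taken from the example script

tower_without_head = MLP(expert_dim, output_layer=False, **tower_params)
tower_with_head = MLP(expert_dim, output_layer=True, **tower_params)

shape_without_head = tuple(tower_without_head(ple_out).shape)
shape_with_head = tuple(tower_with_head(ple_out).shape)
print(shape_without_head)  # (4, 8): last hidden dimension, no scalar head
print(shape_with_head)     # (4, 1): one value per sample
```

Under these assumptions, `output_layer=False` leaves the tower output at the last hidden width, which would match the shape observed above, while `output_layer=True` gives the `[batch_size, 1]` shape the comment expects.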

@morningsky Please correct me if I'm wrong.

datawhalechina / torch-rechub

Bug in PLE #52