Status: Closed (closed by azj-n 8 months ago)
Hello, will you be adding the efficient Aligned-MTL-UB version, which uses gradients with respect to the shared representation instead of the full parameter gradients? Thank you.
Sure. Adding --rep_grad to the training command computes the gradients with respect to the shared representation instead of the shared parameters.
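To make the difference concrete, here is a minimal framework-free sketch. It is not the repository's implementation: the toy linear encoder, the squared-error task losses, and every name in it (rep_grad, param_grad, heads) are assumptions chosen for illustration. It shows that the per-task gradient with respect to the shared representation z has only as many entries as z, while the gradient with respect to the shared parameters W follows from it by the chain rule and is as large as W.

```python
# Toy illustration with hypothetical names (not the repository's code).
# Shared encoder: z = W x.  Task k loss: L_k = 0.5 * (a_k . z - y_k)^2

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def rep_grad(a, z, y):
    """dL/dz for one task: (a.z - y) * a, one entry per representation dim."""
    r = dot(a, z) - y
    return [r * ai for ai in a]

def param_grad(a, z, y, x):
    """dL/dW for one task via the chain rule: outer(dL/dz, x)."""
    gz = rep_grad(a, z, y)
    return [[g * xi for xi in x] for g in gz]

W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]           # shared parameters (2 x 3)
x = [1.0, 2.0, 3.0]              # one input
z = matvec(W, x)                 # shared representation, 2 entries
heads = [([1.0, 1.0], 0.0),      # (a_k, y_k) for task 0
         ([1.0, -1.0], 2.0)]     # (a_k, y_k) for task 1

grads_z = [rep_grad(a, z, y) for a, y in heads]       # 2 numbers per task
grads_W = [param_grad(a, z, y, x) for a, y in heads]  # 6 numbers per task

print(grads_z)  # [[6.0, 6.0], [6.0, -6.0]]
```

In a real network the shared parameters vastly outnumber the representation dimensions, so a weighting method that only needs per-task gradients can work on much smaller vectors when they are taken with respect to z.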
I see, thank you.