mengxujing closed this issue 6 months ago
I trained the entire network and found that the loss curve converged, with a flatter loss landscape, only on the first task. In the later incremental steps, the loss curves for the new tasks did not converge, i.e. the model failed to fit the new tasks.