Open lt573152873 opened 1 year ago
during training, Occu_mask whether there is gradient return? when occu_mask is zeros, the loss is small;
if need occu_mask -> occu_mask.detach() ?
during training, Occu_mask whether there is gradient return? when occu_mask is zeros, the loss is small;
if need occu_mask -> occu_mask.detach() ?