Closed by CrazyElements 3 months ago
if self.ortho_matrix is None or iter % self.update_proj_gap == group_idx:
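To make the effect of that condition concrete, here is a minimal sketch that simulates which parameter groups would recompute their projection at each step. It is not the repository's code: `groups_due_for_update`, `simulate`, and the `has_matrix` list (standing in for `self.ortho_matrix is not None` per group) are hypothetical names used only for illustration.

```python
def groups_due_for_update(step, num_groups, update_proj_gap, has_matrix):
    """Return the group indices whose projection would be recomputed at `step`.

    Mirrors the quoted condition; `has_matrix[i]` is assumed bookkeeping that
    plays the role of `self.ortho_matrix is not None` for group i.
    """
    due = []
    for group_idx in range(num_groups):
        if not has_matrix[group_idx] or step % update_proj_gap == group_idx:
            due.append(group_idx)
    return due


def simulate(num_steps, num_groups, update_proj_gap):
    """Run the schedule for `num_steps` steps and return the due groups per step."""
    has_matrix = [False] * num_groups
    schedule = []
    for step in range(num_steps):
        due = groups_due_for_update(step, num_groups, update_proj_gap, has_matrix)
        for group_idx in due:
            has_matrix[group_idx] = True  # the group now holds a projection
        schedule.append(due)
    return schedule


if __name__ == "__main__":
    for step, due in enumerate(simulate(num_steps=8, num_groups=3, update_proj_gap=4)):
        print(step, due)
```

Under these assumptions, every group still updates together at step 0 because none has a projection yet, and only later refreshes are spread out. A group whose `group_idx` is at least `update_proj_gap` would never refresh again under this exact condition, so comparing against `group_idx % self.update_proj_gap` may be worth considering.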
Update: staggering probably doesn’t even affect the peak. I would assume the peaks aren’t problematic because each parameter group is handled independently, so memory can be reused. It would be great to quantify that somehow.

Hi @RobertBiehl, I understand what you said. Thank you for your response!
Impressive and insightful work, hooray to the authors! I recently read your paper, but I'm confused about the following parts.
Sorry if these are stupid questions. Thank you for your time and consideration.