Open MSunDYY opened 10 months ago
I've found the reason: it comes from the following line: self.running_var = (1. - self.momentum) * self.running_var + self.momentum * var_.view(-1)
The autograd graph of var_ keeps getting attached to self.running_var without ever being released, so my next question is: why did you write it this way?
So I added "with torch.no_grad():" around that update, since I think the gradient of var_ does not need to be tracked there. Looking forward to your reply, thanks!
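For reference, here is a minimal sketch of the pattern I mean (the class name, shapes, and forward logic are my own assumptions for illustration, not the actual code in this repo): updating running_var while autograd is recording keeps a reference to each iteration's graph, whereas wrapping the update in torch.no_grad() (or using var_.detach()) releases it.

```python
import torch
import torch.nn as nn

class MaskBN(nn.Module):
    """Toy BN-like layer; names and shapes are assumptions, only for illustration."""
    def __init__(self, num_features, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        # A buffer is not a Parameter, but assigning a tensor that requires grad
        # to it still keeps that tensor's autograd graph alive.
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):                      # x: (N, num_features)
        var_ = x.var(dim=0, unbiased=False)    # depends on x, so it carries a grad graph
        if self.training:
            # Without no_grad(), running_var would chain together the graphs of
            # every past iteration, so CUDA memory grows step by step.
            with torch.no_grad():
                self.running_var = (1. - self.momentum) * self.running_var \
                                   + self.momentum * var_.view(-1)
        return x / torch.sqrt(var_ + 1e-5)
```

Using var_.detach().view(-1) in the update instead of the no_grad() block should have the same effect.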
CUDA memory keeps increasing as the iterations go on, about 2 MB per iteration on average. I found that it is caused by the mask_bn layer. Can you explain the reason?
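In case it helps to reproduce, this is roughly how the per-iteration growth can be measured (the model, data, and loop below are placeholders, not the real training code, so the numbers will differ from the ~2 MB/iteration above):

```python
import torch

# Placeholder model and loop purely to show the measurement.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(8, 8).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(10):
    x = torch.randn(4, 8, device=device)
    loss = model(x).sum()
    loss.backward()
    opt.step()
    opt.zero_grad()
    if device == "cuda":
        # Allocated CUDA memory after this iteration, in MiB.
        print(step, torch.cuda.memory_allocated() / 2**20, "MiB allocated")
```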