Closed: willdzeng closed this issue 1 year ago
Hi,

I can't seem to find where ACE freezes the backbone during training. Since test time uses the same pre-trained encoder, it would presumably be better to freeze the backbone during training so that it doesn't change, right?

Or is leaving the backbone unfrozen intentional?

NVM, I think I know why the training function doesn't need to freeze the backbone: the training buffer is created with the encoder in `eval()` mode and under `no_grad()` before training starts, so during the actual training of the MLP head, no gradients ever reach the encoder's parameters. Thanks, closing the issue.
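For anyone landing here later, here is a minimal self-contained sketch of that pattern. The module definitions, shapes, and hyperparameters are my own placeholders, not ACE's actual code; it only illustrates why precomputing the feature buffer under `no_grad()` makes explicit freezing unnecessary:

```python
import torch
import torch.nn as nn

# Stand-ins for ACE's pretrained backbone and scene-specific MLP head
# (the real model definitions differ; these are illustrative only).
encoder = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
head = nn.Linear(32, 3)  # predicts a 3D scene coordinate per feature

images = torch.randn(8, 1, 16, 16)     # synthetic input batch
targets = torch.randn(8 * 16 * 16, 3)  # synthetic coordinate targets

# 1) Build the training buffer with the encoder in eval() mode under
#    no_grad(): no autograd graph is recorded for the backbone.
encoder.eval()
with torch.no_grad():
    feats = encoder(images)                             # (8, 32, 16, 16)
    buffer = feats.permute(0, 2, 3, 1).reshape(-1, 32)  # one row per pixel

# 2) Train only the head on the buffered features; the optimizer never
#    sees the encoder's parameters, and the buffer is detached anyway.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.MSELoss()
for _ in range(3):
    loss = criterion(head(buffer), targets)
    optimizer.zero_grad()
    loss.backward()  # gradients flow into the head only
    optimizer.step()

# The backbone cannot change: it ran outside autograd and is not optimized.
assert all(p.grad is None for p in encoder.parameters())
```

Since the encoder's forward pass happens entirely outside autograd, its outputs are detached by construction; even without calling `requires_grad_(False)` on the backbone, the head's `backward()` has no path back into the encoder.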