Closed liujiawei2333 closed 2 years ago
Sorry, I am not aware of this issue; it may be that the train_loader is somehow re-initialized here. Since only a small amount of data is needed for BN calibration, you could manually save a few batches during the training epoch to avoid calling train_loader here.
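For what it's worth, the suggestion above can be sketched in plain Python. This is a minimal illustration, not the repo's actual code: `cache_calibration_batches` and `num_batches` are hypothetical names, and any iterable of batches (e.g. a PyTorch DataLoader) works the same way.

```python
from itertools import islice

def cache_calibration_batches(train_loader, num_batches=8):
    """Save a handful of batches from an existing iterator so that BN
    calibration can reuse them without re-creating the loader."""
    return [batch for batch in islice(iter(train_loader), num_batches)]

# Toy stand-in for a DataLoader: a list of small batches.
loader = [[i, i + 1] for i in range(100)]
calib_batches = cache_calibration_batches(loader, num_batches=4)
# Iterate over calib_batches during BN statistics recalibration
# instead of calling iter(train_loader) again.
```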
Thank you for your answer! I also have a question about the learning rate. I noticed that in lines 30 through 32 of https://github.com/facebookresearch/AttentiveNAS/blob/main/solver/lr_scheduler.py you didn't use BigNAS's learning rate strategy, which is cosine decay ending with a constant phase at 5% of the initial learning rate. Why is that? Also, the condition `if self.last_epoch > self.warmup_iters` seems never to be satisfied; what is its significance?
In our observations, the typical SGD + cosine decay setting actually works quite well.
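To make the two schedules being compared concrete, here is a minimal sketch (not the repo's implementation; `cosine_lr` and its parameters are illustrative names): with `min_ratio=0.0` it is the plain SGD + cosine decay the authors use, and with `min_ratio=0.05` it approximates the BigNAS-style floor at 5% of the initial learning rate.

```python
import math

def cosine_lr(step, total_steps, base_lr, warmup_steps=0, min_ratio=0.0):
    """Cosine-decayed learning rate with optional linear warmup.

    min_ratio=0.0  -> plain cosine decay to zero.
    min_ratio=0.05 -> BigNAS-style constant ending at 5% of base_lr.
    """
    if step < warmup_steps:
        # Linear warmup from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
    # Clamp the decayed value at the floor given by min_ratio.
    return base_lr * max(min_ratio, cosine)
```

For example, with `base_lr=1.0` and `total_steps=100`, the plain schedule reaches 0 at the end, while `min_ratio=0.05` holds the final learning rate at 0.05.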
Hello! Thank you for your excellent work! I ran into a problem with steadily increasing memory usage (host memory, not GPU memory) while training the supernet. I tracked it down to lines 62 through 68 of https://github.com/facebookresearch/AttentiveNAS/blob/main/evaluate/attentive_nas_eval.py; if I delete this code, the memory usage stays flat. Have you encountered this problem?