Closed 1396066796 closed 4 months ago
Hi @1396066796
Is it your guess, or did you confirm that training that uses `AuxiliaryModelWrapper` does not detect it?
When using `AuxiliaryModelWrapper`, you should use `DataParallel` or `DistributedDataParallel` inside the wrapper, so the current implementation should be correct.
Feel free to reopen this issue if you confirm it is not the case.
P.S. Please follow the bug report template when you open an issue.
https://github.com/yoshitomo-matsubara/torchdistill/blob/1fe3088e64fd39f23bc031720939a6dcbfec08a6/torchdistill/core/distillation.py#L418
When using distributed training (e.g. with `DataParallel`), the code uses `isinstance` to check whether the model is an instance of `AuxiliaryModelWrapper`. This check does not work: the object passed to `isinstance` is always the `DataParallel` or `DistributedDataParallel` wrapper, so the condition is never true and the check has no effect.
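To make the two positions in this thread concrete, here is a minimal sketch of how `isinstance` behaves under each wrapping order. It deliberately avoids importing torch or torchdistill: `ParallelWrapper`, `Backbone`, `is_aux_wrapper`, and `unwrap_parallel` are hypothetical stand-ins, and the `AuxiliaryModelWrapper` class below is a toy with the same name, not the library's class. The only behavior borrowed from PyTorch is that `DataParallel` and `DistributedDataParallel` keep the wrapped model in a `.module` attribute.

```python
class Backbone:
    """Hypothetical stand-in for any model."""


class AuxiliaryModelWrapper:
    """Toy stand-in for torchdistill's AuxiliaryModelWrapper (not the real class)."""

    def __init__(self, inner):
        # The attribute name `inner` is an assumption made for this sketch.
        self.inner = inner


class ParallelWrapper:
    """Hypothetical stand-in for DataParallel / DistributedDataParallel.

    Like the real PyTorch wrappers, it exposes the wrapped model as `.module`.
    """

    def __init__(self, module):
        self.module = module


def is_aux_wrapper(model):
    # The kind of check discussed in this issue: looks only at the outermost object.
    return isinstance(model, AuxiliaryModelWrapper)


def unwrap_parallel(model):
    # Peel off one parallel wrapper, if present, before type-checking.
    return model.module if isinstance(model, ParallelWrapper) else model


# Case described in the report: the parallel wrapper is the outermost object.
outer_parallel = ParallelWrapper(AuxiliaryModelWrapper(Backbone()))

# Case recommended in the reply: the parallel wrapper lives inside the auxiliary wrapper.
inner_parallel = AuxiliaryModelWrapper(ParallelWrapper(Backbone()))

reported_case_detected = is_aux_wrapper(outer_parallel)
recommended_case_detected = is_aux_wrapper(inner_parallel)
unwrapped_case_detected = is_aux_wrapper(unwrap_parallel(outer_parallel))

print(reported_case_detected, recommended_case_detected, unwrapped_case_detected)
```

Under these assumptions, the check fails only when the parallel wrapper is outermost. Whether that order can occur in practice depends on how the training code builds the model, which is the point the reply asks the reporter to confirm.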