HobbitLong / RepDistiller

[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods

Multiple GPU training #28

Closed deropty closed 3 years ago

deropty commented 3 years ago

Hi, I'm new to knowledge distillation. I wonder why there is no multiple-GPU training in your code. What is the reason, and is there a solution for this? I would greatly appreciate any response, thank you!
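
For context, the usual single-process way to use several GPUs in PyTorch is `torch.nn.DataParallel`. Below is a minimal sketch of that generic approach; the helper name `to_multi_gpu` and the placeholders `model_s` / `model_t` are illustrative and not taken from this repository, and CRD's contrastive memory bank (which indexes individual samples) may need extra care when the loss criterion is wrapped.

```python
# Hedged sketch: replicating a model across all visible GPUs with
# torch.nn.DataParallel. This is the generic PyTorch mechanism, not
# code from RepDistiller; model_s / model_t are hypothetical names.
import torch
import torch.nn as nn

def to_multi_gpu(model: nn.Module) -> nn.Module:
    """Move a model to GPU and, if more than one device is visible,
    wrap it in DataParallel so batches are split across devices."""
    model = model.cuda()
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    return model

# Example usage (names are placeholders):
# model_s = to_multi_gpu(model_s)   # student network
# model_t = to_multi_gpu(model_t)   # teacher network
```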