neuralmagic / sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Apache License 2.0
2.07k stars, 148 forks

Fixing Multi-GPU Unit Test Issue #2302

Satrat closed 5 months ago

Satrat commented 6 months ago

A sequential distillation test was failing with a device mismatch error when run on multiple GPUs. The cause was that the student model was forced onto "cuda:0" while the teacher model defaulted to "auto" device mapping, so the two models could end up with tensors on different devices. The fix is to initialize both models with device_map="auto" so that transformers handles the multi-GPU case automatically.
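
For illustration, the change described above might look roughly like the sketch below. This is not the actual test code from this PR: it assumes the models are loaded through the plain transformers `AutoModelForCausalLM.from_pretrained` API, and the helper name `load_student_and_teacher` and its path arguments are hypothetical. Note that `device_map="auto"` requires the `accelerate` package to be installed.

```python
# Minimal sketch (assumption: plain transformers loading, not the PR's real test).
# Both models share the same loading kwargs so neither is pinned to a single GPU.
SHARED_LOAD_KWARGS = {"device_map": "auto"}


def load_student_and_teacher(student_path, teacher_path):
    """Load both models with identical device mapping.

    `student_path` and `teacher_path` are hypothetical placeholders for
    whatever model stubs or local directories the test uses.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM

    # Before the fix (per the description): the student was forced to "cuda:0"
    # while the teacher used "auto", which could produce a device mismatch.
    student = AutoModelForCausalLM.from_pretrained(student_path, **SHARED_LOAD_KWARGS)
    teacher = AutoModelForCausalLM.from_pretrained(teacher_path, **SHARED_LOAD_KWARGS)
    return student, teacher
```

Keeping the loading arguments in one shared dictionary is a design choice for this sketch: it makes it harder for the student and teacher to drift apart in how they are placed on devices.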