Closed: balaabhijit closed this issue 3 weeks ago
Hi @balaabhijit, thanks for reporting! What would be the use case for this? Are you modifying the model? To answer your question: you can remove the hooks by calling `remove_hook_from_module(model, recurse=True)`, and then you should be able to move the model to the device of your choice. As for the warning, I will open a PR so that it is removed when calling `remove_hook_from_module`. LMK if this works!
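To make the suggestion concrete, here is a minimal pure-Python stand-in for the pattern (it does not use accelerate itself, since the real classes need torch; the names `AlignDevicesHook`, `dispatch`, and `to` below are toy stand-ins). It illustrates why the hooks must be removed before the model can be moved: as long as a hook pins a module to a device, `.to()` is refused.

```python
class AlignDevicesHook:
    """Toy version of accelerate's device hook: pins a module to a device."""
    def __init__(self, device):
        self.device = device

class Module:
    """Stand-in for nn.Module with a hook slot and children."""
    def __init__(self, name):
        self.name = name
        self.hook = None
        self.device = "cpu"
        self.children = []

def iter_modules(model):
    yield model
    for child in model.children:
        yield from iter_modules(child)

def dispatch(model, device_map):
    # Mimics accelerate.dispatch_model: attach a hook per module
    for module, device in zip(iter_modules(model), device_map):
        module.hook = AlignDevicesHook(device)
        module.device = device

def remove_hook_from_module(model, recurse=False):
    # Mirrors the shape of accelerate's helper: drop the hook so that
    # moving the model behaves normally again
    model.hook = None
    if recurse:
        for child in model.children:
            remove_hook_from_module(child, recurse=True)

def to(model, device):
    # A dispatched (hooked) model refuses to move, like the warning/error
    # discussed in this issue
    if any(m.hook is not None for m in iter_modules(model)):
        raise RuntimeError("You can't move a model that is dispatched.")
    for m in iter_modules(model):
        m.device = device
```

With real accelerate code the sequence is the same: call `remove_hook_from_module(model, recurse=True)` first, then `model.to("cpu")`.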
@SunMarc Thanks for your prompt reply. This works!
The use case is quite specific: we are quantizing the model as part of a pipeline. The previous step requires the model to be dispatched across devices, and the quantization step (AWQ) happens layer by layer so as to save GPU memory, so I had to move the model back and forth between devices.
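The layer-by-layer pattern described above can be sketched as follows. This is a hypothetical helper, not AWQ itself: layers are plain dicts standing in for `nn.Module` blocks, and the device field stands in for `layer.to(device)`. The point is that only one block occupies the GPU at a time.

```python
def quantize_layerwise(layers, quantize_fn, device="cuda:0"):
    """Quantize one layer at a time, keeping only that layer on the GPU.

    `layers` is a list of {"device": str, "weights": list} dicts as a
    stand-in for transformer blocks; in real torch code you would call
    layer.to(device) / layer.to("cpu") instead of setting a field.
    """
    for layer in layers:
        layer["device"] = device                              # move block to GPU
        layer["weights"] = [quantize_fn(w) for w in layer["weights"]]
        layer["device"] = "cpu"                               # move back, freeing GPU memory
    return layers
```

Usage with a toy "quantizer" that just rounds weights:

```python
layers = [{"device": "cpu", "weights": [0.12, 0.87]}]
quantize_layerwise(layers, lambda w: round(w, 1))
```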
Sounds good! I'm closing this issue since it's solved!
System Info
Information
Tasks
no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
Reproduction
PR #1790 seems to add a warning when trying to move a model that is dispatched on multiple devices. Is there a way to move the model back to CPU once it has been dispatched to multiple devices? I even tried saving the model, then deleting and reloading it, but deleting the model in Python with `del model` (even with GC and clearing the CUDA cache) doesn't seem to free GPU memory. Is there any way to achieve this?
Expected behavior
Move model to CPU after dispatching to multiple GPUs
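One reason `del model` alone may not free GPU memory is that some other object (such as accelerate's hook bookkeeping) still holds a reference to the model, so Python never collects it and the CUDA tensors stay alive. The sketch below shows that rule with plain Python objects; `hook_registry` is a hypothetical stand-in for such bookkeeping. With real CUDA tensors, `torch.cuda.empty_cache()` can only return memory to the driver after all references are gone.

```python
import gc
import weakref

class Model:
    """Stand-in for a model object; real CUDA memory follows the same rule."""
    pass

hook_registry = []            # hypothetical stand-in for hook bookkeeping

model = Model()
hook_registry.append(model)   # a hook keeps a reference to the model
ref = weakref.ref(model)      # weakref lets us observe whether it is freed

del model
gc.collect()
assert ref() is not None      # still alive: the registry holds a reference

hook_registry.clear()         # drop the last reference (e.g. remove hooks)
gc.collect()
assert ref() is None          # now the object is actually collected
```

This is why removing the hooks first, as suggested above, also matters for reclaiming memory: it drops the extra references before `del model` and garbage collection run.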