zhangvia opened this issue 7 months ago
Hi,
I was curious why is_sequential_cpu_offload = True when a component has an AlignDevicesHook. Shouldn't it be True only when the component's device is CPU?
by "sequentiaal_cpu_offload" we are referring to the enable_sequential_cpu_offload
method that you can call on our pipelines
I don't think this is a bug, no?
I think PR #6857 will add an AlignDevicesHook to every model in the pipeline. So if I use that new feature and load LoRA weights at the same time, enable_sequential_cpu_offload will be called inside the load_lora_weights method. I want to know whether my understanding is correct. Thank you for the explanation.
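For illustration, a minimal sketch of how to check which accelerate hook each component ends up carrying after enable_sequential_cpu_offload(); the model id is only a placeholder, and the exact hook layout depends on the diffusers/accelerate versions:

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder model id; any pipeline works. Requires a GPU and accelerate.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.enable_sequential_cpu_offload()

# Print the top-level hook type attached to each nn.Module component.
for name, component in pipe.components.items():
    if isinstance(component, torch.nn.Module):
        hook = getattr(component, "_hf_hook", None)
        print(name, type(hook).__name__ if hook is not None else None)
```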
The PR addresses a different problem, not sure how it's related.
Yes, but that PR needs the AlignDevicesHook to prepare the model inputs before calling the model's forward method. And if I call the load_lora_weights method, the enable_sequential_cpu_offload() method will eventually be called because of the AlignDevicesHook. The chain is like this:

load_lora_weights -> load_lora_into_unet -> _pipeline.enable_sequential_cpu_offload()

The relevant part of the load_lora_into_unet code is below:
```python
# In case the pipeline has been already offloaded to CPU - temporarily remove the hooks
# otherwise loading LoRA weights will lead to an error
is_model_cpu_offload, is_sequential_cpu_offload = cls._optionally_disable_offloading(_pipeline)

inject_adapter_in_model(lora_config, unet, adapter_name=adapter_name)
incompatible_keys = set_peft_model_state_dict(unet, state_dict, adapter_name)

if incompatible_keys is not None:
    # check only for unexpected keys
    unexpected_keys = getattr(incompatible_keys, "unexpected_keys", None)
    if unexpected_keys:
        logger.warning(
            f"Loading adapter weights from state_dict led to unexpected keys not found in the model: "
            f" {unexpected_keys}. "
        )

# Offload back.
if is_model_cpu_offload:
    _pipeline.enable_model_cpu_offload()
elif is_sequential_cpu_offload:
    _pipeline.enable_sequential_cpu_offload()
# Unsafe code />
```
And the load_lora_into_unet method calls cls._optionally_disable_offloading(_pipeline) to decide whether to call the enable_sequential_cpu_offload() method.

In the _optionally_disable_offloading(_pipeline) method, is_sequential_cpu_offload will be set to True because of the AlignDevicesHook on the module. Part of the _optionally_disable_offloading method code is below:
```python
for _, component in _pipeline.components.items():
    if isinstance(component, nn.Module) and hasattr(component, "_hf_hook"):
        if not is_model_cpu_offload:
            is_model_cpu_offload = isinstance(component._hf_hook, CpuOffload)
        if not is_sequential_cpu_offload:
            is_sequential_cpu_offload = isinstance(component._hf_hook, AlignDevicesHook)

        logger.info(
            "Accelerate hooks detected. Since you have called `load_lora_weights()`, the previous hooks will be first removed. Then the LoRA parameters will be loaded and the hooks will be applied again."
        )
        remove_hook_from_module(component, recurse=is_sequential_cpu_offload)
```
I'm not sure if I'm right. I just want to figure out why calling load_lora_weights after adding an AlignDevicesHook to the modules sets every component's device to meta. And thank you for your patient explanation.
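To make the point concrete, a minimal sketch (the module and device are placeholders, assuming a GPU is available) showing that a manually attached AlignDevicesHook already satisfies the isinstance check used by _optionally_disable_offloading, even though nothing was ever offloaded to CPU:

```python
import torch
from accelerate.hooks import AlignDevicesHook, add_hook_to_module

# A placeholder module placed on a GPU, never offloaded to CPU.
module = torch.nn.Linear(4, 4).to("cuda:0")
add_hook_to_module(module, AlignDevicesHook(execution_device=0))

# Essentially the check from _optionally_disable_offloading:
is_sequential_cpu_offload = isinstance(module._hf_hook, AlignDevicesHook)
print(is_sequential_cpu_offload)  # True, despite the module living on cuda:0
```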
I found that in the pipeline-device-map-auto branch, the PR disables enable_sequential_cpu_offload() when using device_map=balanced. But I'm still a little confused about why we need to call enable_sequential_cpu_offload() in the load_lora_weights method when an AlignDevicesHook has been added to the models in the pipeline.
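For reference, the path being discussed looks roughly like this in versions that include the pipeline-level device_map support (placeholder model id, needs at least two GPUs):

```python
import torch
from diffusers import DiffusionPipeline

# "balanced" splits whole components across the available GPUs by size.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    device_map="balanced",
)
print(pipe.hf_device_map)  # which GPU each component was assigned to
```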
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Is this still a problem?
Given that device_map=auto can only place the models on different GPUs according to the model size, I add an AlignDevicesHook to every model manually. But the load_lora_weights method removes all AlignDevicesHooks and never adds the hooks back, which really confused me. I changed the hook class name so that the load_lora_weights method will not remove the hook, and so far my code runs as expected.
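A rough sketch of the kind of manual placement described above (the component-to-GPU mapping and model id are placeholders, not the actual code):

```python
import torch
from accelerate.hooks import AlignDevicesHook, add_hook_to_module
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Hypothetical split: pin each component to a chosen GPU.
placement = {"unet": 0, "text_encoder": 1, "vae": 1}
for name, component in pipe.components.items():
    if isinstance(component, torch.nn.Module) and name in placement:
        device = torch.device(f"cuda:{placement[name]}")
        component.to(device)
        # The hook moves the inputs to the component's device before forward.
        add_hook_to_module(
            component, AlignDevicesHook(execution_device=device, io_same_device=True)
        )
```

With this setup, calling load_lora_weights() afterwards hits the hook-removal path discussed above.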
> but the load_lora_weights method removes all AlignDevicesHooks
That is only temporary. We add them back. See here:
```python
if is_model_cpu_offload:
    _pipeline.enable_model_cpu_offload()
elif is_sequential_cpu_offload:
    _pipeline.enable_sequential_cpu_offload()
```
It just calls enable_model_cpu_offload() or enable_sequential_cpu_offload(). Actually, I never call enable_model_cpu_offload() or enable_sequential_cpu_offload(); my AlignDevicesHooks are added to the models manually to place them on different GPUs. And I think we can't assume that enable_model_cpu_offload() or enable_sequential_cpu_offload() was called just because some model in the pipeline has an AlignDevicesHook.
> It just calls enable_model_cpu_offload() or enable_sequential_cpu_offload().
Those methods are responsible for placing the hooks.
The thing is, we are not supposed to call any offloading-related utilities manually when any component underlying a pipeline was initialized with the "balanced" device_map. This should be sufficiently clear from the errors.
> we are not supposed to call any offloading-related utilities manually when any component underlying a pipeline was initialized with the "balanced" device_map.
I agree with that. But in the previous version, the _optionally_disable_offloading() method would return is_sequential_cpu_offload=True because of the AlignDevicesHook when using a device_map, which would offload the model to CPU.
I need a more flexible device_map, though, so I'm still adding hooks manually. Thank you for your patience! But I still think we are not supposed to set is_sequential_cpu_offload=True just because the model has an AlignDevicesHook. Maybe something like this?
```python
is_sequential_cpu_offload = component.device.type == "cpu" and (
    isinstance(component._hf_hook, AlignDevicesHook)
    or hasattr(component._hf_hook, "hooks")
    and isinstance(component._hf_hook.hooks[0], AlignDevicesHook)
)
```
Feel free to open a PR and we can take it from there :-)
I've just created PR #8750 for this.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Describe the bug
I noticed that when I add an AlignDevicesHook to a module in the pipeline manually, the load_lora_weights function will enable sequential CPU offload. So I dug deeper and found that the load_lora_weights function uses the _optionally_disable_offloading function to decide whether to sequentially CPU offload. So I was curious: why is is_sequential_cpu_offload = True when a component has an AlignDevicesHook? Shouldn't it be True only when the component's device is CPU?

Reproduction
Add an AlignDevicesHook to the pipeline's modules manually, and then pipe.load_lora_weights(lora_weights_path) will change all component devices.
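A minimal sketch of the reproduction described above (hook placement, model id, and LoRA path are placeholders):

```python
import torch
from accelerate.hooks import AlignDevicesHook, add_hook_to_module
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Manually attach an AlignDevicesHook to every nn.Module component.
for component in pipe.components.values():
    if isinstance(component, torch.nn.Module):
        component.to("cuda:0")
        add_hook_to_module(
            component, AlignDevicesHook(execution_device=0, io_same_device=True)
        )

lora_weights_path = "path/to/lora"  # placeholder
pipe.load_lora_weights(lora_weights_path)
# _optionally_disable_offloading sees the AlignDevicesHook, assumes sequential
# CPU offload was enabled, and re-enables it, moving the components off the
# devices they were manually placed on.
```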
Logs

No response
System Info
diffusers:0.25.1 torch:2.2.0+cu118
Who can help?
No response