Closed piEsposito closed 1 year ago
@patrickvonplaten I've created the follow up for #281 and can work on that if you let me.
@patrickvonplaten, the PR is open and ready for review. Thanks!
Hi, friend. I saw PR https://github.com/huggingface/diffusers/pull/361, but when I try this:
self.pipe = StableDiffusionPipeline.from_pretrained(
    self.model_id_or_path,
    revision="fp32",
    device_map="auto",
    torch_dtype=torch.float32,
    scheduler=DDIMScheduler(
        beta_start=0.00085,
        beta_end=0.012,
        beta_schedule="scaled_linear",
        clip_sample=False,
        set_alpha_to_one=False,
    ),
    # use_auth_token=True,
)
self.pipe = self.pipe.to(self.device)
I get an error:
    set_alpha_to_one=False,
  File "/root/.conda/envs/ai/lib/python3.7/site-packages/diffusers/pipeline_utils.py", line 517, in from_pretrained
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/root/.conda/envs/ai/lib/python3.7/site-packages/transformers/modeling_utils.py", line 2269, in from_pretrained
    max_memory=max_memory,
  File "/root/.conda/envs/ai/lib/python3.7/site-packages/accelerate/utils/modeling.py", line 480, in infer_auto_device_map
    max_layer_size, max_layer_names = get_max_layer_size(modules_to_treat, module_sizes, no_split_module_classes)
  File "/root/.conda/envs/ai/lib/python3.7/site-packages/accelerate/utils/modeling.py", line 261, in get_max_layer_size
    modules_children = list(module.named_children())
AttributeError: 'Parameter' object has no attribute 'named_children'
Can you help me?
@CrazyBoyM I saw that the safety checker is the offender here. I'm not employed by nor affiliated with HF, but since I'm the one who submitted the idea and the PR, I will try to fix it later today and have, at least, a branch where you can use it before it is merged.
How does that sound?
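For context, here is a minimal pure-Python sketch of the failure mode (toy stand-ins for `torch.nn.Module` and `torch.nn.Parameter`, not the real accelerate code): a module traversal that assumes every child exposes `named_children()` crashes when it reaches a bare `Parameter` leaf, which is exactly the `AttributeError` in the traceback above.

```python
# Toy stand-ins, for illustration only -- not torch or accelerate classes.
class Parameter:
    """Leaf tensor-like object: has no named_children()."""

class Module:
    def __init__(self, **children):
        self._children = children

    def named_children(self):
        return iter(self._children.items())

def walk(module):
    # Mirrors the failing pattern: assumes every entry is a Module.
    names = []
    for name, child in module.named_children():
        names.append(name)
        names.extend(walk(child))  # AttributeError if child is a Parameter
    return names

# A hypothetical tree with a bare Parameter hanging off the root:
root = Module(unet=Module(), logit_scale=Parameter())
try:
    walk(root)
except AttributeError as e:
    print(type(e).__name__)  # AttributeError
```

The fix on the accelerate side is to treat `Parameter` entries as leaves instead of recursing into them.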
Sounds great!
Closed as per #772 and https://github.com/huggingface/accelerate/pull/747.
@CrazyBoyM, what happens is that, for this feature to work, we need a version of accelerate
with https://github.com/huggingface/accelerate/pull/747 merged. That PR was merged 4 days ago, but the last release was 7 days ago.
Until the next accelerate release, if you really want to use this feature, I suggest installing accelerate from the master branch of the repository: pip install git+https://github.com/huggingface/accelerate.git
You can revert to the PyPI version after the next release.
It works, thanks a lot! But it seems no additional VRAM was saved; maybe that's because I already use some other tricks, like these:
self.pipe = StableDiffusionPipeline.from_pretrained(
    self.model_id_or_path,
    device_map="auto",
    # load_in_8bit=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    scheduler=DDIMScheduler(
        beta_start=0.00085,
        beta_end=0.012,
        beta_schedule="scaled_linear",
        clip_sample=False,
        set_alpha_to_one=True,
    ),
    # use_auth_token=True,
)
print(self.pipe.unet.conv_out.state_dict()["weight"].stride())  # (2880, 9, 3, 1)
self.pipe.unet.to(memory_format=torch.channels_last)  # in-place operation
print(self.pipe.unet.conv_out.state_dict()["weight"].stride())  # (2880, 1, 960, 320); a stride of 1 for the 2nd dimension proves that it works
self.pipe = self.pipe.to(self.device)
self.pipe.enable_attention_slicing()
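The two stride printouts in the snippet can be reproduced without torch. This pure-Python sketch computes the strides of a contiguous NCHW tensor versus the same logical tensor stored physically as NHWC (what `torch.channels_last` does); the shape `(4, 320, 3, 3)` is a hypothetical conv-weight shape inferred from the strides printed above.

```python
def nchw_strides(shape):
    """Element strides of a contiguous NCHW tensor (row-major)."""
    n, c, h, w = shape
    return (c * h * w, h * w, w, 1)

def channels_last_strides(shape):
    """Strides of the same logical NCHW tensor laid out physically as NHWC:
    the channel dimension becomes the fastest-moving one (stride 1)."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)

shape = (4, 320, 3, 3)  # hypothetical conv_out weight shape
print(nchw_strides(shape))           # (2880, 9, 3, 1)
print(channels_last_strides(shape))  # (2880, 1, 960, 320)
```

So a stride of 1 in the channel (2nd) dimension is the signature that the channels-last conversion took effect.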
On a T4, it uses about 4.2 GB of VRAM. I will test it on my 1660 Ti 6 GB.
Hi, is there any update on this issue? I'd like to use device_map="auto" for the text-to-video model "damo-vilab/text-to-video-ms-1.7b", but it seems it doesn't work.
DiffusionPipeline.from_pretrained("damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16, variant="fp16")
https://huggingface.co/docs/diffusers/api/pipelines/text_to_video
Is your feature request related to a problem? Please describe.
As a follow-up to #281, we could add the device map and the possibility to load weights using accelerate to the DiffusionPipeline abstraction, for a smaller memory footprint when loading models.
Describe the solution you'd like
Describe alternatives you've considered
Additional context
This is a follow-up to #281.
I can work on that if you folks would let me.
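To illustrate the idea behind `device_map="auto"`, here is a toy sketch of greedy device placement, not accelerate's actual `infer_auto_device_map` algorithm: modules are assigned to devices in order, spilling to the next device (and finally to `"disk"`) when one fills up. The module names and sizes below are made up for the example.

```python
def assign_devices(module_sizes, max_memory):
    """Greedy toy placement: fill each device in order, then spill over.

    module_sizes: ordered dict of {module_name: size}
    max_memory:   ordered dict of {device_name: capacity}, e.g. GPUs first, then CPU
    """
    device_map = {}
    devices = list(max_memory.items())
    i = 0       # index of the device currently being filled
    used = 0    # memory already placed on that device
    for name, size in module_sizes.items():
        # Advance to the next device while this module doesn't fit.
        while i < len(devices) and used + size > devices[i][1]:
            i += 1
            used = 0
        device = devices[i][0] if i < len(devices) else "disk"
        device_map[name] = device
        used += size
    return device_map

# Hypothetical pipeline components and sizes (arbitrary units):
sizes = {"text_encoder": 500, "unet": 3400, "vae": 300, "safety_checker": 1200}
print(assign_devices(sizes, {"cuda:0": 4000, "cpu": 16000}))
# {'text_encoder': 'cuda:0', 'unet': 'cuda:0', 'vae': 'cpu', 'safety_checker': 'cpu'}
```

The real accelerate implementation additionally respects `no_split_module_classes`, per-layer sizes, and tied weights, but the spill-over behavior is the core of what `device_map="auto"` buys for memory footprint.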