sayakpaul / diffusers-torchao

End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).
Apache License 2.0

Gradio error for inference #34

Closed: jadechoghari closed this issue 2 weeks ago

jadechoghari commented 3 weeks ago

Hi @sayakpaul - awesome work on diffusers as always :). I wanted to put this up as an inference demo on Gradio (ZeroGPU): https://huggingface.co/spaces/jadechoghari/flux-kiwi/tree/main but I'm getting the following error:

For error logs: https://huggingface.co/spaces/jadechoghari/flux-kiwi/tree/main?logs=container

Env: inside requirements.txt: https://huggingface.co/spaces/jadechoghari/flux-kiwi/blob/main/requirements.txt

Thanks!

a-r-r-o-w commented 3 weeks ago

I'm unable to see the error logs, but from the requirements it looks like torchao is not installed from source and PyTorch nightly isn't being used. Those are currently the expected environment settings. It would be more helpful if you could paste the error logs here :)
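
For reference, the expected setup is roughly the following (the CUDA variant in the nightly index URL is an assumption; pick whichever matches your runtime):

pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
USE_CPP=0 pip install git+https://github.com/pytorch/ao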

jadechoghari commented 3 weeks ago

Sure!

===== Application Startup at 2024-09-14 22:43:59 =====

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling transformers.utils.move_cache().

0it [00:00, ?it/s]
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
You set add_prefix_space. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...:  43%|████▎     | 3/7 [00:01<00:01, 2.93it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 2.59it/s]
Loading pipeline components...: 100%|██████████| 7/7 [00:07<00:00, 1.01s/it]
ERR: subclass doesn't implement <method 'detach' of 'torch._C.TensorBase' objects>
Traceback (most recent call last):
  File "/home/user/app/app.py", line 13, in <module>
    pipe.transformer = autoquant(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/autoquant.py", line 623, in autoquant
    _change_linears_to_autoquantizable(
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/autoquant.py", line 509, in _change_linears_to_autoquantizable
    _replace_with_custom_fn_if_matches_filter(
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/quant_api.py", line 202, in _replace_with_custom_fn_if_matches_filter
    new_child = _replace_with_custom_fn_if_matches_filter(
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/quant_api.py", line 202, in _replace_with_custom_fn_if_matches_filter
    new_child = _replace_with_custom_fn_if_matches_filter(
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/quant_api.py", line 202, in _replace_with_custom_fn_if_matches_filter
    new_child = _replace_with_custom_fn_if_matches_filter(
  [Previous line repeated 1 more time]
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/quant_api.py", line 198, in _replace_with_custom_fn_if_matches_filter
    model = replacement_fn(model)
  File "/usr/local/lib/python3.10/site-packages/torchao/quantization/quant_api.py", line 251, in insert_subclass
    lin.weight = torch.nn.Parameter(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/parameter.py", line 43, in __new__
    t = data.detach().requires_grad_(requires_grad)
AttributeError: 'NoneType' object has no attribute 'requires_grad_'

a-r-r-o-w commented 3 weeks ago

Ah yes, this is a known error and was fixed here. You would also need to install accelerate from source until the next release.

Edit: Sorry, this might not be the same error (I misread the one you pasted). In case installing from source does not help, I'll take a deeper look soon, unless Sayak gets here before me.

jadechoghari commented 3 weeks ago

thanks @a-r-r-o-w - yes same error :(

sayakpaul commented 3 weeks ago

Can you try with torch nightly and torchao installed from source? All our experiments use torch nightly, as we make clear in our README.

a-r-r-o-w commented 3 weeks ago

Okay, I was able to replicate this in a Colab notebook. I'm not up to date on the torchao changes, but something definitely seems to have broken with the latest changes. I haven't pinned down the exact cause yet.

To fix it, I installed torchao from a version I know works. Could you try this:

USE_CPP=0 pip install git+https://github.com/pytorch/ao@cfabc13e72fd03934e62a2a03903bc1678235bed

And of course, please use torch nightly alongside this.

jadechoghari commented 3 weeks ago

Thanks both! I updated the requirements.txt: https://huggingface.co/spaces/jadechoghari/flux-kiwi/blob/main/requirements.txt Still the same issue. It's OK, it could be something with the Gradio env, I guess...

sayakpaul commented 3 weeks ago

So, you don't see the issue when using it without gradio?

a-r-r-o-w commented 3 weeks ago

Hmm, not sure why you would be getting the same error. Here's my Colab notebook using the same torchao commit which fully runs on Colab without errors or OOMs: https://colab.research.google.com/drive/1DUffhcjrU-uz7_cpuJO3E_D4BaJT7OPa?usp=sharing
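
For reference, the notebook boils down to roughly this (a sketch; the exact model ID, prompt, and arguments here are assumptions rather than a copy of the notebook):

import torch
from diffusers import FluxPipeline
from torchao.quantization import autoquant

# Load Flux in bfloat16 and move it to the GPU.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Swap the transformer's linear layers for autoquantizable ones; torchao then
# benchmarks and picks quantized kernels during the first forward pass.
pipe.transformer = autoquant(pipe.transformer, error_on_unseen=False)

# The first call triggers the autoquant benchmarking; later calls reuse the result.
image = pipe("photo of a kiwi bird").images[0]
image.save("kiwi.png")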

jadechoghari commented 3 weeks ago

@a-r-r-o-w Yup, the notebook is reproducible! The issue is mainly with Flux when we deploy it on HF Spaces.

Just to confirm, does the requirements.txt look good?

diffusers
torch --pre -f https://download.pytorch.org/whl/nightly/cu122/torch_nightly.html
gradio==3.35.2
torchao @ git+https://github.com/pytorch/ao@cfabc13e72fd03934e62a2a03903bc1678235bed
transformers
sentencepiece
git+https://github.com/huggingface/accelerate
optimum-quanto
peft

@sayakpaul the issue comes up when we deploy on HF Spaces. I'll dig deeper to figure out why, and if I find a fix, I'll let you know! I haven't tested the full script on Colab yet since Flux is pretty large.

sayakpaul commented 2 weeks ago

Are you using ZeroGPU? Could it be because of that?

jadechoghari commented 2 weeks ago

Yes, I believe that could be the issue. @sayakpaul, do you think it deserves further investigation, or should we close this issue?
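
For context (if I understand the ZeroGPU setup correctly): on ZeroGPU a GPU is only attached while a function decorated with @spaces.GPU is running, so module-level code in app.py (including the autoquant call at import time) runs without a real CUDA device. A minimal sketch of that pattern (the model ID and the generate() function are illustrative, not taken from the Space):

import spaces
import torch
from diffusers import FluxPipeline

# On ZeroGPU, module-level code like this runs without a GPU attached.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

@spaces.GPU  # a GPU is attached only for the duration of this call
def generate(prompt):
    pipe.to("cuda")
    return pipe(prompt).images[0]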

sayakpaul commented 2 weeks ago

I think this could be reported to the ZeroGPU maintainers separately, so I'm closing this issue.