naver-ai / Visual-Style-Prompting

Official Pytorch implementation of "Visual Style Prompting with Swapping Self-Attention"
https://curryjung.github.io/VisualStylePrompt/
Apache License 2.0
392 stars, 30 forks

The HuggingFace Demo has been ignoring the images I've uploaded #1

Open hben35096 opened 4 months ago

hben35096 commented 4 months ago

https://huggingface.co/spaces/naver-ai/VisualStylePrompting

PixPin_2024-03-15_03-43-39

I tried 6 times.

screan commented 4 months ago

Same, it won't take an uploaded image.

SoftologyPro commented 4 months ago

Yes, please add the functionality for the user to specify their own style image without having to modify config files. Should be simple, pick image, type prompt, generate. The first thing users want after running the examples is "that's cool, how can I use my own style image now"?

Joyofmovement commented 4 months ago

Please consider making it possible, and simple, for users to use their own images as a style. Many thanks. I really like this concept though, it's great. Thanks for your contribution.

taki0112 commented 3 months ago

To accurately reflect the style of the user image, a description of that image is necessary. Because some users may struggle to write effective descriptions, we have not included this aspect in the demo.

We will update the demo code to support this by utilizing BLIP2.

SoftologyPro commented 3 months ago

That would work. User picks one of their images, BLIP2 captions it, user should get an option to modify the detected caption if need be, then the user image can be used to style any other image.
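The flow described here (pick an image, caption it automatically, let the user override the caption) could be sketched roughly as follows. This is an illustration only, not the demo's code: `StyleReference`, `prepare_style_reference`, and the `captioner` callable are hypothetical names, and the captioner is assumed to wrap whatever BLIP2 call the demo ends up using.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StyleReference:
    image_path: str
    caption: str
    caption_source: str  # "user" if the caption was typed or edited, else "auto"


def prepare_style_reference(
    image_path: str,
    captioner: Callable[[str], str],  # hypothetical wrapper around a captioning model
    user_caption: Optional[str] = None,
) -> StyleReference:
    """Prefer a caption supplied by the user; otherwise fall back to the captioner."""
    edited = (user_caption or "").strip()
    if edited:
        # The user typed or corrected the caption, so the model is not called.
        return StyleReference(image_path, edited, "user")
    return StyleReference(image_path, captioner(image_path).strip(), "auto")
```

A UI would call this once with no `user_caption` to show the detected caption, then again with the edited text if the user changes it.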

dhmiller123 commented 3 months ago

This will be very helpful. Thank you. Looking forward to working with my own images.

To accurately reflect the style of the user image, a description of that image is necessary. Some users may struggle to write effective descriptions, we have not included this aspect in the demo.

We will update the demo code to support this by utilizing BLIP2.

taki0112 commented 3 months ago
dhmiller123 commented 3 months ago

Thank you very much for this update. I will give this a try when I return from traveling. Best wishes for continued enhancement of your excellent project. Regards, DM

On Tue, Mar 26, 2024 at 8:10 AM Junho Kim wrote:

  • There is an issue with the HF GPU, and HF is currently fixing it.
  • For this reason, the feature for user style images has been implemented but is not yet running in the demo.
  • For now, try vsp_real_script.py.

SoftologyPro commented 3 months ago

  • There is an issue with the HF GPU, and HF is currently fixing it.
  • For this reason, the feature for user style images has been implemented but is not yet running in the demo.
  • For now, try vsp_real_script.py.

Can you make an updated app.py for local running? I am trying to do this all local on Windows, so it doesn't matter if it does not run as a HF online demo.

dhmiller123 commented 3 months ago

User images at HF still not working. Can you make an updated app.py for local running for SoftologyPro?

taki0112 commented 3 months ago

@dhmiller123 @SoftologyPro Locally, you can try vsp_real_script.py.

SoftologyPro commented 3 months ago

I understand, but if you updated the gradio UI with that functionality it would make it easier for all users.

taki0112 commented 3 months ago

We have recently updated the demo to reflect user images. However, due to an issue with the GPU provided by Hugging Face (HF), the functionality is not performing as expected. We have no choice but to wait until HF resolves this issue.

SoftologyPro commented 3 months ago

OK, I understand that too. But, I don't want to run via huggingface. I want to run your gradio demo locally under Windows. If you do have a version of the gradio app.py that works locally then please do share. The only version of app.py I have is from before which has now been removed from your repo.

SoftologyPro commented 3 months ago

I.e., the attached version of app.py (renamed to app.txt, as .py files do not seem to be attachable), running locally. That should get around any Hugging Face limitations?

app.txt

Screenshot 2024-04-01 183632

taki0112 commented 3 months ago

The demo is working now.

dhmiller123 commented 3 months ago

I still get the same GPU error when I try to use my own image. What exactly is working?

On Tue, Apr 2, 2024 at 9:22 AM Junho Kim wrote:

demo is working now.

SoftologyPro commented 3 months ago

OK, when I try to run the HF demo with my own style image I get GPU timeouts. Can you provide a working version of app.py to run locally? This is what I tried:

git clone https://huggingface.co/spaces/naver-ai/VisualStylePrompting

In app.py I had to comment out the first line, import spaces, and the other @spaces.GPU line. Then running app.py opens the UI.
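As an alternative to commenting those lines out by hand, a small fallback could let the same file run both on Hugging Face and locally. This is a minimal sketch, not the repository's code: the `gpu` alias and `style_fn_placeholder` are hypothetical names, and it assumes the `spaces` package is simply absent on a local install.

```python
try:
    import spaces  # present on Hugging Face Spaces

    gpu = spaces.GPU
except ImportError:
    # Local run: replace the decorator with a no-op (assumption: no `spaces` package).
    def gpu(func=None, **_kwargs):
        """Accept both bare @gpu and @gpu(...) with keyword arguments."""
        if func is None:
            return lambda wrapped: wrapped
        return func


@gpu
def style_fn_placeholder(prompt):
    # Hypothetical stand-in for the app's real GPU-bound function.
    return f"styled: {prompt}"
```

With this in place, the decorated functions in app.py would use `@gpu` instead of `@spaces.GPU`, and nothing needs to be edited when moving between environments.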

I select my own style image, enter a prompt, set the number of outputs to 1, and click Submit. This gives the following errors (the same as in the other issue I raised about vsp_real_script.py): https://github.com/naver-ai/Visual-Style-Prompting/issues/7

Traceback (most recent call last):
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\queueing.py", line 501, in call_prediction
    output = await route_utils.call_process_api(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\route_utils.py", line 253, in call_process_api
    output = await app.get_blocks().process_api(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\blocks.py", line 1695, in process_api
    result = await self.call_function(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\blocks.py", line 1235, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\anyio\_backends\_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\utils.py", line 692, in wrapper
    response = f(*args, **kwargs)
  File "<path to local clone>Visual Style Prompting\app.py", line 156, in style_fn
    ref_prompt = blip_inf_prompt(origin_real_img)
  File "<path to local clone>Visual Style Prompting\app.py", line 77, in blip_inf_prompt
    generated_ids = blip_model.generate(**inputs)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\transformers\models\blip_2\modeling_blip_2.py", line 1830, in generate
    outputs = self.language_model.generate(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\transformers\generation\utils.py", line 1466, in generate
    self._validate_generated_length(generation_config, input_ids_length, has_default_max_length)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\transformers\generation\utils.py", line 1186, in _validate_generated_length
    raise ValueError(
ValueError: Input length of input_ids is 0, but `max_length` is set to -13. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
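The error message itself suggests setting `max_new_tokens` rather than relying on `max_length`. A minimal sketch of what that might look like around the `generate` call is below; `caption_image` is a hypothetical helper, not the function in app.py, and whether this alone resolves the error in this setup is not confirmed in the thread.

```python
def caption_image(model, processor, image, max_new_tokens=30):
    """Caption `image`, bounding only the number of newly generated tokens."""
    # Preprocess the image into model inputs (PyTorch tensors).
    inputs = processor(images=image, return_tensors="pt")
    # max_new_tokens counts only generated tokens, independent of the prompt length.
    generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    texts = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return texts[0].strip()
```

With transformers, `model` and `processor` would typically be a `Blip2ForConditionalGeneration` and a `Blip2Processor` loaded via `from_pretrained`; the default of 30 tokens is an arbitrary choice for illustration.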

If I then click the watercolor horse/tiger example and click Submit it works.

If I then select my own style image again and click Submit it does not crash, but still uses the previous watercolor style and ignores my style image.

Screenshot 2024-04-03 095343

SoftologyPro commented 3 months ago

OK, for those wanting to run this locally, I finally got it working after trying various package versions until these worked.

python -m pip install --upgrade pip
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts wheel==0.41.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts diffusers==0.27.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts accelerate==0.28.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts einops==0.7.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts kornia==0.7.2
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts gradio==4.25.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts transformers==4.39.3
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts opencv-python==4.9.0.80
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts xformers==0.0.25 --index-url https://download.pytorch.org/whl/cu118
pip uninstall -y torch
pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==2.2.1+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
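To check whether an existing environment already matches these pins, something like the following could be run inside the venv. It is a rough sketch: the `PINNED` table just mirrors the versions listed above, `find_mismatches` is a hypothetical helper, and wheels installed from the cu118 index may report a local suffix (for example `+cu118`) that would need adjusting in the table.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip commands above.
PINNED = {
    "diffusers": "0.27.0",
    "accelerate": "0.28.0",
    "einops": "0.7.0",
    "kornia": "0.7.2",
    "gradio": "4.25.0",
    "transformers": "4.39.3",
    "opencv-python": "4.9.0.80",
    "xformers": "0.0.25",
    "torch": "2.2.1+cu118",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```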

https://softologyblog.wordpress.com/2023/10/10/a-plea-to-all-python-developers/

SoftologyPro commented 3 months ago

Running locally under Windows 11 on a 4090.

Screenshot 2024-04-03 190125

SoftologyPro commented 3 months ago

To accurately reflect the style of the user image, a description of that image is necessary. Some users may struggle to write effective descriptions, we have not included this aspect in the demo. We will update the demo code to support this by utilizing BLIP2.

I think BLIP may struggle to write an effective description too. Would it help to show the detected caption and allow the user to edit it before use? When an example is clicked, show the caption used for it as well.

Here are some "failed" results. Might better caption text for the style images help?

Do you think these results are due to the caption or just a bad style image choice?

The broccoli image was captioned by BLIP as "broccoli is a vegetable that is very popular". Would a better prompt help get a better styled result? Maybe just "broccoli".

The wave image was captioned "a large wave breaking on the ocean".

Those 2 and the tiger above are not as "clean" as the example results. For the tiger above I expected textures that matched the style image. Would a better caption help there?

Screenshot 2024-04-04 110536 Screenshot 2024-04-04 110849