App is not working fully yet?

SoftologyPro commented 1 year ago

Start the app, click dog_audio.wav, then click Submit. It processes for a while and then shows <PIL.Image.Image image mode=RGB size=768x768 at 0x176753EF700> in the output. No errors are shown on the command line, and no image is created.

Zeqiang-Lai commented 1 year ago

Sorry, this is a bug in 1.0.4 and has been fixed in 1.0.5.

You can fix it with:

pip install anything2image --upgrade
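To confirm which version is actually installed after the upgrade, a small check like the one below may help. It is only a sketch using the standard library; the helper name `installed_version` is made up for illustration and is not part of anything2image.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version string, or None when the package is missing.
    print(installed_version("anything2image"))
```

Run it inside the same environment that launches the app, since each venv has its own set of installed packages.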

SoftologyPro commented 1 year ago

Still fails here. I set up a new venv, then installed the latest 1.0.5.

Also, why are there 2 Textbox areas? Shouldn't there be only 1 text, 1 audio and 1 image input, which are combined for the final image depending on what the user sets?

Zeqiang-Lai commented 1 year ago

Do you still face the same issue?

The first text box acts as a prompt and can be left empty. The boxes that follow, including the second text box, act as additional conditions, and at least one of them must be provided.

Therefore, for text to image, you have to enter the text in the second box and leave the first one empty.
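The rule described above (an optional prompt plus at least one condition) could be sketched roughly as follows. The function and parameter names are hypothetical and are not taken from the anything2image code; this only illustrates the input logic.

```python
from typing import List, Optional


def provided_conditions(
    prompt: str = "",
    cond_text: str = "",
    cond_audio: Optional[str] = None,
    cond_image: Optional[str] = None,
) -> List[str]:
    """Return the names of the conditions the user supplied.

    `prompt` (the first text box) is optional and is not checked here.
    At least one of the condition inputs must be non-empty.
    """
    provided = []
    if cond_text.strip():
        provided.append("text")
    if cond_audio:
        provided.append("audio")
    if cond_image:
        provided.append("image")
    if not provided:
        raise ValueError("Provide at least one condition: text, audio, or image.")
    return provided
```

For plain text to image under this sketch, the call would be `provided_conditions(cond_text="a dog")` with the prompt left empty.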

SoftologyPro commented 1 year ago

Yes, same issue. I created a new Python environment and activated it, then ran the following (pip installed the latest 1.0.5):

pip install anything2image
python -m anything2image.app

Error - No module named 'torchaudio'
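A quick way to see which of the torch packages are missing before launching the app is sketched below. The helper name `missing_modules` is made up for illustration, and the three module names are simply the ones discussed in this thread, not an official dependency list.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list: the packages mentioned in this thread.
    missing = missing_modules(["torch", "torchvision", "torchaudio"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```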

So I installed the GPU build of torch:

pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

Then I restarted the GUI:

python -m anything2image.app

Click dog_audio.wav, then click Submit. The process runs for around 12 seconds, and then the text <PIL.Image.Image image mode=RGB size=768x768 at 0x16928B9CCD0> is displayed in the output area. No image is shown.

There is a tmp.wav created, but no images.

Zeqiang-Lai commented 1 year ago

Very sorry for the error. I suspect the problem is in the PyPI package.

Could you try cloning the repo and installing it locally with

pip install .

to see whether the error still exists?

PS: no need to set up a new env.

SoftologyPro commented 1 year ago

OK, I tried this (I created the venv just to keep it isolated and fresh):

git clone https://github.com/Zeqiang-Lai/Anything2Image
cd Anything2Image
python -m venv .venv
.venv\scripts\activate
pip install .
python -m anything2image.app

ModuleNotFoundError: No module named 'torchaudio'

So I installed the GPU torch again:

pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

python -m anything2image.app

The GUI starts. Click dog_audio.wav. Now I get a picture of a dog in the output area. Clicking Submit runs the process again and generates a new dog image.

Manual git clone and pip install . works. pip install anything2image does not work.
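When a PyPI install and a local install behave differently, it can help to check which copy of the package Python actually picks up. The snippet below is a minimal sketch using only the standard library; `module_origin` is a hypothetical helper name.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Shows the path of the package that `python -m anything2image.app` would use.
    print(module_origin("anything2image"))
```

Comparing the printed path between the two environments shows whether the app is running from site-packages or from somewhere else.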

Hope that helps you find the problem. Also, one minor issue: please show the GUI URL as 127.0.0.1 rather than 0.0.0.0, because a 0.0.0.0 URL won't open under Windows.
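One way to handle the URL issue is to keep listening on the wildcard address but print a loopback URL for the browser. The sketch below shows the idea with the standard library only; `browsable_url` is a hypothetical helper, and the port is just an example value, not something taken from the app.

```python
import ipaddress


def browsable_url(host: str, port: int) -> str:
    """Build a URL a browser can open, mapping wildcard bind addresses to loopback."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example "localhost"); use it unchanged.
        return f"http://{host}:{port}"
    if addr.is_unspecified:
        # 0.0.0.0 and :: mean "listen on all interfaces", not a destination.
        host = "127.0.0.1" if addr.version == 4 else "[::1]"
    elif addr.version == 6:
        # IPv6 literals must be bracketed inside a URL.
        host = f"[{addr}]"
    return f"http://{host}:{port}"


if __name__ == "__main__":
    print(browsable_url("0.0.0.0", 7860))
```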

Zeqiang-Lai commented 1 year ago

Thanks @SoftologyPro. Based on your feedback, I updated the UI and released a new version. I hope this one works better.

SoftologyPro commented 1 year ago

That is so much better now, thank you. It works fine here. Could you add settings for image size, strength and noise like in this repository? Then it would be perfect. https://github.com/sail-sg/BindDiffusion

Zeqiang-Lai commented 1 year ago

Settings for image size, strength, and noise were added in 1.1.0, which also introduces a scheduler option.

SoftologyPro commented 1 year ago

Thank you. Strength is another useful setting too, as it can influence how close the generated images are to the initial image. One minor thing: the default is/was 768x768, not 512x512.

Zeqiang-Lai commented 1 year ago

Oops, you are right. I mixed it up with SD 1.x. Thanks for that.

yangke13 commented 12 months ago

Hi Zeqiang, does this repo work with SD 1.5?

Zeqiang-Lai commented 12 months ago

No, we rely on SD unCLIP to achieve the embedding magic, which enables audio to image, etc.
