deep-floyd / IF


Only works with the demo picture; using my own picture raises an AssertionError #73

Open hellangleZ opened 1 year ago

hellangleZ commented 1 year ago

AssertionError                            Traceback (most recent call last)
Cell In[24], line 4
      1 count = 4
      2 prompt = 'a boy'
----> 4 result = style_transfer(
      5     t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
      6     support_pil_img=zkc,
      7     prompt=[prompt]*count,
      8     style_prompt=[
      9         f'in style lego',
     10         f'in style zombie',
     11         f'in style origami',
     12         f'in style anime',
     13     ],
     14     seed=42,
     15     if_I_kwargs={
     16         "guidance_scale": 10.0,
     17         "sample_timestep_respacing": "10,10,10,10,10,0,0,0,0,0",
     18         'support_noise_less_qsample_steps': 5,
     19         'positive_mixer': 0.8,
     20     },
     21     if_II_kwargs={
     22         "guidance_scale": 4.0,
     23         "sample_timestep_respacing": 'smart50',
     24         "support_noise_less_qsample_steps": 5,
     25         'positive_mixer': 1.0,
     26     },
     27 )
     28 if_I.show(result['III'], 2, 14)

File ~/miniconda3/envs/if/lib/python3.10/site-packages/deepfloyd_if/pipelines/style_transfer.py:91, in style_transfer(t5, if_I, if_II, if_III, support_pil_img, style_prompt, prompt, negative_prompt, seed, if_I_kwargs, if_II_kwargs, if_III_kwargs, progress, return_tensors, disable_watermark)
     87 if_II_kwargs['progress'] = progress
     89 if_II_kwargs['support_noise'] = mid_res
---> 91 stageII_generations, _meta = if_II.embeddings_to_image(**if_II_kwargs)
     92 pil_images_II = if_II.to_images(stageII_generations, disable_watermark=disable_watermark)
     94 result['II'] = pil_images_II

File ~/miniconda3/envs/if/lib/python3.10/site-packages/deepfloyd_if/modules/stage_II.py:26, in IFStageII.embeddings_to_image(self, low_res, t5_embs, style_t5_embs, positive_t5_embs, negative_t5_embs, batch_repeat, aug_level, dynamic_thresholding_p, dynamic_thresholding_c, sample_loop, sample_timestep_respacing, guidance_scale, img_scale, positive_mixer, progress, seed, sample_fn, **kwargs)
     21 def embeddings_to_image(
     22         self, low_res, t5_embs, style_t5_embs=None, positive_t5_embs=None, negative_t5_embs=None, batch_repeat=1,
     23         aug_level=0.25, dynamic_thresholding_p=0.95, dynamic_thresholding_c=1.0, sample_loop='ddpm',
     24         sample_timestep_respacing='smart50', guidance_scale=4.0, img_scale=4.0, positive_mixer=0.5,
     25         progress=True, seed=None, sample_fn=None, **kwargs):
---> 26     return super().embeddings_to_image(
     27         t5_embs=t5_embs,
     28         low_res=low_res,
     29         style_t5_embs=style_t5_embs,
     30         positive_t5_embs=positive_t5_embs,
     31         negative_t5_embs=negative_t5_embs,
     32         batch_repeat=batch_repeat,
     33         aug_level=aug_level,
     34         dynamic_thresholding_p=dynamic_thresholding_p,
     35         dynamic_thresholding_c=dynamic_thresholding_c,
     36         sample_loop=sample_loop,
     37         sample_timestep_respacing=sample_timestep_respacing,
     38         guidance_scale=guidance_scale,
     39         positive_mixer=positive_mixer,
     40         img_size=256,
     41         img_scale=img_scale,
     42         progress=progress,
     43         seed=seed,
     44         sample_fn=sample_fn,
     45         **kwargs
     46     )

File ~/miniconda3/envs/if/lib/python3.10/site-packages/deepfloyd_if/modules/base.py:181, in IFBaseModule.embeddings_to_image(self, t5_embs, low_res, style_t5_embs, positive_t5_embs, negative_t5_embs, batch_repeat, dynamic_thresholding_p, sample_loop, sample_timestep_respacing, dynamic_thresholding_c, guidance_scale, aug_level, positive_mixer, blur_sigma, img_size, img_scale, aspect_ratio, progress, seed, sample_fn, support_noise, support_noise_less_qsample_steps, inpainting_mask, **kwargs)
    179 else:
    180     assert support_noise_less_qsample_steps < len(diffusion.timestep_map) - 1
--> 181     assert support_noise.shape == (1, 3, image_h, image_w)
    182     q_sample_steps = torch.tensor([int(len(diffusion.timestep_map) - 1 - support_noise_less_qsample_steps)])
    183     support_noise = support_noise.cpu()

hellangleZ commented 1 year ago

When I use the official demo images, such as the cat picture, and transfer them to the lego style, everything works fine. The error appears only when I switch to my own picture.

hellangleZ commented 1 year ago

Environment:

GPU: A100
OS: Ubuntu 18.04

pip list

Package                  Version     Editable project location
accelerate               0.18.0
antlr4-python3-runtime   4.9.3
anyio                    3.5.0
argon2-cffi              21.3.0
argon2-cffi-bindings     21.2.0
asttokens                2.0.5
attrs                    22.1.0
Babel                    2.11.0
backcall                 0.2.0
beautifulsoup4           4.12.2
bleach                   4.1.0
brotlipy                 0.7.0
cchardet                 2.1.7
certifi                  2022.12.7
cffi                     1.15.1
chardet                  5.1.0
charset-normalizer       3.1.0
clip                     1.0         /aml/CLIP-main
cmake                    3.26.3
comm                     0.1.2
contourpy                1.0.7
cryptography             39.0.1
cycler                   0.11.0
debugpy                  1.5.1
decorator                5.1.1
deepfloyd-if             1.0.1
defusedxml               0.7.1
diffusers                0.16.1
entrypoints              0.4
executing                0.8.3
fastjsonschema           2.16.2
filelock                 3.12.0
fonttools                4.39.3
fsspec                   2023.4.0
ftfy                     6.1.1
huggingface-hub          0.14.1
idna                     3.4
importlib-metadata       6.6.0
ipykernel                6.19.2
ipython                  8.12.0
ipython-genutils         0.2.0
ipywidgets               8.0.4
jedi                     0.18.1
Jinja2                   3.1.2
json5                    0.9.6
jsonschema               4.17.3
jupyter                  1.0.0
jupyter_client           8.1.0
jupyter-console          6.6.3
jupyter_core             5.3.0
jupyter-server           1.23.4
jupyterlab               3.5.3
jupyterlab-pygments      0.1.2
jupyterlab_server        2.22.0
jupyterlab-widgets       3.0.5
kiwisolver               1.4.4
lit                      16.0.2
lxml                     4.9.2
MarkupSafe               2.1.2
matplotlib               3.7.1
matplotlib-inline        0.1.6
mistune                  0.8.4
mpmath                   1.3.0
mypy-extensions          1.0.0
nbclassic                0.5.5
nbclient                 0.5.13
nbconvert                6.5.4
nbformat                 5.7.0
nest-asyncio             1.5.6
networkx                 3.1
notebook                 6.5.4
notebook_shim            0.2.2
numpy                    1.24.3
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-cupti-cu11   11.7.101
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
nvidia-cufft-cu11        10.9.0.58
nvidia-curand-cu11       10.2.10.91
nvidia-cusolver-cu11     11.4.0.1
nvidia-cusparse-cu11     11.7.4.91
nvidia-nccl-cu11         2.14.3
nvidia-nvtx-cu11         11.7.91
omegaconf                2.3.0
packaging                23.1
pandocfilters            1.5.0
parso                    0.8.3
pexpect                  4.8.0
pickleshare              0.7.5
Pillow                   9.5.0
pip                      23.0.1
platformdirs             2.5.2
ply                      3.11
prometheus-client        0.14.1
prompt-toolkit           3.0.36
protobuf                 3.19.0
psutil                   5.9.5
ptyprocess               0.7.0
pure-eval                0.2.2
pycparser                2.21
Pygments                 2.11.2
pyOpenSSL                23.0.0
pyparsing                3.0.9
PyQt5-sip                12.11.0
pyre-extensions          0.0.29
pyrsistent               0.18.0
PySocks                  1.7.1
python-dateutil          2.8.2
pytz                     2022.7
PyYAML                   6.0
pyzmq                    25.0.2
qtconsole                5.4.2
QtPy                     2.2.0
regex                    2023.3.23
requests                 2.29.0
safetensors              0.3.1
Send2Trash               1.8.0
sentencepiece            0.1.99
setuptools               66.0.0
sip                      6.6.2
six                      1.16.0
sniffio                  1.2.0
soupsieve                2.4.1
stack-data               0.2.0
sympy                    1.11.1
terminado                0.17.1
tinycss2                 1.2.1
tokenizers               0.13.3
toml                     0.10.2
tomli                    2.0.1
torch                    2.0.0+cu118
torchaudio               0.13.1
torchvision              0.14.1
tornado                  6.2
tqdm                     4.65.0
traitlets                5.7.1
transformers             4.28.1
triton                   2.0.0
typing_extensions        4.5.0
typing-inspect           0.8.0
urllib3                  1.26.15
wcwidth                  0.2.6
webencodings             0.5.1
websocket-client         0.58.0
wheel                    0.38.4
widgetsnbextension       4.0.5
xformers                 0.0.19
zipp                     3.15.0

klei22 commented 1 year ago

same here

darkman111a commented 1 year ago

I wonder if it has to do with image dimensions? It seems that the support_noise tensor has a different shape than expected.
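One way to test this guess is to compare the shape the assertion expects with the shape that is actually passed in. The following is only a minimal sketch: `shape_mismatch` is a hypothetical helper, not part of deepfloyd_if, and plain tuples stand in for `support_noise.shape` and `(1, 3, image_h, image_w)`.

```python
def shape_mismatch(actual, expected, axis_names=("batch", "channels", "height", "width")):
    """Return human-readable differences between two shape tuples (empty list if equal)."""
    actual, expected = tuple(actual), tuple(expected)
    if len(actual) != len(expected):
        return [f"rank differs: got {len(actual)} dims, expected {len(expected)}"]
    problems = []
    for i, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            # Fall back to a generic label if there are more axes than names.
            name = axis_names[i] if i < len(axis_names) else f"axis {i}"
            problems.append(f"{name}: got {got}, expected {want}")
    return problems


# Made-up numbers, purely for illustration.
print(shape_mismatch((1, 3, 64, 48), (1, 3, 64, 64)))
# → ['width: got 48, expected 64']
```

Since `torch.Size` is a tuple subclass, a real tensor shape could be passed in directly.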

klei22 commented 1 year ago

I thought so too. I left it running overnight on a custom image of the same dimensions and it worked!

It is probably dimension-related, but I changed a bunch of other things as well, so I am running it once more with the original dimensions just to make sure.

klei22 commented 1 year ago

Just confirmed: I get the same error when I do not resize, so this definitely has something to do with the resizing. I am trying again after resizing an image outside of the script, which I suppose works as a workaround for now.

I will try to dig into the root cause and see whether only certain dimensions trigger it.

In pipelines/style_transfer.py there is an aspect ratio, and in modules/base.py there is a _get_image_sizes function. I think putting some print statements there might yield some clues.

Are there any other sections of the source code we should look into?
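The print-debugging idea can be tried without editing the installed package by wrapping the function of interest. This is a rough sketch only: `trace_calls` is a hypothetical helper, and `fake_get_image_sizes` is a stand-in, because the real signature and return value of `_get_image_sizes` are not shown in this thread.

```python
import functools


def trace_calls(fn, log=print):
    """Wrap fn so that every call reports its arguments and return value."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        log(f"{fn.__name__}(args={args!r}, kwargs={kwargs!r}) -> {result!r}")
        return result
    return wrapper


# Stand-in for a size helper (assumption: NOT the library's actual function).
def fake_get_image_sizes(width, height):
    return height, width


traced = trace_calls(fake_get_image_sizes)
traced(48, 64)
```

The same wrapper could be applied to whichever function you want to inspect, for example by rebinding it on the object before calling the pipeline, and then comparing the logged sizes for a demo image and a failing image.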

klei22 commented 1 year ago

Just to add: resizing the image beforehand also makes everything work, so the problem is not in any of the Python libraries (that was unlikely, but I have run into such an issue once before). The resizing and aspect-ratio operations are the main suspects at the moment.
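The workaround of resizing the image before handing it to the script could look roughly like the sketch below. It assumes Pillow is installed; `center_crop_box` and `resize_like` are hypothetical helpers, and the target size is left as a parameter because the dimensions of the demo image are not stated in this thread.

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) for the largest centered region of a
    width x height image that has the aspect ratio target_w : target_h."""
    if min(width, height, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * target_h > height * target_w:
        # Image is too wide: keep the full height and trim the sides.
        new_w, new_h = (height * target_w) // target_h, height
    else:
        # Image is too tall (or already matches): keep the full width.
        new_w, new_h = width, (width * target_h) // target_w
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h


def resize_like(img, target_w, target_h):
    """Center-crop a PIL image to the target aspect ratio, then resize it."""
    from PIL import Image  # lazy import so the box math works without Pillow
    box = center_crop_box(img.width, img.height, target_w, target_h)
    return img.crop(box).resize((target_w, target_h), Image.LANCZOS)
```

For example, if `demo_img` is a demo picture that is known to work and `my_photo.jpg` is your own file (both names are placeholders), `resize_like(Image.open("my_photo.jpg").convert("RGB"), *demo_img.size)` would produce an image with the same dimensions to pass as `support_pil_img`.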