pals-ttic / sjc

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation (CVPR 2023)
https://pals.ttic.edu/p/score-jacobian-chaining

Multi-face Janus issue #8

Closed vishalghor closed 1 year ago

vishalghor commented 1 year ago

Hi, this is great work. I have read your paper and the results look good. I tried to reproduce the Obama caption as suggested in the README, but the output ended up with the multi-face Janus issue. Is that expected? Any pointers on how to fix it?

duxiaodan commented 1 year ago

SJC can be quite sensitive to the random seed for certain prompts. All the results we have shown were obtained with the default seed, which is 0. Differences in the versions of packages that introduce stochasticity during training can also be the cause. For example, make sure you are using kornia==0.6.0.
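For readers who want to check their environment against a pinned version, here is a minimal sketch in Python. It is not part of the SJC repository: the helper name `find_mismatches` and the `EXPECTED` dictionary are hypothetical, and the only pin taken from this thread is kornia==0.6.0. It relies solely on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Hypothetical pin list for illustration; only kornia==0.6.0 comes from this thread.
EXPECTED = {"kornia": "0.6.0"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the script in the training environment prints one line per package whose installed version differs from the pin, and prints nothing when every pin matches.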

w-hc commented 1 year ago

@vishalghor let us know if you can reproduce it. There should be no problem.

vishalghor commented 1 year ago

@w-hc @duxiaodan thank you for your feedback. I'll try with seed 0 and check the package versions.

vishalghor commented 1 year ago

@duxiaodan @w-hc I tried with seed 0 and checked the package versions for irregularities, but I still encountered the Janus issue. I have attached the video. Here is the package list with versions:

absl-py==1.3.0
aiohttp==3.8.3
aiosignal==1.3.1
antlr4-python3-runtime==4.8
async-timeout==4.0.2
attrs==22.1.0
av==9.2.0
cachetools==5.2.0
certifi==2022.9.24
charset-normalizer==2.1.1
click==8.1.3
clip @ git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1
contourpy==1.0.6
cycler==0.11.0
easydict==1.10
einops==0.6.0
filelock==3.8.0
fonttools==4.38.0
frozenlist==1.3.3
fsspec==2022.11.0
ftfy==6.1.1
future==0.18.2
google-auth==2.15.0
google-auth-oauthlib==0.4.6
grpcio==1.51.1
huggingface-hub==0.11.1
idna==3.4
imageio==2.22.4
imageio-ffmpeg==0.4.7
importlib-metadata==5.1.0
kiwisolver==1.4.4
kornia==0.6.0
Markdown==3.4.1
MarkupSafe==2.1.1
matplotlib==3.6.2
multidict==6.0.3
numpy==1.23.5
oauthlib==3.2.2
omegaconf==2.1.1
packaging==21.3
Pillow==9.3.0
protobuf==3.20.3
psutil==5.9.4
pyasn1==0.4.8
pyasn1-modules==0.2.8
pydantic==1.10.2
pyDeprecate==0.3.1
pyparsing==3.0.9
python-dateutil==2.8.2
pytorch-lightning==1.4.2
PyYAML==6.0
regex==2022.10.31
requests==2.28.1
requests-oauthlib==1.3.1
rsa==4.9
six==1.16.0
tabulate==0.9.0
-e git+https://github.com/CompVis/taming-transformers.git@3ba01b241669f5ade541ce990f7650a3b8f65318#egg=taming_transformers
tensorboard==2.11.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tokenizers==0.13.2
torch==1.13.0+cu116
torchaudio==0.13.0+cu116
torchmetrics==0.6.0
torchvision==0.14.0+cu116
tqdm==4.64.1
transformers==4.25.1
typing_extensions==4.4.0
urllib3==1.26.13
wcwidth==0.2.5
Werkzeug==2.2.2
yarl==1.8.2
zipp==3.11.0

https://user-images.githubusercontent.com/23045660/206251946-c89f7463-3c58-4630-9a3a-b0b959f20d3b.mp4

duxiaodan commented 1 year ago

Thank you for reporting it. I checked and found that the Obama experiment also needs an additional argument, --var_red False. I've updated the README too. Could you please give it a try and let me know if you are able to reproduce our results?

w-hc commented 1 year ago

@vishalghor sorry, we are also verifying the other prompts. They are all reproducible; we just need to make sure this public repo matches our internal one.

w-hc commented 1 year ago

I am closing this for now since we believe it should work this time. Feel free to re-open if it doesn't. We are acutely aware of the limitations of this technology so far and are working to improve it. We (mostly under Xiaodan's lead) ran all of our experiments with seed 0 and did not try seed tuning. Of the prompts we have tried, roughly 25% succeed. That is far from ideal but somewhat usable. There are also some general guidelines that we will write up (e.g. first try the prompt in 2D, and only take the prompts that work well there to 3D).