Open joetm opened 2 years ago
What exactly is the difference between this and the Latent Diffusion here (https://github.com/CompVis/latent-diffusion) with the same readme?
I also had this issue. After downgrading torchmetrics to 0.6.0 (see https://github.com/NVIDIA/DeepLearningExamples/issues/1113) and applying the patch from #4, I get an ImportError:
ImportError: cannot import name 'CLIPTokenizer' from 'transformers' (unknown location)
Edit: Upgrading transformers to 4.20.1 fixed the issue, but then there's an issue with OpenSSL. I copied pycrypto.so and libssl.so.3 from another conda env I had, but this is a band-aid fix.
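For reference, the two version pins reported above can be applied in one step. This is a sketch assuming a pip-managed environment; if your env is conda-managed, install the equivalents with conda instead:

```shell
# Pin the versions reported in this thread:
# torchmetrics 0.6.0 avoids the earlier torchmetrics import error,
# transformers 4.20.1 provides CLIPTokenizer.
pip install torchmetrics==0.6.0 transformers==4.20.1
```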
You can fix this by pointing the script at the large text2img config and checkpoint in the params:
python scripts/txt2img.py \
--prompt "a virus monster is playing guitar, oil on canvas" \
--config configs/latent-diffusion/txt2img-1p4B-eval.yaml \
--ckpt models/ldm/text2img-large/model.ckpt \
--ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0 --ddim_steps 50
Same issue out of the box; the solution above worked for me.
This doesn't return the same results, though. You'd immediately notice the quality has decreased. I don't think it's the same model here.
Missing logs/f8-kl-clip-encoder-256x256-run1/configs/2022-06-01T22-11-40-project.yaml