nerdyrodent / VQGAN-CLIP

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Cannot reproduce results from README.md #90

Closed wprazuch closed 2 years ago

wprazuch commented 2 years ago

Hello! First of all, thank you for an amazing repository - it is both spectacular and well written. However, I am having trouble reproducing the results from the README.md file. I tried to reproduce the example output with the following command:

python generate.py -p "A painting of an apple in a fruit bowl"

However, at the output I am obtaining the following image:

*(attached output image)*

Does anyone have the same issue or can guide me on what could be the reason for such discrepancies? Regards

nerdyrodent commented 2 years ago

The seed is one big factor, but then even with a seed and some determinism turned on you'll get differences.
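For anyone trying to narrow down the variance between runs, a minimal sketch of the usual PyTorch seeding setup (this is generic PyTorch, not code from this repo; `generate.py` may already expose a seed option - check `python generate.py -h`):

```python
import random
import torch

def seed_everything(seed: int) -> None:
    """Fix the Python and PyTorch RNGs and ask cuDNN for deterministic kernels."""
    random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU generator and all CUDA devices
    torch.backends.cudnn.benchmark = False      # disable autotuned (nondeterministic) kernel selection
    torch.backends.cudnn.deterministic = True   # prefer deterministic cuDNN algorithms
```

Even with all of this, some CUDA ops have no deterministic implementation, so two runs can still diverge slightly - which matches what nerdyrodent says above.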

wprazuch commented 2 years ago

@nerdyrodent Yes, I thought about that; however, this case also seems to have a problem preserving the global structure. I don't expect the image to be identical to the README.md example, but no matter how much I try, I cannot produce an image with a nice, round, apple-looking object and a proper bowl.

Any idea what is the problem? 🤔

nerdyrodent commented 2 years ago

If you want a round apple and a proper bowl, one way is to use an init image. Outside of that, you'll be looking at prompt engineering and luck ;)
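For reference, an init-image run would look something like the sketch below (the `-ii` flag name is an assumption based on the repo's short-option style; check `python generate.py -h` for the exact spelling):

```
# Start optimization from an existing image instead of random noise
python generate.py -p "A painting of an apple in a fruit bowl" -ii apple_bowl.png
```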

wprazuch commented 2 years ago

All good. It turned out my A100 SXM did not like the drivers I had installed (on Ubuntu 20.04), which produced the weird-looking output. Updating the drivers significantly improved the quality of the results (I don't know why, and I still couldn't reproduce the README results exactly :P).

Interesting note: on the A100 SXM, the PyTorch version pinned in the requirements had inference problems - an average iteration took 20 seconds. Raising the PyTorch version sped up inference roughly 10x. This may be GPU-related: the A100 seems to need kernel support that newer PyTorch builds include and the pinned version does not.
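A plausible explanation for the slowdown: the A100 has compute capability sm_80, which is only supported by PyTorch builds made against CUDA >= 11.0 (first shipped with the 1.7 binaries); older builds fall back to slow or JIT-compiled paths. A hypothetical helper (not part of this repo) to sanity-check an installed version string against a minimum:

```python
def version_tuple(version: str) -> tuple:
    """'1.9.0+cu111' -> (1, 9, 0); the local build tag after '+' is ignored."""
    core = version.split("+")[0]
    return tuple(int(part) for part in core.split(".")[:3])

def meets_minimum(installed: str, minimum: str = "1.7.0") -> bool:
    """True if the installed PyTorch version is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)
```

In practice you would pass `torch.__version__` as `installed`, and also confirm `torch.version.cuda` reports 11.0 or newer on an A100.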