Arcitec opened 2 years ago
You can convert GLIDE's `*.ipynb` "Notebook" files to normal scripts:

```shell
pipenv install jupyter nbconvert
cd ..  # return to parent folder (if you're still in the src/ folder)
pipenv run jupyter nbconvert --to script lib/glide-text2im/notebooks/*.ipynb
```

The converted scripts end up in `lib/glide-text2im/notebooks/`, and you can move them into your `src/` folder as a basis for your own project:

```shell
mv lib/glide-text2im/notebooks/*.py src/
```
Install OpenCV (used below for displaying the results):

```shell
pipenv install opencv-contrib-python
```

Note that the converted scripts still contain notebook-only lines such as the following, which you'll need to remove, since `get_ipython()` only exists inside Jupyter (and the library is already installed through pipenv anyway):

```python
get_ipython().system('pip install git+https://github.com/openai/glide-text2im')
```
Since the scripts no longer run inside a notebook, replace the IPython display call with OpenCV.

Old:

```python
from PIL import Image
from IPython.display import display

display(Image.fromarray(reshaped.numpy()))
```

New:

```python
import cv2

# Resize to 4x larger and display with OpenCV2.
img = reshaped.numpy()
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)  # Convert to OpenCV's BGR color order.
img = cv2.resize(img, None, fx=4, fy=4, interpolation=cv2.INTER_LANCZOS4)
cv2.imshow("Result", img)
cv2.waitKey(0)  # Necessary for OS threading/rendering of the GUI.
cv2.destroyAllWindows()
```
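If you're running the converted scripts on a headless machine (e.g. over SSH) where `cv2.imshow` can't open a window, one alternative sketch is to save the image to disk with Pillow instead. The `result.png` filename and the dummy array standing in for `reshaped.numpy()` below are just illustrations:

```python
import numpy as np
from PIL import Image

# Stand-in for the model output; in the real script this would be reshaped.numpy().
rgb = np.zeros((64, 64, 3), dtype=np.uint8)

img = Image.fromarray(rgb)  # PIL expects RGB, so no BGR color swap is needed.
img = img.resize((img.width * 4, img.height * 4), Image.LANCZOS)  # 4x upscale, Lanczos like above.
img.save("result.png")  # Open the file afterwards in any image viewer.
```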
PS: "GLIDE (filtered)" is definitely a fun toy, but with the public model the results are unfortunately pretty bad: blurry and nonsensical (unrelated to what you wrote), as mentioned here:
https://github.com/openai/glide-text2im/issues/21#issuecomment-1045590329
Most of the output you're gonna get is useless. But some of it can be fun for inspiration/ideas for projects or art. The main benefit of this model is actually that it generates results extremely fast compared to previous CLIP-based generators.
I would honestly say that the old CLIP-based generators that are out there are much better and more usable. Sure, the coherence of the image itself and the objects is better in GLIDE (filtered), but it responds really poorly to your input most of the time.
If you decide that you want to use this project anyway ("GLIDE (filtered)"), I recommend the `clip_guided` code. It's better than `text2im` at understanding things with the limited free training data we've been given. See this topic: https://github.com/openai/glide-text2im/issues/19
The main issue with the free version of GLIDE is that the filtered training data seems to have been mostly "freaking dogs!!". Which may explain why the default prompt demo is "an oil painting of a corgi"... It also produces extremely blurry output.
For more background, see this guide: https://github.com/pypa/pipenv/issues/4961#issuecomment-1045679643

All relevant commands and most of the explanations from that guide are already here in this GLIDE guide, but if you want a deeper understanding of how Pipenv's 3rd-party repo support works compared to Pip, you'll want to check out that guide too.
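To give a concrete picture of what that repo support looks like, a Pipfile with an extra 3rd-party source roughly takes the shape below. This is only a sketch: the index name is arbitrary, and the PyTorch URL and package version shown here are assumptions you should verify on pytorch.org before relying on them.

```toml
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[[source]]
name = "pytorch-cu113"                            # Arbitrary label for the extra index.
url = "https://download.pytorch.org/whl/cu113/"   # Assumed PEP 503 index URL; verify before use.
verify_ssl = true

[packages]
torch = {version = "==1.10.2+cu113", index = "pytorch-cu113"}  # Example of the +cu### naming.
```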
I've created a guide for people who have moved on to the more advanced pipenv instead of basic pip.
Install Pipenv and Pyenv on your system so that both tools are available. Pyenv is optional, but it's needed if you want pipenv to fetch the correct version of Python on demand. But if you're reading this, you're probably a chad that has both tools already!
Create a new, empty project folder. Then navigate to it in a terminal window.
Run the following command to initialize a pipenv virtualenv with Python 3.9 (as of this writing, the newest version that PyTorch supports).
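That initialization is presumably the standard pipenv invocation (shown here as a sketch):

```shell
pipenv --python 3.9   # Creates the virtualenv; pyenv fetches Python 3.9 if it's missing.
```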
(Run `du -h ~/.cache/pipenv` and you'll see that it's downloading gigabytes of packages...)

If you have a modern NVIDIA GPU (check by running `nvidia-smi` in a terminal), then use this repository (which uses CUDA Toolkit 11.3), since those modern GPUs require that applications be based on CUDA Toolkit 11.x.

When these versions are updated by PyTorch in the future, you'll have to open the repo URLs in a web browser to see if a newer version is available, and then use its exact name, including the `+cu###` part... To find out the latest CUDA toolkits they offer support for, go to https://pytorch.org/ and look at the CUDA versions they offer, then modify the repo URLs above accordingly (`/whl/` is the standard "old/stable" CUDA Toolkit, and `/whl/cu113/` is the currently newest version, 11.3, but that will change later). I really wish that PyTorch would properly name their packages and set up proper repos instead, but they don't, so we're stuck with this solution of specifying exact versions and repo paths. If we don't include `+cu###` in the version, then pipenv downloads older CUDA toolkit versions of the packages instead, so beware and be careful.

Also note that if you ever see a PyTorch repo URL that ends in `<something>.html`, you need to delete that HTML filename, because that's PyTorch's oldest, pip-only repo format: an HTML document that mashes together all versions of the packages (CUDA, CPU-only, etc.), which makes it completely impossible for Pipenv to even figure out which architecture to download... PyTorch implemented the newer PEP 503 indexes, but only on URLs that don't point at any HTML page.

If someone doesn't have an NVIDIA CUDA GPU, there are versions with `+cpu` in the name in the `/whl/` repository.

The install command takes `-e <path>` as a relative path from the current working dir. I suppose you could provide a full, absolute path, but we'll do relative here. Oh, and this command takes a while, since it downloads dependencies.

Now simply create your Python files in `src/`, "import" the library as seen in GLIDE's examples, and have fun. You're able to use the code from the Notebook example files that GLIDE has provided.

Running your code in the pipenv (virtualenv) must be done with a special command, so that it loads the Python version and virtualenv libraries that you've installed:
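For example, assuming you named your script `src/text2im.py` (substitute whatever filename you actually created):

```shell
pipenv run python src/text2im.py   # Runs the script inside the project's virtualenv.
```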