openai / glide-text2im

GLIDE: a diffusion-based text-conditional image synthesis model
MIT License

GUIDE: Using GLIDE in pipenv instead of pip #25

Open Arcitec opened 2 years ago

Arcitec commented 2 years ago

I've created a guide for people who have moved on to the more advanced pipenv instead of basic pip.

  1. Install Pipenv and Pyenv on your system so that both tools are available. Pyenv is optional, but Pipenv needs it to fetch the correct version of Python on demand. But if you're reading this, you're probably a chad who has both tools already!

  2. Create a new, empty project folder. Then navigate to it in a terminal window.

  3. Run the following command to initialize a pipenv virtualenv with Python 3.9 (as of this writing, the newest version that PyTorch supports).

pipenv install --python=3.9
  4. Now choose a command based on which GPU you have. Note that the command will take a VERY long time, because it downloads multiple builds of the ~1.7 GB PyTorch archive (for me it took 30 minutes on a 100 Mbit connection and downloaded 16 GB). This is PyTorch's fault for using a non-standard repository format and a non-standard package versioning scheme (the "+cu113" junk). Pipenv only follows the Python PEP standards for how repositories should look, so it has trouble figuring out what to do and grabs everything that matches the query, which is every architecture... (If you're on Linux, just check du -h ~/.cache/pipenv and you'll see that it's downloading gigabytes of packages...)
pipenv install --extra-index-url https://download.pytorch.org/whl/cu113/ "torch==1.10.1+cu113"
pipenv install --extra-index-url https://download.pytorch.org/whl/ "torch==1.10.1+cu102"
pipenv install --extra-index-url https://download.pytorch.org/whl/ "torch==1.10.1+cpu"
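After the install finishes, your Pipfile should contain an extra package source and the pinned torch version. A rough sketch of what the CUDA 11.3 command above generates (the source name is auto-generated by Pipenv and may differ on your machine):

```toml
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

# Added by the --extra-index-url flag; the name is Pipenv-generated and may differ.
[[source]]
name = "downloadpytorch"
url = "https://download.pytorch.org/whl/cu113/"
verify_ssl = true

[packages]
torch = {version = "==1.10.1+cu113", index = "downloadpytorch"}

[requires]
python_version = "3.9"
```

If the install ever gets confused, checking that the `index =` key points at the PyTorch source entry is a good first debugging step.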
  5. You also need to install Numpy and PyYAML.
pipenv install numpy pyyaml
  6. Next, clone the GLIDE library repo into a subfolder:
git clone https://github.com/openai/glide-text2im.git lib/glide-text2im
  7. Tell Pipenv to install the "local library" (GLIDE). This will automatically detect the Pipfile in the parent folder and add GLIDE to it. Note that this command must be run from the directory the Pipfile is in, because the -e <path> argument is treated as relative to the current working directory. (You could also provide a full, absolute path, but we'll use a relative one here.) This command takes a while, since it downloads GLIDE's dependencies.
pipenv install -e ./lib/glide-text2im
  8. Create a subfolder for your own Python project files:
mkdir src && cd src
  9. Now simply create your Python files in src/, import the library as shown in GLIDE's examples, and have fun. You can reuse the code from the Notebook example files that GLIDE provides.

  10. Running your code inside the pipenv (virtualenv) must be done with a special command, so that it loads the Python version and the virtualenv libraries that you've installed:

pipenv run python <yoursourcefilehere.py>
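As a sanity check that everything installed correctly, a minimal src/ file might look like the sketch below, based on the imports used in GLIDE's bundled notebooks. Treat it as a starting point, not a tested script: it downloads the released "base" checkpoint (~1.6 GB) on first run, and the exact sampling code should be copied from the notebooks.

```python
# Minimal sketch of a src/ entry point, following GLIDE's notebook examples.
import torch

from glide_text2im.download import load_checkpoint
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the base 64x64 text2im model with the default hyperparameters.
options = model_and_diffusion_defaults()
model, diffusion = create_model_and_diffusion(**options)
model.eval()
model.to(device)

# Downloads and caches the public "base" checkpoint on first use.
model.load_state_dict(load_checkpoint("base", device))
print("GLIDE base model loaded on", device)
```

Run it with `pipenv run python src/demo.py` (or whatever you named the file), then paste in the sampling loop from the notebooks.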
Arcitec commented 2 years ago

Small bonus guide: Converting the *.ipynb "Notebook" files to normal scripts.

  1. Install the necessary tools for the conversion. These have a lot of dependencies and take a few minutes.
pipenv install jupyter nbconvert
  2. Convert all notebooks to Python files. This must be executed from the top directory of your project (if you run it inside the cloned git repo, Pipenv will treat that as a different project folder and create another Pipfile there):
cd ..  # return to parent folder (if you're still in the src/ folder)
pipenv run jupyter nbconvert --to script lib/glide-text2im/notebooks/*.ipynb
  3. Now you can move those .py files out of lib/glide-text2im/notebooks/ and into your src/ folder as a basis for your own project:
mv lib/glide-text2im/notebooks/*.py src/
  4. You'll have to edit the demos to remove the IPython-specific code, such as the get_ipython call and the image-display code, because the example code outputs its result to your Notebook (Jupyter etc.). Use something like OpenCV's image display instead. First, install OpenCV:
pipenv install opencv-contrib-python
  5. Edit the demos to remove these lines:
get_ipython().system('pip install git+https://github.com/openai/glide-text2im')
from PIL import Image
from IPython.display import display
  6. Add this line:
import cv2
  7. Replace this line:

Old:

    display(Image.fromarray(reshaped.numpy()))

New:

    # Resize to 4x larger and display with OpenCV2
    img = reshaped.numpy()
    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)  # Convert to OpenCV's BGR color order.
    img = cv2.resize(img, None, fx=4, fy=4, interpolation=cv2.INTER_LANCZOS4)
    cv2.imshow("Result", img)
    cv2.waitKey(0)  # Necessary for OS threading/rendering of the GUI.
    cv2.destroyAllWindows()
  8. Very important: when you see a generated image, you must press a keyboard key to close the window. Don't close it with the "X" using your mouse, because that will hang Python inside waitKey, waiting for a key press that never arrives. Displaying images with cv2 is impossible without waitKey, since the OS considers the window dead if you skip it. So your only option for this demo is to close the windows by pressing a keyboard key such as space!
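If you'd rather be able to close the window with the mouse too, one common workaround (a sketch, not part of the original demo) is to poll the window's visibility with a short waitKey timeout instead of blocking forever:

```python
import cv2

def show_until_closed(title, img):
    """Display img; return when a key is pressed OR the window is closed."""
    cv2.imshow(title, img)
    while True:
        # A short timeout keeps the GUI event loop responsive...
        if cv2.waitKey(100) != -1:
            break
        # ...and WND_PROP_VISIBLE drops below 1 once the "X" is clicked.
        if cv2.getWindowProperty(title, cv2.WND_PROP_VISIBLE) < 1:
            break
    cv2.destroyAllWindows()
```

With this helper in place of the bare imshow/waitKey pair, both the keyboard and the window's close button end the wait cleanly.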

PS: "GLIDE (filtered)" is definitely a fun toy, but with the public model the results are unfortunately pretty bad: blurry and nonsensical (unrelated to what you wrote), as mentioned here:

https://github.com/openai/glide-text2im/issues/21#issuecomment-1045590329

Most of the output you're gonna get is useless. But some of it can be fun for inspiration/ideas for projects or art. The main benefit of this model is actually that it generates results extremely fast compared to previous CLIP-based generators.

I would honestly say that the old CLIP-based generators that are out there are much better and more usable. Sure, the coherence of the image itself and the objects is better in GLIDE (filtered), but it responds really poorly to your input most of the time.

If you decide that you want to use this project anyway ("GLIDE (filtered)"), I recommend the clip_guided code. It's better than text2im at understanding things with the limited free training data we've been given. See this topic: https://github.com/openai/glide-text2im/issues/19

The main issue with the free version of GLIDE is that the filtered training data seems to have been mostly "freaking dogs!!". Which may explain why the default prompt demo is "an oil painting of a corgi"... It also produces extremely blurry output.

Arcitec commented 2 years ago

Bonus: If someone wants a full, more detailed guide about installing PyTorch in Pipenv correctly, then you can find that guide here:

https://github.com/pypa/pipenv/issues/4961#issuecomment-1045679643

All relevant commands and most of the explanations from that guide are already here in this GLIDE guide, but if you want a deeper understanding of how Pipenv's 3rd party repo support works compared to Pip, you'll want to check out that guide too.