MountaintopLotus / braintrust

A Dockerized platform for running Stable Diffusion, on AWS (for now)
Apache License 2.0

Training Verses models #28

Open JohnTigue opened 1 year ago

JohnTigue commented 1 year ago

The low-hanging fruit in these AI Art tools is 2D images, with 3D meshes being the top of the tree and most desirable. But even the 2D images can be used for in-game assets now via, say, generating textures for use in #7.

Either way, 2D or 3D, the real pay-out is going to be via custom trained (or fine-tuned) models. For example, there will be a model of Al-Suya.

Issues involving model training:

JohnTigue commented 1 year ago

Here's a good, 84-minute tutorial on YouTube: Diffusion Models - Live Coding Tutorial by Damian Bogunowicz.

JohnTigue commented 1 year ago

Reddit:

Flexible-Diffusion. My first experiment with finetuning. A broad model with better general aesthetics and coherence for different styles! Scroll for 1.5 vs FlexibleDiffusion grids. (BTW, PublicPrompts.art is back!!!)

JohnTigue commented 1 year ago

Textual Inversion under 4 minutes?!:

Fine tuning stable diffusion is not really something new. We've got textual inversion. We got DreamBooth, and now we got the new kid on the block called LoRA.

JohnTigue commented 1 year ago

Seems there is a brand new technique, LEAP Booster, which essentially turns textual inversion into style transfer.

LEAP on Reddit:

Hey everyone, I made the model. It's currently in beta testing on my Discord bot as well as a standalone script.

I didn't expect this to take off so fast! Currently my biggest priorities are:

  • Making a 1-click Colab demonstration
  • Creating a new video explaining in-depth how it works
  • Releasing training code + dataset (this is one big peculiarity, and really messy code after 6 months of research, so it needs some cleaning up)
  • Try this out with more cool stuff, i.e. LoRA

JohnTigue commented 1 year ago

TheLastBen Fast Dreambooth mini tutorial:

If you want to have a person's face in SD, all you need is 5-7 decent pics and TheLastBen Colab

JohnTigue commented 1 year ago

Instructions for the training dataset for a FACE, not a full character: AI Headshot Generator

Selfie Requirements:

  • 10-30 images of normal face (e.g., no filters, no masks, no winking, no funny faces).
  • Clear photos with good lighting, especially on face (e.g., no shadows).
  • No full-body shots (makes our AI struggle to learn faces).
  • Resolution of 1024x1024 or higher.
  • Only photos of yourself.
  • Diversity in: (a) clothes; (b) background; (c) face angles (e.g., front view, left side, right side); (d) face size (e.g. face close to camera, face far away).
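
The two mechanical requirements in that list (image count and resolution) are easy to check before uploading anything. Here is a minimal sketch, assuming the image sizes have already been read into simple records; the `Photo` class and `check_selfie_set` function are hypothetical names made up for illustration, not part of any tool mentioned here. Lighting, diversity and the other subjective items still need a human eye.

```python
from dataclasses import dataclass

MIN_IMAGES, MAX_IMAGES = 10, 30  # "10-30 images" from the list above
MIN_SIDE = 1024                  # "Resolution of 1024x1024 or higher"


@dataclass
class Photo:
    """Hypothetical record: file name plus pixel dimensions."""
    name: str
    width: int
    height: int


def check_selfie_set(photos):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    if not MIN_IMAGES <= len(photos) <= MAX_IMAGES:
        problems.append(
            f"expected {MIN_IMAGES}-{MAX_IMAGES} images, got {len(photos)}"
        )
    for photo in photos:
        # The shorter side decides whether the image reaches 1024x1024.
        if min(photo.width, photo.height) < MIN_SIDE:
            problems.append(
                f"{photo.name}: {photo.width}x{photo.height} "
                f"is below {MIN_SIDE}x{MIN_SIDE}"
            )
    return problems


if __name__ == "__main__":
    demo = [Photo(f"selfie_{i:02d}.jpg", 1536, 1536) for i in range(12)]
    demo.append(Photo("old_phone.jpg", 800, 600))
    for line in check_selfie_set(demo):
        print(line)
```

In practice the width and height would come from an image library; they are passed in directly here so the sketch stays dependency-free.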

JohnTigue commented 1 year ago

Really good article, more like experiment logs: Stable Diffusion Fine-tuning Experiments with ED2.0 (Part 1). Using 20 images of an individual, this time where they also generate short textual descriptions of the images ("labels" for the training data; giving text AND image enables better mapping to base model).
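
To make the "text AND image" pairing concrete, here is a rough sketch of collecting image/caption pairs into one manifest. The layout is an assumption for illustration only: each image has a same-named `.txt` caption next to it, and the manifest is a JSON Lines file with `file_name` and `text` keys. The article's ED2.0 setup may expect something different, and `build_manifest` is a hypothetical helper.

```python
import json
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def build_manifest(image_dir, manifest_name="metadata.jsonl"):
    """Pair each image with its .txt caption and write one JSON object per line.

    Returns (rows, missing): the pairs written, and the images without a caption.
    """
    image_dir = Path(image_dir)
    rows, missing = [], []
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_file = image.with_suffix(".txt")
        if caption_file.exists():
            caption = caption_file.read_text(encoding="utf-8").strip()
            rows.append({"file_name": image.name, "text": caption})
        else:
            missing.append(image.name)
    with open(image_dir / manifest_name, "w", encoding="utf-8") as out:
        for row in rows:
            out.write(json.dumps(row) + "\n")
    return rows, missing
```

Returning the uncaptioned images separately makes it easy to spot gaps in a 20-image set before a training run starts.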

JohnTigue commented 1 year ago

Getting the hang of Dreambooth training. Details in comments.

JohnTigue commented 1 year ago

Another example of an impressive character model: Textual embedding likeness training process and weights.

JohnTigue commented 1 year ago

Seems like training for faces (a single face) is a specific task: Ultimate Free Textual Inversion In Stable Diffusion! Your Face Inside All Models!.

JohnTigue commented 1 year ago

This guy compares all the existing ways to do SD training, as of two weeks ago: LoRA vs Dreambooth vs Textual Inversion vs Hypernetworks.

JohnTigue commented 1 year ago

The guy in the above video says DreamBooth is seemingly the best (at 6m15s). The only downside is that the output file is a full-sized model. The cloud can handle that.
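
For a rough sense of why the file sizes differ: DreamBooth fine-tunes the model's weights and saves the whole checkpoint, whereas LoRA saves only a low-rank update per adapted weight matrix. A back-of-the-envelope sketch for a single matrix follows; the 768x768 shape and rank 4 are illustrative assumptions, not measurements of any particular Stable Diffusion checkpoint.

```python
def full_params(d_out, d_in):
    """Parameters in one dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Parameters in a LoRA update: B is (d_out, rank), A is (rank, d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 768, 768, 4  # illustrative sizes only
    full = full_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(f"full matrix: {full} parameters")
    print(f"LoRA update: {lora} parameters")
    print(f"ratio: {full / lora:.0f}x")
```

The same ratio does not carry over to whole files, since only some layers get LoRA updates, but it shows where the savings come from.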

JohnTigue commented 1 year ago

BLIP is an image-captioning model: it looks at images and auto-labels them with a short descriptive text. These captions can be used to drive a training session without the manual work of labeling the training data.
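
A minimal sketch of what that auto-labeling could look like, assuming the Hugging Face `transformers` BLIP classes and the `Salesforce/blip-image-captioning-base` checkpoint (neither has been tried in this repo). The helper names `load_blip_captioner` and `caption_folder` are hypothetical, and writing the caption to a same-named `.txt` file is just one possible convention.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def load_blip_captioner(model_name="Salesforce/blip-image-captioning-base"):
    """Return a function that maps an image path to a caption string.

    Needs `pip install transformers torch pillow`. The imports are local so
    the rest of this sketch can run without those packages.
    """
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    def caption(path):
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def caption_folder(image_dir, captioner, overwrite=False):
    """Write <image>.txt beside each image; keep existing captions unless overwrite."""
    written = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        target = image.with_suffix(".txt")
        if target.exists() and not overwrite:
            continue  # do not clobber hand-edited captions
        target.write_text(captioner(image).strip() + "\n", encoding="utf-8")
        written.append(target.name)
    return written
```

Usage would be something like `caption_folder("training_images", load_blip_captioner())`. Passing the captioner in as a plain function means a different model could be swapped in without touching the folder logic.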

BLIP2 just came out:

JohnTigue commented 1 year ago

More BLIP2 news: BLIP2 is released. Looks awesome.

JohnTigue commented 1 year ago

Sounds like some folks find Stable Tuner useful for finetuning models.

JohnTigue commented 1 year ago

Sounds like Invoke 2.3.0 upped their game for Textual Inversion training: https://github.com/invoke-ai/InvokeAI/releases?q=2.3.0&expanded=true.

JohnTigue commented 1 year ago

How to Train Your Series: Abridged (Link to full guide in comments)

JohnTigue commented 1 year ago

LoRA is one of the main techniques: I made a LoRA training guide! It's a colab version so anyone can use it regardless of how much VRAM their graphic card has!.

JohnTigue commented 1 year ago

Anders cranked out a synthetic training data set for Theodoro. Those images are available in Google Drive: https://drive.google.com/drive/folders/1kSgP8eIZaAn5uT9Khq4Dv5uJJIr3djBP

They need to be in the DreamBooth instance, or we need to mount Google Drive within it: https://github.com/ManyHands/brain_trust/issues/84

JohnTigue commented 1 year ago

This is one of the best training tips videos I have stumbled upon yet: LORA + Checkpoint Model Training GUIDE - Get the BEST RESULTS super easy.