Textual inversion

Hello! I started experimenting with textual inversion training and realized it would be pretty simple to add to your GUI, since almost all of the command line arguments are the same.

I wrote an additional UI component that lets the user specify initial_word and token_string. In the backend, I added a new query parameter to the /train route, train_mode, which is either "lora" or "textual_inversion". The backend chooses which script to run based on that parameter plus the existing sdxl parameter.
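For illustration, the script selection could look roughly like the sketch below. The function name `select_training_script` is hypothetical, and the script filenames are my assumption about what sd_scripts ships; the real backend may resolve them differently.

```python
def select_training_script(train_mode: str, sdxl: bool) -> str:
    """Pick a training script from the train_mode and sdxl parameters.

    The filenames below are assumed, not taken from the backend PR.
    """
    scripts = {
        ("lora", False): "train_network.py",
        ("lora", True): "sdxl_train_network.py",
        ("textual_inversion", False): "train_textual_inversion.py",
        ("textual_inversion", True): "sdxl_train_textual_inversion.py",
    }
    try:
        return scripts[(train_mode, sdxl)]
    except KeyError:
        # Reject unknown modes explicitly instead of silently training a LoRA.
        raise ValueError(f"unknown train_mode: {train_mode!r}") from None
```

A lookup table keeps the four combinations visible in one place and makes it straightforward to add another mode later.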

I had to make some UI design choices for this. I added a menu at the top of the screen to select LoRA or Textual Inversion mode. Selecting Textual Inversion hides the Network Properties args widget and shows the Textual Inversion widget. I didn't put the new fields in Network Properties because I wanted to stay compatible with future methods like Pivotal Tuning, where a TI is trained together with an additional network. sd_scripts doesn't support this currently, but it might in the future. Also, NetworkWidget already has a lot going on.

Right now, when training starts, I dump everything into config.toml, both the network and TI settings, since sd_scripts ignores irrelevant items. That would be easy to change.
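If filtering is preferred, a minimal sketch of the change might look like this. It assumes the config is a flat dict before it is written to config.toml; `filter_config` and the two key sets are illustrative only, and the real key list would come from the widgets.

```python
from typing import Any, Dict

# Illustrative key sets; the actual keys depend on what each widget emits.
NETWORK_KEYS = {"network_dim", "network_alpha"}
TI_KEYS = {"token_string", "init_word", "num_vectors_per_token"}


def filter_config(config: Dict[str, Any], train_mode: str) -> Dict[str, Any]:
    """Return a copy of config without the keys that belong to the other mode."""
    if train_mode == "lora":
        drop = TI_KEYS
    elif train_mode == "textual_inversion":
        drop = NETWORK_KEYS
    else:
        raise ValueError(f"unknown train_mode: {train_mode!r}")
    return {key: value for key, value in config.items() if key not in drop}
```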

I also added something else for fun. When training a TI, sd_scripts lets you manually override the number of tokens to train. By default, it trains the same number of tokens as your initial word contains. But how do we know how many that is going to be? We need to run CLIPTokenizer on the initial word.
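As a rough sketch of the counting step: the helper below counts token ids without the begin/end special tokens. `count_tokens` and `load_clip_tokenizer` are hypothetical names, and the default checkpoint name is an assumption; the backend may load its tokenizer from elsewhere.

```python
from typing import Any


def count_tokens(tokenizer: Any, text: str) -> int:
    """Count the tokens in text, excluding begin/end special tokens."""
    return len(tokenizer.encode(text, add_special_tokens=False))


def load_clip_tokenizer(name: str = "openai/clip-vit-large-patch14") -> Any:
    """Load a CLIP tokenizer; the checkpoint name is an assumed default."""
    # Imported lazily so count_tokens stays usable without transformers installed.
    from transformers import CLIPTokenizer

    return CLIPTokenizer.from_pretrained(name)
```

Usage would be along the lines of `count_tokens(load_clip_tokenizer(), "a cute cat")`; the result depends on the tokenizer's vocabulary, so a single word can map to more than one token.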

To do that, I added a new backend route, /tokenize, which runs CLIPTokenizer. The frontend debounces its requests by 200 ms, fetches the number of tokens in the init string, and updates the widget with the current count. To manage these requests I used QNetworkAccessManager, which in my experience makes async calls fairly simple and less error-prone when integrating with the UI. I could port the async code in DragDropLineEdit and main.py to use it if you're interested.
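The debounce idea can be sketched independently of Qt. This is not the code in the PR: it uses `threading.Timer` from the standard library so it runs anywhere, and the `Debouncer` class is a hypothetical name. Each new input cancels the pending call, so only the last value within the delay window triggers a request.

```python
import threading
from typing import Callable, Optional


class Debouncer:
    """Call callback(text) only after delay_s seconds without a new submit."""

    def __init__(self, delay_s: float, callback: Callable[[str], None]) -> None:
        self._delay_s = delay_s
        self._callback = callback
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def submit(self, text: str) -> None:
        with self._lock:
            # A newer value supersedes the pending one, so cancel its timer.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay_s, self._callback, args=(text,))
            self._timer.daemon = True
            self._timer.start()
```

In the frontend, the callback would issue the /tokenize request, and `Debouncer(0.2, send_request)` would match the 200 ms delay (`send_request` being whatever function sends it). Inside a Qt application, a single-shot QTimer restarted on every text change plays the same role and keeps the callback on the UI thread.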

This comes with a PR for the backend.

derrian-distro / LoRA_Easy_Training_Scripts

Textual inversion #197