derrian-distro / LoRA_Easy_Training_Scripts

A UI made in Pyside6 to make training LoRA/LoCon and other LoRA type models in sd-scripts easy
GNU General Public License v3.0
998 stars 101 forks

Textual inversion #197

Closed miabrahams closed 3 months ago

miabrahams commented 3 months ago

Hello! I started experimenting with textual inversion training and realized it would be pretty simple to add to your GUI, since almost all of the command line arguments are the same.

I wrote an additional UI component that allows the user to specify initial_word and token_string. In the backend, I added a new query parameter to the /train route: train_mode, which is either "lora" or "textual_inversion". The backend chooses the right script to run based on that parameter plus the sdxl parameter already there.
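A minimal sketch of how that script selection could look, assuming the backend maps the two parameters to a script name. The function name `select_script`, the lookup table, and the error handling are hypothetical and not taken from the PR; the file names follow sd-scripts' naming but should be checked against the installed version.

```python
# Hypothetical mapping from (train_mode, sdxl) to an sd-scripts entry point.
# The table and function name are illustrative, not the PR's actual code.
SCRIPTS = {
    ("lora", False): "train_network.py",
    ("lora", True): "sdxl_train_network.py",
    ("textual_inversion", False): "train_textual_inversion.py",
    ("textual_inversion", True): "sdxl_train_textual_inversion.py",
}


def select_script(train_mode: str = "lora", sdxl: bool = False) -> str:
    """Return the training script for the requested mode."""
    try:
        return SCRIPTS[(train_mode, bool(sdxl))]
    except KeyError:
        # Reject unknown modes instead of silently falling back to LoRA.
        raise ValueError(f"unknown train_mode: {train_mode!r}") from None
```

Defaulting train_mode to "lora" in this sketch keeps existing /train requests working when the new parameter is absent.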

I had to make some UI design choices for this. I created a menu at the top of the screen to select LoRA or Textual Inversion mode. Selecting Textual Inversion hides the Network Properties args widget and displays the Textual Inversion widget. I didn't add it to the Network Properties because I wanted to support future compatibility with methods like Pivotal Tuning, where a TI is trained together with an additional network. This isn't in sd-scripts currently, but it might be in the future. Also, NetworkWidget already has a lot of stuff going on.

Right now, when training starts, I dump both the network and TI arguments into config.toml, since sd-scripts ignores irrelevant items. That would be easy to change.
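As a rough illustration of what "easy to change" could mean, here is a sketch that merges the argument groups and optionally keeps only the group for the active mode. The function name `build_config`, the argument grouping, and the example keys are all assumptions; the PR's actual data structures may differ.

```python
from typing import Optional


def build_config(shared: dict, network: dict, ti: dict,
                 train_mode: Optional[str] = None) -> dict:
    """Merge argument groups into one config dict.

    With train_mode=None everything is dumped (the current behaviour);
    otherwise only the group matching the mode is included.
    """
    config = dict(shared)  # copy so the caller's dict is not mutated
    if train_mode in (None, "lora"):
        config.update(network)
    if train_mode in (None, "textual_inversion"):
        config.update(ti)
    return config
```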

I also added something else for fun. When training a TI, sd-scripts will let you manually override the number of tokens to train. By default, it will train the same number of tokens as the length of your input string. But how do we know how long that is going to be? We need to run CLIPTokenizer on the initial word.
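Counting the tokens could be sketched as follows, assuming the Hugging Face `transformers` CLIPTokenizer. The checkpoint name and both helper names are assumptions for illustration; the tokenizer the backend actually loads may differ, for example for SDXL.

```python
def load_clip_tokenizer(name: str = "openai/clip-vit-large-patch14"):
    """Load a CLIP tokenizer; the checkpoint name is an assumption."""
    # Imported lazily so count_tokens can be used without transformers installed.
    from transformers import CLIPTokenizer
    return CLIPTokenizer.from_pretrained(name)


def count_tokens(tokenizer, text: str) -> int:
    """Number of tokens `text` splits into, excluding start/end markers."""
    # tokenize() returns the sub-word pieces without adding special tokens.
    return len(tokenizer.tokenize(text))
```

Usage would be along the lines of `count_tokens(load_clip_tokenizer(), "initial word")`, with the tokenizer loaded once and reused across requests.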

To do that I added a new route in the backend, /tokenize, which runs CLIPTokenizer. The frontend sends requests, debounced at 200 ms, to get the number of tokens in the init string, and updates the widget with the current count. To manage these requests I used QNetworkAccessManager. In my experience, it makes async calls fairly simple and less error-prone when integrating with the UI. I could port the async code in DragDropLineEdit and main.py to use it if you're interested.
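The debouncing idea can be sketched in plain Python with `threading.Timer`: each new call cancels the pending one, so only the last value in a burst is sent. This is an illustration of the technique, not the PR's code; in a Qt application a single-shot QTimer would likely be the more natural choice. The class name `Debouncer` is hypothetical.

```python
import threading


class Debouncer:
    """Run `callback` only after `delay` seconds pass with no new call."""

    def __init__(self, callback, delay: float = 0.2):
        self._callback = callback
        self._delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def __call__(self, *args):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # drop the call that was still waiting
            self._timer = threading.Timer(self._delay, self._callback, args)
            self._timer.daemon = True  # do not keep the process alive
            self._timer.start()
```

Wiring it up would look like `request_count = Debouncer(send_tokenize_request)`, where `send_tokenize_request` stands in for whatever function issues the /tokenize request, and `request_count(text)` is called on every text change.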

This comes with a PR for the backend.

derrian-distro commented 3 months ago

Hi, thanks for making the PR! It looks like you followed my internal logic fairly well, but there is some stuff I'll change to match it better. I've never actually trained a TI before, so making a UI for it definitely didn't cross my mind. I'll run through the code you wrote and figure out what should be changed, but from my cursory look, it's very good already!

I had intended to have a contribution guide up by now, because my internal coding style is rather strict to keep the code easier to maintain, but time ruined that one!