This is a web UI tool for editing training datasets for text-to-image models.
This is a standalone version of Dataset Tag Editor, which is an extension for Stable Diffusion web UI by AUTOMATIC1111.
Please do not put it into the extensions folder of AUTOMATIC1111's web UI.
It works well with text captions in comma-separated style (such as the tags generated by DeepDanbooru interrogator).
Captions in image filenames can be loaded, but edited captions can only be saved as text files.
(Pros)
(Cons)
All requirements are listed in requirements.txt
Please install the following first:
If you want to use DirectML, please install manually in venv (install pytorch-directml to enable, not tested).
This script will install ONNX Runtime automatically in venv before using wd-taggers by SmilingWolf.
Just run install.bat
Run the following commands in the root directory of this repo.
python3 -m venv --system-site-packages venv
source ./venv/bin/activate
pip3 install -r requirements.txt
(Note: on Windows, just .\venv\Scripts\activate is needed to activate venv)
You can see the available command line args with the -h or --help option.
Just run launch_user.bat
source ./venv/bin/activate
python scripts/launch.py [arguments]
Google Colab users can use it by executing the following commands and accessing the generated Gradio public URL.
(This is probably only available on Colab Pro at present.)
%cd /content
!git clone https://github.com/toshiaki1729/dataset-tag-editor-standalone.git
%cd /content/dataset-tag-editor-standalone
!pip install -r requirements.txt
!python scripts/launch.py --share
Note: a "tag" means each block of a caption separated by commas.
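To make the definition concrete, here is a minimal sketch of splitting a comma-separated caption into tags. The helper names split_tags and join_tags are hypothetical and only illustrate the idea; they are not taken from this app's code.

```python
from __future__ import annotations


def split_tags(caption: str) -> list[str]:
    """Split a comma-separated caption into tags (hypothetical helper).

    Whitespace around each tag is trimmed and empty blocks are dropped.
    """
    return [block.strip() for block in caption.split(",") if block.strip()]


def join_tags(tags: list[str]) -> str:
    """Join tags back into a comma-separated caption."""
    return ", ".join(tags)


caption = "1girl, long hair,  smile ,outdoors"
tags = split_tags(caption)
print(tags)
print(join_tags(tags))
```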
userscripts/taggers
(they have to be wrapped by a class derived from scripts.tagger.Tagger)
Basic workflow is as follows:
Please note that all batch edits are applied only to the displayed images (that is, the filtered images).
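The "filtered images only" behaviour can be pictured with the following sketch. It assumes captions are held in a dict keyed by filename and that the filter is a single required tag; both are simplifications for illustration, and replace_tag_in_filtered is a hypothetical name, not the app's API.

```python
from __future__ import annotations


def replace_tag_in_filtered(
    captions: dict[str, str], filter_tag: str, old: str, new: str
) -> dict[str, str]:
    """Replace tag `old` with `new`, but only in captions that pass the filter.

    Captions that do not contain `filter_tag` are returned unchanged,
    mirroring the rule that batch edits touch only displayed images.
    """
    result = {}
    for name, caption in captions.items():
        tags = [t.strip() for t in caption.split(",") if t.strip()]
        if filter_tag in tags:
            tags = [new if t == old else t for t in tags]
            result[name] = ", ".join(tags)
        else:
            result[name] = caption  # not displayed, so left as-is
    return result


captions = {
    "a.png": "1girl, smile, outdoors",
    "b.png": "1boy, smile",
    "c.png": "1girl, hat",
}
edited = replace_tag_in_filtered(captions, "1girl", "smile", "smiling")
for name, caption in edited.items():
    print(name, "->", caption)
```

Here b.png keeps its "smile" tag because it does not pass the "1girl" filter.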
No filter is required.
The same as replacing. Just replace the tags with "blank".
You can also use the "Remove" tab in "Batch Edit Captions".
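As a rough illustration of why removing is the same as replacing with a blank, the sketch below replaces the unwanted tags with an empty string and then drops the empty blocks. The function name remove_tags is hypothetical and the code is not taken from this app.

```python
from __future__ import annotations


def remove_tags(caption: str, unwanted: set[str]) -> str:
    """Remove tags by replacing them with a blank and dropping empty blocks."""
    tags = [t.strip() for t in caption.split(",")]
    # Step 1: "replace with blank".
    replaced = ["" if t in unwanted else t for t in tags]
    # Step 2: empty blocks are not tags, so they disappear from the caption.
    return ", ".join(t for t in replaced if t)


cleaned = remove_tags("1girl, watermark, smile, signature", {"watermark", "signature"})
print(cleaned)
```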
(maybe in >= v0.0.6)
If you want to load images from a directory outside this app's folder, register the directory in the whitelist in the "Settings" tab, or use temporary image files (as described in the next section).
Enter the path in "Path whitelist to show images …" and save the settings.
You can input drive name like "C:\" (Windows).
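The idea of a path whitelist can be sketched as below: an image is allowed if its path lies under one of the registered directories, so registering a drive such as "C:\" covers everything on that drive. This is only an assumption about how such a check might work, not the app's actual implementation, and is_whitelisted is a hypothetical name. PureWindowsPath is used so the example behaves the same on any OS.

```python
from __future__ import annotations

from pathlib import PureWindowsPath


def is_whitelisted(image_path: str, whitelist: list[str]) -> bool:
    """Return True if image_path is inside any whitelisted directory.

    Pure path comparison only: nothing is read from disk, and ".."
    segments are not resolved in this simplified sketch.
    """
    path = PureWindowsPath(image_path)
    for entry in whitelist:
        root = PureWindowsPath(entry)
        if path == root or root in path.parents:
            return True
    return False


whitelist = ["C:\\", "D:\\datasets"]
print(is_whitelisted("C:\\images\\cat\\001.png", whitelist))
print(is_whitelisted("D:\\other\\002.png", whitelist))
```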
(maybe in <= v0.0.5)
Set a folder to store temporary images in the "Settings" tab.
Enter the path in "Directory to save temporary files", check "Force using temporary file…", and save the settings.
Enter a non-zero number in "Maximum resolution of ..." in the "Settings" tab to use smaller thumbnails for the image gallery.
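To give a feel for what such a limit does, here is a small sketch that computes a thumbnail size. It assumes the number acts as a cap on the longer side of the image and that zero disables resizing; how the app actually interprets the value is not stated here, and thumbnail_size is a hypothetical helper.

```python
from __future__ import annotations


def thumbnail_size(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    Assumption: max_side <= 0 means "no limit". Images that already fit
    are returned unchanged, and the aspect ratio is preserved.
    """
    longer = max(width, height)
    if max_side <= 0 or longer <= max_side:
        return width, height
    scale = max_side / longer
    return max(1, round(width * scale)), max(1, round(height * scale))


print(thumbnail_size(4000, 3000, 512))
print(thumbnail_size(300, 200, 512))
print(thumbnail_size(4000, 3000, 0))
```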
It may not work with datasets containing millions of images.
Install PyTorch with the -U (--upgrade) option:
pip3 install -U torch torchvision --index-url https://download.pytorch.org/whl/cu118
venv folder
install.bat
Activate venv and install PyTorch with the -U option, or do the following things:
Open launch_user.bat with some text editor
set COMMANDLINE_ARGS="--force-install-torch cu118" (you can choose from cu117, cu118, cu121 or cpu)
Run launch_user.bat