jhc13 / taggui

Tag manager and captioner for image datasets
GNU General Public License v3.0
768 stars 36 forks source link
cogvlm florence-2 image-captioning image-tagging llava pyside6 stable-diffusion tag-manager

TagGUI

TagGUI icon

Cross-platform desktop application for quickly adding and editing image tags and captions, aimed towards creators of image datasets for generative AI models like Stable Diffusion.

TagGUI screenshot

Features

Installation

The easiest way to use the application is to download the latest release from the releases page. Choose the appropriate file for your operating system, extract it wherever you want, and run the executable file inside. You may have to install 7-Zip to extract the files if you don't have it on your system.

Alternatively, you can install manually by cloning this repository and installing the dependencies in requirements.txt. Run taggui/run_gui.py to start the program. Python 3.11 is recommended, but Python 3.10 should also work.

Usage

Load the directory containing your images by clicking the Load Directory button in the center of the window (or File -> Load Directory). Tags are loaded from .txt files in the directory with the same names as the images. Any changes you make to the tags are also automatically saved to these .txt files.

Automatic Captioning

Auto-captioner screenshot

In addition to manual tagging, you can automatically generate captions or tags for your images inside TagGUI. GPU generation requires a compatible NVIDIA GPU, and CPU generation is also supported.

To use the feature, select the images you want to caption in the image list, then select the captioning model you want to use in the Auto-Captioner pane. If you have a local directory containing previously downloaded models, you can set it in File -> Settings to include the models in the model list. Click the Start Auto-Captioning button to start captioning. You can select multiple images to batch generate captions for all of them. It can take up to several minutes to download and load a model when you first use it, but subsequent generations will be much faster.

Captioning parameters

Prompt: Instructions given to the captioning model. Prompt formats are handled automatically based on the selected model. You can use the following template variables to dynamically insert information about each image into the prompt:

An example prompt using a template variable could be Describe the image using the following tags as context: {tags}. With this prompt, {tags} would be replaced with the existing tags of each image before the prompt is sent to the model.

Start caption with: Generated captions will start with this text.

Remove tag separators in caption: If checked, tag separators (commas by default) will be removed from the generated captions.

Discourage from caption: Words or phrases that should not be present in the generated captions. You can separate multiple words or phrases with commas (,). For example, you can put appears,seems,possibly to prevent the model from using an uncertain tone in the captions. The words may still be generated due to limitations related to tokenization.

Include in caption: Words or phrases that should be present somewhere in the generated captions. You can separate multiple words or phrases with commas (,). You can also allow the captioning model to choose from a group of words or phrases by separating them with |. For example, if you put cat,orange|white|black, the model will attempt to generate captions that contain the word cat and either orange, white, or black. It is not guaranteed that all of your specifications will be met.

Tags to exclude (WD Tagger models): Tags that should not be generated, separated by commas.

Many of the other generation parameters are described in the Hugging Face documentation.

Advanced Image List Filtering

The basic functionality of filtering for images that contain a certain tag is available by clicking on the tag in the All Tags pane. In addition to this, you can construct more complex filters in the Filter Images box at the top of the Images pane.

Click here to see the full documentation for the filter syntax. ### Filter criteria These are the prefixes you can use to specify the filter criteria you want to apply: - `tag:`: Images that have the filter term as a tag - `tag:cat` will match images with the tag `cat`. - `caption`: Images that contain the filter term in the caption - The caption is the list of tags as a single string, as it appears in the `.txt` file. - `caption:cat` will match images that have `cat` anywhere in the caption. For example, images with the tag `orange cat` or the tag `catastrophe`. - `name`: Images that contain the filter term in the file name - `name:cat` will match images such as `cat-1.jpg` or `large_cat.png`. - `path`: Images that contain the filter term in the full file path - `path:cat` will match images such as `C:\Users\cats\dog.jpg` or `/home/dogs/cat.jpg`. - You can also use a filter term with no prefix to filter for images that contain the term in either the caption or the file path. - `cat` will match images containing `cat` in the caption or file path. The following are prefixes for numeric filters. The operators `=` (`==` also works), `!=`, `<`, `>`, `<=`, and `>=` are used to specify the type of comparison. - `tags`: Images that have the specified number of tags - `tags:=13` will match images that have exactly 13 tags. - `tags:!=7` will match images that do not have exactly 7 tags (images with less than 7 tags or more than 7 tags). - `chars`: Images that have the specified number of characters in the caption - `chars:<100` will match images that have less than 100 characters in the caption. - `chars:>=30` will match images that have 30 or more characters in the caption. - `tokens`: Images that have the specified number of tokens in the caption - `tokens:>75` will match images that have more than 75 tokens in the caption. - `tokens:<=50` will match images that have 50 or fewer tokens in the caption. ### Spaces and quotes If the filter term contains spaces, you must enclose it in quotes (either single or double quotes). For example, to find images with the tag `orange cat`, you must use `tag:"orange cat"` or `tag:'orange cat'`. If you have both spaces and quotes in the filter term, you can escape the quotes with backslashes. For example, you can use `tag:"orange \"cat\""` for the tag `orange "cat"`. An alternative is to use different types of quotes for the outer and inner quotes, like so: `tag:'orange "cat"'`. ### Wildcards You can use the `*` character as a wildcard to match any number of any characters, and the `?` character to match any single character. For example, `tag:*cat` will match images with tags like `orange cat`, `large cat`, and `cat`. ### Combining filters Logical operators can be used to combine multiple filters: - `NOT`: Images that do not match the filter - `NOT tag:cat` will match images that do not have the tag `cat`. - `AND`: Images that match both filters before and after the operator - `tag:cat AND tag:orange` will match images that have both the tag `cat` and the tag `orange`. - `OR`: Images that match either filter before or after the operator - `tag:cat OR tag:dog` will match images that have either the tag `cat` or the tag `dog`, or both. The lowercase versions of these operators will also work: `not`, `and`, and `or`. The operator precedence is `NOT` > `AND` > `OR`, so by default, `NOT` will be evaluated first, then `AND`, then `OR`. You can use parentheses to change this order. For example, in `tag:cat AND (tag:orange OR tag:white)`, the `OR` will be evaluated first, matching images that have the tag `cat` and either the tag `orange` or the tag `white`. You can nest parentheses and operators to create arbitrarily complex filters.

Controls

Images pane

Image Tags pane

All Tags pane

The Edit menu contains additional features for batch tag operations, such as Find and Replace (Ctrl+R) and Batch Reorder Tags (Ctrl+B).