kohya-ss / sd-scripts

Apache License 2.0

--always_first_tags not working (possibly) #1744

Open dpons222 opened 3 weeks ago

dpons222 commented 3 weeks ago

A quick summary before going into detail: I would like the --always_first_tags command-line arg to also work as a "caption prefix". It seems to do that in the GUI, but not when used as a command-line arg.

I want to preface this by saying that I am a beginner, so apologies in advance. I was reading through the wd14_tagger_README (https://github.com/kohya-ss/sd-scripts/blob/b8896aad400222c8c4441b217fda0f9bb0807ffd/docs/wd14_tagger_README-en.md), and one of the command-line args is --always_first_tags. The README states that it puts the specified tags at the beginning of the captions. The downside is that it only moves a tag to the beginning if the tagger would have produced that tag anyway; you cannot choose an arbitrary word to be prefixed, for example a trigger word for your LoRA.
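To make the behavior described above concrete, here is a minimal sketch of "reorder only" semantics. It is not the sd-scripts implementation; `move_tags_to_front` is a hypothetical helper written only to illustrate what the README describes, and the tag names are made-up examples.

```python
def move_tags_to_front(predicted_tags, always_first_tags):
    """Hypothetical helper: reorders tags only; never adds a tag the tagger did not predict."""
    # Keep only the requested tags that the tagger actually produced.
    front = [tag for tag in always_first_tags if tag in predicted_tags]
    # Everything else keeps its original relative order.
    rest = [tag for tag in predicted_tags if tag not in front]
    return front + rest


predicted = ["solo", "1girl", "smile"]
print(move_tags_to_front(predicted, ["1girl", "ohwx"]))
# "1girl" moves to the front; "ohwx" is dropped because the tagger never predicted it
```

Under this reading, a trigger word such as ohwx can never appear in the caption through this option alone, which matches what I am seeing.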

However, I was also reading through wd14_caption_gui.py (https://github.com/bmaltais/kohya_ss/blob/master/kohya_gui/wd14_caption_gui.py). Line 261 (or search the page for "Prefix to add to WD14 caption") is, I assume, the part of the actual GUI where you can prefix a chosen word (like a trigger word for a LoRA, ohwx, etc.), and it appears to "take" always_first_tags.

My request is to either fix this bug or, if it is working as intended, to add a new command-line arg such as --prefix_caption (or something similar).
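For illustration, the behavior I am asking for could look roughly like the sketch below. Everything here is an assumption: `add_caption_prefix` is a hypothetical function, not something that exists in sd-scripts, and the ", " separator is just a guess at a sensible default.

```python
def add_caption_prefix(tags, prefix, separator=", "):
    """Hypothetical: always put `prefix` first, whether or not the tagger predicted it."""
    # Drop the prefix from the tag list if it is already there, to avoid a duplicate.
    remaining = [tag for tag in tags if tag != prefix]
    return separator.join([prefix] + remaining)


print(add_caption_prefix(["solo", "1girl", "smile"], "ohwx"))
```

The difference from --always_first_tags as documented is that the prefix is added unconditionally, so it works for trigger words the tagger knows nothing about.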

If this helps, my workaround is a mediocre script/function that prepends my desired prefix to all the captions in a specified directory (don't laugh, I am a beginner :) )

import logging
import os


def add_prefix_to_captions(dataset_dir, prefix):
    """
    Adds a specified prefix to all caption files (.txt) in the given directory.
    The prefix is written as-is, so include any separator (e.g. "ohwx, ") in it.
    """
    try:
        # Walk the dataset directory, including subdirectories.
        for root, dirs, files in os.walk(dataset_dir):
            for file in files:
                if file.endswith('.txt'):
                    file_path = os.path.join(root, file)
                    with open(file_path, 'r', encoding='utf-8') as f:
                        content = f.read()

                    # Skip files that already start with the prefix, so reruns are safe.
                    if not content.startswith(prefix):
                        with open(file_path, 'w', encoding='utf-8') as f:
                            f.write(prefix + content)
        logging.info("Prefixes added to all captions.")
    except Exception as e:
        logging.error(f"An error occurred while adding prefixes: {e}")