DominikDoom / a1111-sd-webui-tagcomplete

Booru style tag autocompletion for AUTOMATIC1111's Stable Diffusion web UI
MIT License

Tag autocomplete for .yaml files #84

Closed · ctwrs closed this issue 1 year ago

ctwrs commented 1 year ago

Hi, I'm contributing to https://github.com/Klokinator/UnivAICharGen, for which you've added files in wce.txt to get autocomplete working.

We've just released a feature on top of wildcards_recursive.py that allows for YAML wildcards. With it, you can write prompts like:

<[Tag1][--Tag2][Tag3|Tag4]>

This takes Tag1, removes any Tag2 occurrences from the resulting prompt, and narrows the choice down to either Tag3 or Tag4.

It also allows for namespaced prompts with tags like:

<Clothing:[Cape]> <Clothing:[Jewelery]>

To get a glimpse of the definitions check out https://github.com/Klokinator/UnivAICharGen/blob/master/wildcards/Clothing.yaml

Our extension basically scrapes all .yaml files in wildcards and joins them together into one big object. For now the files are split up and their naming is stable (maybe there will be some additions). Capitalization is optional; we lowercase everything for the prompt-building logic.

Any suggestions on how to add support for this? Do you want to take it and play with the YAML parsing yourself, or should I open a PR? If the latter, do you have any quick tips on where to start or how you would want this implemented?

I was thinking of implementing the suggestion logic myself, but from the looks of tagAutocomplete.js it would be more work than it's really worth :).

If you don't want to bother with this, I can open a PR; I'm more proficient in JS than Python, so I should find my way around just fine. Just wanted to check in with you first.

catwars#8894 on Discord if you want to DM.

DominikDoom commented 1 year ago

Sounds interesting. The main issue would be that JavaScript lacks native YAML parsing, so I would need either a browser JS library that can load the files (like https://github.com/nodeca/js-yaml), or a workaround that converts them in Python to a directly readable format (or manual parsing, but that would be quite a headache comparatively).
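
For illustration, the js-yaml route could look roughly like the sketch below. This is only a sketch of the idea, not code from the extension; the global jsyaml object comes from including the library via a script tag, and the URL is a made-up placeholder.

```js
// Rough sketch of the js-yaml option: fetch a wildcard file and parse it in
// the browser. Assumes js-yaml is included via a <script> tag (global `jsyaml`);
// the URL passed in is a placeholder, not a real path served by the webui.
async function loadYamlWildcards(url) {
    const response = await fetch(url);
    const text = await response.text();
    // js-yaml turns the YAML document into a plain JS object/array.
    return jsyaml.load(text);
}

// Example usage:
// loadYamlWildcards("Clothing.yaml").then(data => console.log(Object.keys(data)));
```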

Another problem I've seen is nested definitions like __Colors__ Gloves or Metallic {Iron|Steel|Metal|Obsidian} Gauntlets. Since the autocomplete script is intended to replace while typing, it isn't really equipped to handle those intuitively. But I don't know if that's exclusive to the YAML format or applies to your system in general.

Other than that, it could probably be done using the existing methods; I'm just not sure about the format. < is currently used for embedding completion, so there would be a collision if your system also starts with that instead of the old double-underscore format for wildcards. It could also be hard to tell whether the user currently wants completions for the namespace, the tags, or the actual term. The current system makes this easy to separate, since the user essentially searches in one file at a time, and the files are the namespaces, so to speak, if arranged in a proper folder structure. But from your first example it seems the namespace is optional and the user can also use the wildcard tags directly. From that perspective, it might need a completely new parsing system that can differentiate between them somehow.

Can you give me a few more details about what the ideal autocomplete workflow would look like for this system? E.g. in which order to search for namespaces/tags/replacements, what should be optional, what typing should trigger the separate parts, etc.

ctwrs commented 1 year ago

Script-wise we have backwards compatibility with __, and it can be used interchangeably with <>. We also fall through if no match is found, to keep embeddings working.

Having namespace completion would be nice, but I think it would be best to have it at least for tags, like <[Presets][SFW]>, where after you type [ you would get tag completion with the number of matching prompt parts.

Showing it after [ would also remove the collision. Moreover, the embedding completion could even suggest [ to switch into tag completion.

Also, don't worry about the actual prompt parts and their recursion; that's handled by the script. Completion really only needs to cover the tags.

Creating a .txt with the tags and their counts at webui load is doable and pretty easy; we can even handle it in our extension and fetch it into a global var in JS.
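
As a rough illustration of that idea (the file name and the one-pair-per-line "tag,count" format below are made up, not something either extension produces yet), the JS side could be as simple as:

```js
// Minimal sketch: fetch a pre-generated "tag,count" list and keep it in a
// global variable for the completion logic. The file name and format are
// hypothetical placeholders.
let umiWildcardTags = [];

async function loadUmiTags(url = "umi_tags.txt") {
    const text = await (await fetch(url)).text();
    umiWildcardTags = text
        .split("\n")
        .filter(line => line.trim().length > 0)
        .map(line => {
            const [tag, count] = line.split(",");
            return { tag: tag.trim().toLowerCase(), count: Number(count) || 0 };
        });
}
```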

Parsing YAML in JS is a pain. I actually did that for a tool that previews prompt matches, but it's written in Deno Fresh. I still have to update it for non-namespaced prompts, but it's at https://umi-prompt.deno.dev/ (source: https://github.com/ctwrs/umi-prompt/blob/master/islands/Main.tsx).

DominikDoom commented 1 year ago

Alright, thanks for the additional info. Using [ to avoid collisions sounds feasible, so I'll get to it in the next few days.

ctwrs commented 1 year ago

Cool, thanks :). I can do the txt prep part if needed, also in a few days. Currently I'm in Xmas trip prep mode, so I barely have time to eat.

Happy holidays! 🎅

DominikDoom commented 1 year ago

No worries, I'll figure something out either way when I get to that point; until then, dummy data will work just fine. The main work is in the typing parser anyway, to support things like the negative and "or" syntax. I might even restructure the parsing as a whole while I'm at it. I've wanted to split it up into separate parser files for a while now, since the main script is getting a bit annoying to extend as a monolith.

Happy holidays to you too!

DominikDoom commented 1 year ago

I pushed what I was able to implement so far to the feature-yaml-wildcards branch. Please try it out and tell me what you think. I tried to make it as compatible as possible with existing features and hopefully didn't break anything else in the process.

At the moment the YAML wildcard tags are sorted by their count, but alphabetical sorting shouldn't be a problem either if you prefer that. The completion will show as soon as you type <[. The square bracket isn't closed automatically on insertion, since the user might want to follow up with | instead, but I think that could become a settings option later. The Umi tag completion is essentially handled as a sub-prompt to make the tag chaining in your example above possible, so to return to normal autocomplete for the main tags it needs to be properly closed off with >. From a current technical standpoint that means normal tags won't show while editing a Umi sub-prompt; hopefully that isn't an issue.
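
Conceptually, the <[ trigger boils down to checking whether the text before the cursor contains a group that hasn't been closed off with > yet, roughly like the simplified stand-in below (not the actual branch code):

```js
// Simplified stand-in for the <[ trigger, not the real parser: completion
// switches into "Umi mode" while the text before the cursor contains a <[
// group that has not been closed with > yet.
const UNCLOSED_UMI_GROUP = /<\[[^>]*$/;

function isEditingUmiGroup(textBeforeCursor) {
    return UNCLOSED_UMI_GROUP.test(textBeforeCursor);
}

// isEditingUmiGroup("masterpiece, <[Presets][S")              -> true
// isEditingUmiGroup("masterpiece, <[Presets][SFW]> portrait") -> false
```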

There's still one known issue I eventually want to fix before merging it into main, which is that on the first tag the completion window will reappear after typing the closing square bracket. So definitely still WIP, but I wanted some general feedback before digging too deep since I'm not familiar with how your system works exactly.

Klokinator commented 1 year ago

Thank you so much for adding this! I'm the head dev of Umi's content library. Catwars is the coder. You've helped us out a bunch!

Klokinator commented 1 year ago

Oh, and I just published a not-yet-finished, very WIP guide about our tags.

https://github.com/Klokinator/UnivAICharGen/wiki/Umi-AI%27s-Tagging-System

ctwrs commented 1 year ago

The suggestions look awesome. It works just like I envisioned.

It's certainly buggy right now. It doesn't seem to bug out existing features, but existing features surely interfere with it. It's hard to write any sensible repro steps because it's pretty easy to break at the moment.

I have massive feature-creep suggestions that I'm keeping contained for now, so feature-wise this is all we really need. I'll possibly play with the code myself once this version is stable.

Code-wise, I would split out any RegExps into top-scope vars. RegExp init can be slow, but more importantly, a named RegExp makes its intent far more readable. With them separated out, I would set up a simple Jest config to test all of these RegExps. From what I did for Umi, most of the issues came from badly written RegExps :P.

If not as Jest boilerplate, then as a link to regexr, pre-filled with sample data, next to each RegExp.
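
Purely as an illustration of that suggestion (the names and patterns below are invented, not taken from tagAutocomplete.js):

```js
// Illustrative only: inline RegExp literals pulled out into named, top-level
// constants so their intent is documented and they can be unit-tested in
// isolation (e.g. with Jest). Names and patterns are made up for this example.
const WILDCARD_REGEX = /__([\w\s-]*)__/;      // classic __wildcard__ syntax
const UMI_GROUP_REGEX = /<\[[^>\]]*\]?>?/;    // a single <[...] group, possibly unclosed
const EMBEDDING_REGEX = /<(?!\[)([^,> ]*)>?/; // <embedding>, but not <[...

// A matching Jest test could then be as simple as:
// test("matches classic wildcards", () => {
//     expect("__haircolor__".match(WILDCARD_REGEX)[1]).toBe("haircolor");
// });
```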

DominikDoom commented 1 year ago

Alright, thanks for the info.

> It's certainly buggy right now. It doesn't seem to bug out existing features, but existing features surely interfere with it. It's hard to write any sensible repro steps because it's pretty easy to break at the moment.

Even without repro steps, just a list of what seems to behave unintendedly would be a great help. So far I'm mainly aware of the wrong part of the text, or too much of it, being replaced if the user chooses a result without typing anything.

> Code-wise, I would split out any RegExps into top-scope vars. RegExp init can be slow, but more importantly, a named RegExp makes its intent far more readable. With them separated out, I would set up a simple Jest config to test all of these RegExps. From what I did for Umi, most of the issues came from badly written RegExps :P.

Sooner or later definitely, at least for the constant regexes used in prompt parsing. I already have a regexr setup for testing anyway. The biggest issue is how to differentiate the Umi prompt from the normal prompt during regex parsing. This is made harder by the YAML tags containing spaces; since normal tags break on them, I essentially have to find combinations that work with everything.

ctwrs commented 1 year ago

For RegExps it makes sense to do multiple passes of matching: one to get the prompt type, another to get the tags / prompt parts. That way the RegExps don't grow out of control. The spaces issue would also probably benefit from having the closing ] inserted.

Really, having syntax suggestions would probably help, but I think that's a bit hard to implement right now without issues.

DominikDoom commented 1 year ago

Double passes are already pretty much what I'm doing: the regex first checks if the user is currently editing a Umi tag group (so something starting with <[, with spaces supported and optional closing brackets), and then a second pass captures the actual tags. The rest is pretty similar to the normal prompt tags, with a difference check to determine what the user is currently editing. The main issue wasn't the regex, but that difference check not updating correctly after inserting a tag.
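
To make the two-pass idea concrete, a heavily stripped-down sketch could look like the following (hypothetical names, ignoring cursor position and the | alternative syntax, so much simpler than the real parser):

```js
// Heavily simplified two-pass sketch (hypothetical, not the branch code):
// pass 1 finds the Umi tag group being edited, pass 2 extracts the single
// [Tag] parts from it so the last one can be completed.
const UMI_GROUP = /<\[[\w\s|\[\]-]*>?/g; // pass 1: whole <[...]...> group, spaces allowed
const UMI_TAG = /\[([^\[\]|]*)\]?/g;     // pass 2: single [Tag] parts, maybe unclosed
                                         // (| alternatives are ignored here)

function getUmiTags(prompt) {
    const groups = prompt.match(UMI_GROUP) || [];
    if (groups.length === 0) return null;

    // Only the last group matters for completion in this simplified version.
    const lastGroup = groups[groups.length - 1];
    return [...lastGroup.matchAll(UMI_TAG)].map(m => m[1].trim());
}

// getUmiTags("portrait, <[Presets][--Chibi][SF")
//   -> ["Presets", "--Chibi", "SF"] (the last entry is what the user is typing)
```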

With the most recent push I think most of the insertion and double-suggestion issues got taken care of. The biggest remaining issue I'm aware of is that every Umi tag group after the first doesn't show the full list without the user typing something, and the same happens during editing. But I should be able to fix that (probably tomorrow). Regular completion seems to work pretty stably now (feel free to correct me on that if you find something), aside from a few edge cases where the Umi prompt isn't closed correctly.

DominikDoom commented 1 year ago

I believe it is stable now. I was able to fix most of the known issues, except that the "All tags" list isn't shown again when fully deleting a tag during editing. That's pretty hard to detect during parsing, since the autocomplete only captures the tags themselves and doesn't care about their position in the prompt. The list will still show once the user types the first letter, though, so I think it's a minor inconvenience overall.

Feel free to try it out again; if you don't spot anything else, I think we can push it to main.

ctwrs commented 1 year ago

@DominikDoom Works well enough for me. It doesn't look like it's bugging out, and the "All tags" issue is an edge case that can be fixed later. For now, getting suggestions working is the main thing.

It can be merged IMO. Thanks a lot for this work :)!

DominikDoom commented 1 year ago

Glad I could help, and thanks for all the feedback! Yeah, edge cases and general refactoring are definitely on the list, but after the holidays when I have more time. Of course, feel free to open new issues or pull requests regarding that.