AUTOMATIC1111 / stable-diffusion-webui-tokenizer

An extension for stable-diffusion-webui that adds a tab that lets you preview how the CLIP model would tokenize your text.

Add support for SDXL model #9

Open zero41120 opened 1 month ago

zero41120 commented 1 month ago

This extension works with SD1.5 models but fails with SDXL. I checked the source code and found that SDXL is handled at this line: https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/feee37d75f1b168768014e4634dcb156ee649c05/modules/sd_hijack_clip.py#L349.

The current logic only checks shared.sd_model.cond_stage_model.wrapped, while the hijacked SDXL FrozenCLIPEmbedderForSDXLWithCustomWords is located in shared.sd_model.cond_stage_model.embedders[0].wrapped.
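The two layouts described above can be handled by one lookup. This is only a minimal sketch, not the code in this pull request: `find_wrapped_embedders` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins for `shared.sd_model.cond_stage_model` so the example runs without the webui.

```python
from types import SimpleNamespace


def find_wrapped_embedders(cond_stage_model):
    """Collect every `.wrapped` object reachable from the conditioning model.

    Hypothetical helper: it looks at `cond_stage_model.wrapped` first (the
    SD1.5-style layout) and then at `cond_stage_model.embedders[i].wrapped`
    (the SDXL-style layout).
    """
    candidates = []

    # SD1.5-style: the hijacked embedder exposes `.wrapped` directly.
    direct = getattr(cond_stage_model, "wrapped", None)
    if direct is not None:
        candidates.append(direct)

    # SDXL-style: `.wrapped` lives on each entry of `.embedders`.
    for embedder in getattr(cond_stage_model, "embedders", None) or []:
        wrapped = getattr(embedder, "wrapped", None)
        if wrapped is not None:
            candidates.append(wrapped)

    return candidates


# Stand-ins for the real models (assumption: only the attribute layout matters).
sd15_like = SimpleNamespace(wrapped="sd15-clip")
sdxl_like = SimpleNamespace(
    embedders=[SimpleNamespace(wrapped="sdxl-first"),
               SimpleNamespace(wrapped="sdxl-second")]
)

sd15_found = find_wrapped_embedders(sd15_like)
sdxl_found = find_wrapped_embedders(sdxl_like)
```

With a lookup like this, a model that has neither attribute yields an empty list, which the caller can report as unsupported instead of raising an AttributeError.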

I lack the technical expertise to confirm whether the tokenizers differ. However, SD1.5 and SDXL appear to produce the same token IDs, and the source code also suggests that the vocab maps to Unicode, so any model's tokenizer should be suitable.

In this pull request, I refactored the logic so that VanillaClip and OpenClip no longer depend on instance checks; they now check for the attribute they require. I pass embedders[index].wrapped to the Clip classes, so the first one with a matching attribute is used for tokenization. Because SDXL and SD1.5 appear to produce the same tokenization, I didn't include logic to identify the best "embedder" (which should be FrozenCLIPEmbedderForSDXLWithCustomWords).
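As a rough illustration of replacing instance checks with attribute checks, the sketch below picks the first candidate that exposes a required attribute. It is not the code from this pull request: `select_adapter` is a hypothetical helper, and the attribute names `tokenizer` and `model` are assumptions chosen for the example, not confirmed requirements of VanillaClip or OpenClip.

```python
from types import SimpleNamespace

# Hypothetical mapping from adapter name to the attribute it needs.
# The attribute names are assumptions made for this example only.
REQUIRED_ATTRS = {
    "VanillaClip": "tokenizer",
    "OpenClip": "model",
}


def select_adapter(wrapped_candidates):
    """Return (adapter_name, wrapped) for the first candidate that matches.

    Candidates are tried in order, so the first wrapped embedder that has
    any required attribute wins. Returns None when nothing matches.
    """
    for wrapped in wrapped_candidates:
        for adapter_name, attr in REQUIRED_ATTRS.items():
            if hasattr(wrapped, attr):
                return adapter_name, wrapped
    return None


# Stand-in objects; real embedders would come from the loaded model.
with_tokenizer = SimpleNamespace(tokenizer=object())
with_model = SimpleNamespace(model=object())
unrelated = SimpleNamespace()

first_match = select_adapter([unrelated, with_model, with_tokenizer])
no_match = select_adapter([unrelated])
```

Checking attributes rather than classes means a new wrapper class works as long as it exposes the same attribute. The trade-off is that an unrelated object with the same attribute name would also match.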

Lastly, token colors were barely visible in the dark theme. I updated them to use the theme color instead of hardcoded RGB values.

[Screenshot 2024-06-13 at 18 15 04] [Screenshot 2024-06-13 at 18 15 15]