Iluvalar opened 8 months ago
Use case: make someone happier for 5 steps;
[\:D::5]
or after 5 steps;
[:\:D:5]
For example, if I prompt " (blue\:3)" one would expect " ['blue:3', 1.1]" not " ['blue', 3.3]"
i would expect that it parses to ['blue', 3.0]
since 3 is just the int version of the float 3.0.
what is the use-case of expanding it to ['blue:3', 1.1]?
in either case, this is not an issue - if you want to change the behavior, that would be a feature request and i'm reluctant to ok it without a strong use-case.
- "dex:15" <- I'm generating a D&D character
- "manacost:3" <- I'm making a Magic: The Gathering card
- "0:0" <- I want that on a sign. Just because I want that on a sign.
- "age:23" <- why not?
- "quantity:5"
- "codename:007"
- "page:6"
- ":3" <- a known emoticon
- "pi:3.1416"
I do realize, now that I'm experimenting with it, that I can hack around it by adding a random comma after the prompt and at least get an image. But it's not the same as being able to prompt it directly. I'm curious to prompt "codename:007", not "codename:007 ." or "codename:007," or "codename:007 \". These all give a different James Bond with the same seed.
Because there are workarounds, it's not the strongest use case. However, I still think one would expect to be able to escape the ":" character. It shouldn't be too hard to implement and it would only affect a minimal number of prompts. Besides "/!\:1.4" I can't really imagine what other prompt this would affect, and one would only need to prompt "/!\\:1.4" to get the current result.
Edit: Git escaped my "\:" characters in the message.
my concern is that i'd be changing normal/documented/expected behavior to fit an edge-case where behavior is anything but documented. why? let's skip weights and just look at the raw prompt quantity:5
tokenizer and text-encoder will happily do something, but what is that something? does clip actually define the meaning of such a prompt?
only way to tell (that i can think of) is to run the raw prompt via tokenizer and encoder, get tokens and then do reverse lookups for those tokens and compare different variations - for example, what is the meaning of quantity:5 vs quantity,5 or just quantity 5 or even quantity5 without a space (since tokenizer breaks words anyhow)?
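That comparison can be sketched at the pre-tokenization level. The pattern below is a simplified stand-in for CLIP's BPE pre-tokenizer regex (the real tokenizer also lowercases, applies BPE merges, and marks word endings, so actual token ids may still differ):

```python
import re

# Simplified approximation of CLIP's pre-tokenization pattern:
# runs of letters, single digits, or runs of other punctuation.
PAT = re.compile(r"[A-Za-z]+|[0-9]|[^\sA-Za-z0-9]+")

def pre_tokenize(prompt):
    return PAT.findall(prompt)

for p in ("quantity:5", "quantity,5", "quantity 5", "quantity5"):
    print(p, "->", pre_tokenize(p))
# quantity:5 -> ['quantity', ':', '5']
# quantity,5 -> ['quantity', ',', '5']
# quantity 5 -> ['quantity', '5']
# quantity5  -> ['quantity', '5']
```

Under this approximation the ":" and "," variants differ only by one punctuation token, while the spaced and unspaced variants collapse to the same pieces before BPE.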
that's a valid and interesting thing to analyze, but i cannot afford the time right now.
so if you want this feature, then you'd have to explain what quantity:5 exactly does. not visually, but actual behavior.
modifying prompt parser to fit edge case of undocumented behavior would be less than ideal.
It's already documented. It is in the documentation that people should use "\" to escape special characters.
I certainly understand that your time is limited, and this is obviously not a life-or-death matter. I'm just curious to explore the usage of a new character and find myself limited by the lack of an escape for it. But I will survive; it's pure curiosity.
i mean what is the documented behavior of quantity:5 in clip tokenizer/text-encoder?
Yes, I understand your question. I was trying to dodge it. :D
As I use the token inspector in automatic1111. It seems that clip have plenty of tokens that contain ":". It seems to me that it interpret it just like any other characters.
However, the way you ask this, make me fear there is something deeper of sens that I'm missing in your question... I don't see why clip would treat ":" as a special character or how. But I have no quick way to test it. As I've been spoiled by automatic from most of my image generating experience.
Similar tokens: :-)(4223) ;-)(10475) :-((25137) :)(1408) 😃(8520) :-(13021) 😀(7334) 😊(3020) :))(10164) ;)(3661) 🙂(14860) :)))(21840) 😃(14079) :-(10323) 😄(7624) :((5965) 😊😊😊(28685) 😀(12832) .....(3104) :')(10701) ....(1390) =)(17840) :/(13604) 😊(4692) 😁(4821) ☺(8703) 😋(7203) ))(5167) 😊😊(20842) lol(1824)
so all the known instances of ":" in clip are emojis - not surprising.
and that's my point - if we do escaping as you suggest and pass quantity:5 to clip as-is, what does that even mean to clip? it's not recognized, so tokenizer will do some word-breaking and tokenize each section separately.
so how then is quantity:5 different than quantity5 or quantity 5? perhaps clip would use a different break between them, but none of them are really documented behavior. so why would writing quantity:5 be specially supported in sdnext when it's not in clip itself?
I'm sorry, I gave you a list of tokens most similar to ":-)". Here is ":"
Similar tokens: :(281) .(269) ,(267) !(256) ;(282) #(258) -(268) ?(286) "(257) ((263) @(287) ":(7811) ...(678) =(284) ):(4143) ':(7182) !:(17545) :"(12089) ."(1081) :(25) for(556) of(539) !!(748) at(536) 's(568) :-(10323) and(537) is(533) to(531) !!!(995)
It has the same magnitude as other characters like "a", "b" or "!".
These AIs learned to read English via training, and ":" itself is a token. I'm not sure why you would expect it to behave differently?
I don't understand why it would, since I can use the character "🧛♀️" to make a female vampire, and "🧛🏻♀️" to make one with slightly paler skin. Since clip & SD learned the skin-modifier character, I doubt they learned anything special about a common character like ":".
Good news! I found a viable workaround! I made myself an empty embedding (0.0001% of ",", to be specific, because a purely empty one wouldn't save).
This way I can make all the blue cute kitties I want with the prompt "(:3 emb_empty:1.5) blue". I guess it also proves that the character combination yields something unique.
This works because the code I showed in the opening message tests whether everything right of the ":" character is numerical. So by adding an empty embedding I managed to do everything I wanted from this.
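The check described above can be sketched as a tiny predicate (`is_weight` is a hypothetical helper, not the actual webui code): a trailing ":something" is only consumed as a weight when "something" is purely numeric, which is why appending non-numeric text such as an embedding name protects the colon.

```python
import re

# Hypothetical sketch of the rule: ':x' acts as a weight only if x is numeric.
def is_weight(s: str) -> bool:
    return re.fullmatch(r"[+-]?[\d.]+", s) is not None

print(is_weight("007"))         # numeric -> the colon is eaten as a weight
print(is_weight("3 emb_empty")) # not numeric -> ':3' survives as literal text
```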
Issue Description
As previously stated in this issue: https://github.com/vladmandic/automatic/issues/1071 - escaping a colon does not escape the colon.
For example, if I prompt " (blue\:3)"
one would expect " ['blue:3', 1.1]" not " ['blue\', 3.3]"
It would always be possible to prompt "(blue\\:3)" if the latter was ever desired.
In other words, I believe we'd need an "elif text == ':'" on line 200 and the relevant code on line 201 of
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/c1c27dad3ba371a5ae344b267c760aa51e77f193/modules/prompt_parser.py
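A minimal sketch of what that change could look like. This is a hypothetical re-implementation of the attention parser extended with the proposed "\:" escape, not the actual `prompt_parser.py` code; square-bracket (attention-down) groups and nesting edge cases are omitted for brevity:

```python
import re

# Tokenizer for a reduced attention grammar, with '\:' added as an escape.
TOKEN = re.compile(r"""
      \\:                # escaped colon -> literal ':'
    | \\[()\\]           # other escaped specials -> literal character
    | \(                 # opens an attention group
    | :([+-]?[\d.]+)\)   # ':weight)' closes a group with an explicit weight
    | \)                 # bare ')' closes a group with the default 1.1 boost
    | [^\\():]+          # ordinary text
    | :                  # bare colon, kept as text
""", re.X)

def parse_attention(prompt):
    res = []     # list of [text, weight] pairs
    starts = []  # indices in res where open '(' groups began
    for m in TOKEN.finditer(prompt):
        text, weight = m.group(0), m.group(1)
        if text == "(":
            starts.append(len(res))
        elif weight is not None or text == ")":
            mult = float(weight) if weight is not None else 1.1
            start = starts.pop() if starts else 0
            for pair in res[start:]:
                pair[1] *= mult
        else:
            if text.startswith("\\"):
                text = text[1:]  # drop the escape, keep the literal character
            res.append([text, 1.0])
    # merge adjacent segments that ended up with the same weight
    merged = []
    for text, w in res:
        if merged and merged[-1][1] == w:
            merged[-1][0] += text
        else:
            merged.append([text, w])
    return merged

print(parse_attention(r"(blue\:3)"))  # [['blue:3', 1.1]]
print(parse_attention("(blue:3)"))    # [['blue', 3.0]]
```

With the escape in place, "(blue\:3)" yields the literal text "blue:3" boosted by the default 1.1, while the unescaped "(blue:3)" keeps its documented weight-3.0 meaning.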
Version Platform Description
No response
Relevant log output
No response
Backend
Original
Branch
Master
Model
SD 1.5
Acknowledgements