This is a conversation starter and maybe should be thrown away.
Basic idea:
the prompt "A,B,C,D" can have its comma-separated sections permuted during training, with an optional number of leading sections left unpermuted.
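The idea can be sketched at the text level roughly like this. This is a hedged illustration, not the actual implementation: `shuffle_prompt` and `keep_first_n` are hypothetical names standing in for whatever `shuffle_after_nth_comma` controls in the real code.

```python
import random

def shuffle_prompt(prompt: str, keep_first_n: int = 1) -> str:
    """Permute comma-separated sections, keeping the first N in place."""
    sections = [s.strip() for s in prompt.split(",")]
    head, tail = sections[:keep_first_n], sections[keep_first_n:]
    random.shuffle(tail)  # only the trailing sections get permuted
    return ", ".join(head + tail)

# "A" stays first; "B", "C", "D" land in some random order.
print(shuffle_prompt("A, B, C, D", keep_first_n=1))
```

The actual POC shuffles token spans in `input_ids` instead of re-tokenizing shuffled text, which is part of the open question above.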
Anyways, labeled as a POC because there isn't a clear ideal way to do this, or at least this isn't it.
Is this how you want this done? Or would doing the shuffling against the actual text and recreating input_ids every time make more sense, or something else entirely?
I'm not familiar with PyTorch, so I suspect what I added can be optimized.
For danbooru-style prompts this improved the dropout behavior of individual words, decreased overfitting of the sample prompt, and made order not matter. I didn't try this on more typical 1.4/1.5 prompts; I imagine it would behave similarly.
What is not implemented, but would be worth doing, is partial prompt dropout; at least I've seen that done in some of the other training tools.
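For reference, partial prompt dropout could look something like the sketch below. Everything here is hypothetical (`drop_tags`, `drop_rate`, and `keep_first_n` are made-up names, not taken from any existing tool): each trailing section is independently dropped with some probability while the leading sections are kept.

```python
import random

def drop_tags(prompt: str, drop_rate: float = 0.1, keep_first_n: int = 1) -> str:
    """Randomly drop trailing comma-separated sections, keeping the first N."""
    sections = [s.strip() for s in prompt.split(",")]
    head, tail = sections[:keep_first_n], sections[keep_first_n:]
    # each trailing tag survives with probability (1 - drop_rate)
    kept = [s for s in tail if random.random() >= drop_rate]
    return ", ".join(head + kept)
```

In practice this would compose naturally with the shuffle: drop first, then permute what survives.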
So a technically working implementation of this with a bunch of asterisks:
* Not wired into the GUI; changing `shuffle_after_nth_comma` by hand is easy enough to validate this experiment.
* Has a performance impact, roughly 10%; your mileage may vary. Running on a 3060 12GB.
* Only works with the `",<\w>"` comma token, i.e. a comma with optional trailing whitespace.
* Wasn't tested on anything other than well-formed, comma-laden sample prompts.