LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0
4.98k stars 349 forks

How to enable infinite generations? #820

Open yukiarimo opened 5 months ago

yukiarimo commented 5 months ago

I want the model, no matter what tokens are printed, to continue forever until I stop it. Is it possible?

kaylaa0 commented 5 months ago

Would banning the EOS (End-Of-Stream) token be a possible solution?

Settings -> Advanced -> EOS Token Ban

yukiarimo commented 5 months ago

Yes, that takes care of the stop token, but the Amount to Gen option maxes out at 512. Can it be set to any number, like 99999, so the AI generates an entire novel for me?!

LostRuins commented 5 months ago

It can be set higher, up to about 80% of the max context length. Try increasing your max context length first, then manually override the amount to gen by typing in a number larger than 512.
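
If you drive koboldcpp from a script instead of the UI, the same rule of thumb can be sketched roughly like this. It assumes the KoboldAI-style `/api/v1/generate` endpoint on the default local port 5001, and it assumes `use_default_badwordsids` is the API-side switch for the EOS ban; check the API docs of your version before relying on either. `clamp_amount_to_gen` and `build_payload` are made-up helper names, and the 0.8 factor is just the "about 80%" figure from above, not an exact limit.

```python
import json
import urllib.request


def clamp_amount_to_gen(requested, max_context_length, fraction=0.8):
    """Cap the requested amount to gen at roughly `fraction` of the context."""
    limit = int(max_context_length * fraction)
    return max(1, min(requested, limit))


def build_payload(prompt, requested, max_context_length):
    """Build a request body for the generate endpoint (field names assumed)."""
    return {
        "prompt": prompt,
        "max_context_length": max_context_length,
        "max_length": clamp_amount_to_gen(requested, max_context_length),
        # Assumption: this flag bans the EOS token on the API side.
        "use_default_badwordsids": True,
    }


def generate(payload, url="http://localhost:5001/api/v1/generate"):
    """POST the payload to a locally running koboldcpp and return the text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"][0]["text"]
```

For example, `build_payload("Once upon a time", 99999, 4096)` would ask for 3276 tokens rather than 99999, since that is about 80% of a 4096-token context.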

aleksusklim commented 5 months ago

Not only can you type any number into "amount to generate", but you can also just press the send button with an empty input box to force the model to continue right where it stopped! So you don't actually need a very long chunk of text unless you leave your machine while it is busy generating.

Personally, I often edit the model's output, so generating very long text is useless for me: it would all be reprocessed on my next turn if I edited something.
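
The "send with an empty input box" trick can also be scripted as a loop that keeps feeding the story back in. This is only a minimal sketch: `generate_fn` is a hypothetical callable that takes the text so far and returns the next chunk from whatever backend you use, and the loop does not trim the text to the context window, so a real script would have to handle that.

```python
def continue_story(generate_fn, prompt, max_rounds=None, on_chunk=None):
    """Keep asking for continuations until stopped.

    generate_fn: hypothetical callable, text so far -> next chunk of text.
    max_rounds:  None means run until Ctrl+C or an empty chunk.
    on_chunk:    optional callback, e.g. print, called with each new chunk.
    """
    text = prompt
    rounds = 0
    try:
        while max_rounds is None or rounds < max_rounds:
            chunk = generate_fn(text)
            if not chunk:
                # The backend returned nothing, so there is nothing to append.
                break
            text += chunk
            if on_chunk is not None:
                on_chunk(chunk)
            rounds += 1
    except KeyboardInterrupt:
        # Ctrl+C stops the loop but keeps everything generated so far.
        pass
    return text
```

Small chunks per round keep the cost of editing low, since only the text after your edit has to be reprocessed.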

Denplay195 commented 5 months ago

Also, you can use Idle Responses to keep the generation going indefinitely, as if you were continuing it manually. The only problem is that it can get stuck somewhere even after banning EOS tokens, but for me that is a rare occasion.

yukiarimo commented 5 months ago

Idle Responses

What is that?

Denplay195 commented 5 months ago

What is that?

I wonder if you are using KoboldAI Lite or not, but it has that in advanced settings.

yukiarimo commented 5 months ago

I wonder if you are using KoboldAI Lite or not, but it has that in advanced settings

I do. I'm just running python koboldcpp.py! Anyway, what does this thing do?

LostRuins commented 5 months ago

Idle Responses allow the AI to automatically continue the response without user input after some amount of inactive time.
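
To illustrate the idea only (this is not how KoboldAI Lite implements it), an idle trigger could look roughly like the sketch below. `IdleTrigger` is a made-up name, and the clock is injectable so the behaviour can be checked without actually waiting.

```python
import time


class IdleTrigger:
    """Reports when no user activity has happened for `idle_seconds`."""

    def __init__(self, idle_seconds, clock=time.monotonic):
        self.idle_seconds = idle_seconds
        self.clock = clock
        self.last_activity = clock()

    def touch(self):
        """Record user activity (typing, sending, editing)."""
        self.last_activity = self.clock()

    def should_continue(self):
        """True once the idle period has elapsed since the last activity."""
        return self.clock() - self.last_activity >= self.idle_seconds
```

A UI loop would call `touch()` on every user action and request another continuation whenever `should_continue()` returns true, calling `touch()` again afterwards.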