I'm probably missing something obvious here, but I can't get the new ContextShift feature to work. I downloaded the newest koboldcpp version, launched the GUI, selected ContextShift (and deselected SmartContext to be sure), and let it load. In the web GUI I reset all settings, switched to story mode, and pasted in the first chapter of Oliver Twist. Then I let it generate. Of course, for the first round it said:
Processing Prompt [BLAS] (1480 / 1480 tokens) Generating (120 / 120 tokens)
I think that is to be expected? But then I let it generate some more, and it again did the BLAS processing of the whole 1480 tokens every time I hit the "Generate some more" button. Am I doing something wrong here? Are there any other options I need to enable? Does it only work with certain models (I'm currently using wizardLM-7B.ggmlv3.q4_1.bin)? Memory and World Info are completely empty.