Open rumatoest opened 1 year ago
This sounds like a good idea. But using signal interceptors like Ctrl-C or Ctrl-D and changing their semantics feels like a bit of a hack. I wonder if we could simply intercept a key input via stdin instead, so that pressing q (for quit) or s (for stop) would stop generation.
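A minimal sketch of that stdin approach, assuming a background thread that watches input for "q" or "s" and sets a flag the generation loop checks (the names `watch_keys` and `generate_tokens` are hypothetical, not the project's actual code):

```python
import sys
import threading

stop_requested = threading.Event()

def watch_keys():
    # Read stdin; a lone "q" or "s" requests a stop.
    # (A real implementation would put the terminal in raw mode via
    # termios/tty so a single keypress works without pressing Enter.)
    for line in sys.stdin:
        if line.strip().lower() in ("q", "s"):
            stop_requested.set()
            break

def generate_tokens():
    # Stand-in for the model's token stream.
    for i in range(1000):
        yield f"token-{i} "

watcher = threading.Thread(target=watch_keys, daemon=True)
watcher.start()

for token in generate_tokens():
    if stop_requested.is_set():
        print("\n[generation stopped]")
        break
    print(token, end="", flush=True)
```

The nice part is that it leaves Ctrl-C's semantics untouched, at the cost of the raw-mode terminal handling hinted at in the comment.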
tonloc on the Discord mentioned interest in working on this; I'd suggest anyone interested in this issue sync up with them first.
But using signal interceptors like Ctrl-C or Ctrl-D and changing their semantics feels like a bit of a hack.
EDIT: the below is also pertinent to chat-experimental and was written in that context.
I thought so at first, but llama.cpp does this and it's not really a hack if you consider the following REPL interfaces:
cmdline:
=> usr input
stdout
stdout
.....
ctrl-c
=> usr input 2
stdout...
python -i
>>> for i in range(1010000000): print(i)
1
2
3
...
124254
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> print("hello world")
hello world
llama-repl:
=> usr input
llama out
llama out
...
^C User Interrupt
=> usr input
llama out
...
Incidentally:
I'd also like it if the REPL showed a thinking/loading icon while it is not generating output. I don't like llama.cpp's interface, which requires the user to press Enter to toggle between context windows... I do like chat-experimental's running without stopping. But I could also see myself liking it if the model were to decide it's time to ask the user to prompt further.
Hello. I've found that models very often tend to generate nonsense responses. It is very inconvenient when I'm in --repl mode. I would like to stop this nonsense generator and enter another prompt, but the only way to abort such madness is CTRL+C, which exits the REPL. Would you be so kind as to add CTRL+D or some other hotkey to stop the REPL from generating further tokens and start accepting new input?