Anyone who has used ChatGPT is familiar with the way the LLM streams its answer in chunks, with no need to wait for the whole response to load. It would be great to have an `-s` or `--stream` option that lets the user watch the response being generated in real time.
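For reference, here is a minimal sketch of what the streaming path could look like, assuming the tool talks to the OpenAI Chat Completions API through the official `openai` Python client; the model name and prompt are illustrative placeholders, not the project's actual configuration:

```python
# Sketch: print tokens as they arrive instead of waiting for the
# full completion. Assumes the official `openai` Python client;
# model and prompt below are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain streaming in one paragraph."}],
    stream=True,  # ask the API to deliver the answer in chunks
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # some chunks (e.g. the final one) carry no text
        print(delta, end="", flush=True)
print()
```

With `stream=True` the client returns an iterator of chunks, so the CLI could flush each delta to stdout as it arrives, producing the ChatGPT-style typing effect.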