Open a WebSocket between the client and server so that the server can stream LLM (llama) responses to the client.
This is not a trivial change: we need to support streaming the LLM responses while continuing to support the existing non-streaming, non-LLM responses (e.g. from the swap agent). The tricky part is handling the end-of-response tags / response terminators consistently across all of these response types.
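One way to sidestep the terminator problem is to give both response kinds the same framing: streaming responses become many "chunk" frames followed by a single "done" frame, and non-streaming responses become one chunk plus the same "done" frame, so the client has exactly one end-of-response rule. A minimal sketch (the frame shape, field names, and helper functions here are all hypothetical, not the existing protocol):

```python
import json
from typing import Iterator

# Hypothetical wire protocol: every server->client message is one JSON frame.
# Streaming (LLM) responses produce many {"type": "chunk"} frames and end
# with {"type": "done"}; non-streaming responses (e.g. the swap agent)
# produce a single chunk followed by the same terminator frame.

def frames_for_stream(chunks: Iterator[str]) -> Iterator[str]:
    """Wrap an incremental LLM response in chunk frames plus a terminator."""
    for chunk in chunks:
        yield json.dumps({"type": "chunk", "text": chunk})
    yield json.dumps({"type": "done"})

def frames_for_complete(text: str) -> Iterator[str]:
    """Wrap a complete non-streaming response in the same framing."""
    yield from frames_for_stream(iter([text]))

def read_response(frames: Iterator[str]) -> str:
    """Client side: accumulate chunk text until the terminator frame."""
    parts = []
    for raw in frames:
        msg = json.loads(raw)
        if msg["type"] == "done":
            break
        parts.append(msg["text"])
    return "".join(parts)
```

With this shape, the same client-side reader handles both paths, and in-band "end of response" tags in the LLM text are no longer needed as terminators.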