Even Azure models that support streaming won't do it; the entire response is always returned in one chunk.
I have found no way to enable streaming using configuration, and from the code it doesn't seem possible. The problem appears to be with the `can_stream` property of the `llm.Model` class. Even if you define it using a `config.yaml`, it is ignored by the llm-azure plugin. `AzureChat` extends the OpenAI `Chat`, which in turn extends `llm.Model`. In `Chat`, `can_stream` is `True` by default, but this doesn't take effect because `AzureChat` doesn't call `super().__init__()`, so it becomes effectively `False` for all Azure models.
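A minimal standalone sketch of the attribute lookup being described, using stand-in classes rather than the actual llm / llm-azure source:

```python
# Stand-in classes illustrating the lookup behaviour, not the real library code.

class Model:                      # plays the role of llm.Model
    can_stream: bool = False      # class-level default

class Chat(Model):                # plays the role of the OpenAI Chat model
    def __init__(self, model_id, can_stream=True):
        self.model_id = model_id
        self.can_stream = can_stream      # set on the instance, normally True

class AzureChat(Chat):            # plays the role of the llm-azure model
    def __init__(self, model_id):
        self.model_id = model_id
        # no super().__init__() call, so self.can_stream is never assigned

print(Chat("gpt-4").can_stream)               # True
print(AzureChat("my-deployment").can_stream)  # False: falls back to Model.can_stream
```

Because the subclass skips `super().__init__()`, the instance never gets its own `can_stream` attribute and lookup falls back to the `False` default on the base class.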
I propose to check `config.yaml` for a `can_stream` key, use it if present, and assume `True` otherwise. I will submit a PR shortly.
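A hypothetical sketch of that behaviour, not the actual PR; only the `can_stream` key is part of the proposal, and the loader shape and `model_id` field are assumptions about the plugin's `config.yaml` layout:

```python
import yaml

def load_config(path="config.yaml"):
    # Assumed layout: a YAML list of model entries.
    with open(path) as f:
        return yaml.safe_load(f) or []

def can_stream_for(entry: dict) -> bool:
    # Use an explicit can_stream key if present, assume True otherwise.
    return bool(entry.get("can_stream", True))

for entry in load_config():
    print(entry["model_id"], "can_stream =", can_stream_for(entry))
```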