How to send TTS reply sentence by sentence for longer text

A possibly related observation: by default, my C# ASP.NET API responds with Transfer-Encoding: chunked and no Content-Length header. In that case Willow just reads aloud "Success" instead of the body I send, because it cannot determine the body length. If I change my code to force a Content-Length header, it reads the body correctly.
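
For reference, one way to force Content-Length in ASP.NET Core is to encode the whole body first and set the length before writing. This is only a minimal sketch under that assumption: the controller name, route, and reply text are hypothetical, and it is not necessarily how the change described above was made.

```csharp
using System.Text;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class ReplyController : ControllerBase   // hypothetical controller name
{
    [HttpGet("reply")]                          // hypothetical route
    public async Task Reply()
    {
        // Encode the full body up front so its byte length is known.
        byte[] body = Encoding.UTF8.GetBytes("Text for Willow to read aloud.");

        Response.ContentType = "text/plain; charset=utf-8";
        // With ContentLength set, the server sends a Content-Length header
        // instead of falling back to chunked transfer encoding.
        Response.ContentLength = body.Length;
        await Response.Body.WriteAsync(body, 0, body.Length);
    }
}
```

The trade-off is that the complete reply must exist before the first byte is sent, which is the opposite of streaming it part by part.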

This got me thinking... could my request above be implemented using chunked transfer encoding?

Something like this proposal from GPT-4:

[HttpGet("stream")]
public async Task StreamResponse()
{
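    // No manual Transfer-Encoding header: ASP.NET Core (Kestrel) chunks the
    // response automatically when no Content-Length is set, and setting the
    // header by hand can produce a malformed response.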
    foreach (var part in GetDataParts())
    {
        await Response.WriteAsync(part);
        await Response.Body.FlushAsync(); // Important to flush the stream
        // Simulate some real-time delay or processing
        await Task.Delay(1000);
    }
}

private IEnumerable<string> GetDataParts()
{
    yield return "Part 1 ";
    yield return "Part 2 ";
    yield return "Part 3 ";
}
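
To get actual sentences rather than fixed parts, the reply text could be split on sentence-ending punctuation before it is written out. The helper below is a hypothetical sketch (`SentenceSplitter` is not part of Willow or ASP.NET), and the rule is deliberately naive: it splits after `.`, `!` or `?` followed by whitespace, so abbreviations such as "e.g." would also be treated as sentence ends.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SentenceSplitter   // hypothetical helper name
{
    // Yields each sentence of the input, trimmed, skipping empty pieces.
    public static IEnumerable<string> Split(string text)
    {
        // Split at whitespace that directly follows '.', '!' or '?'.
        foreach (var part in Regex.Split(text, @"(?<=[.!?])\s+"))
        {
            var sentence = part.Trim();
            if (sentence.Length > 0)
                yield return sentence;
        }
    }
}
```

GetDataParts() could then return SentenceSplitter.Split(replyText) for some reply string. Whether Willow can consume the parts incrementally is a separate question.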

The difficulty is that the ESP box would then need to keep contacting the inference server to get audio for each separate sentence as it comes in.

toverainc / willow

How to send TTS reply sentence by sentence for longer text #326