Streaming to standard input fails with an error

simonw commented 2 hours ago

curl -s 'https://static.simonwillison.net/static/2024/russian-pelican-in-spanish.mp3' | llm whisper-api -

  File "/opt/homebrew/Cellar/llm/0.16/libexec/lib/python3.12/site-packages/h11/_connection.py", line 512, in send
    data_list = self.send_with_data_passthrough(event)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/llm/0.16/libexec/lib/python3.12/site-packages/h11/_connection.py", line 545, in send_with_data_passthrough
    writer(event, data_list.append)
  File "/opt/homebrew/Cellar/llm/0.16/libexec/lib/python3.12/site-packages/h11/_writers.py", line 65, in __call__
    self.send_data(event.data, write)
  File "/opt/homebrew/Cellar/llm/0.16/libexec/lib/python3.12/site-packages/h11/_writers.py", line 91, in send_data
    raise LocalProtocolError("Too much data for declared Content-Length")
h11._util.LocalProtocolError: Too much data for declared Content-Length

I need to read it into memory first so I can calculate the correct length for the upload.
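A minimal sketch of that idea, not the plugin's actual code: drain the incoming stream into a buffer before uploading, so the byte count is known up front and the declared Content-Length can match what is sent. The helper name `read_fully` is hypothetical.

```python
import io


def read_fully(stream):
    """Read a binary stream to the end and return (buffer, length).

    Hypothetical helper for illustration. A pipe such as stdin has no
    known size, so the whole payload is read into memory first.
    """
    data = stream.read()          # blocks until EOF on the pipe
    buffer = io.BytesIO(data)     # seekable, in-memory copy of the payload
    return buffer, len(data)
```

For piped input this would be called as `read_fully(sys.stdin.buffer)`. The trade-off is that the whole audio file has to fit in memory.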

simonw commented 2 hours ago

Here's a weird detail:

https://github.com/simonw/llm-whisper-api/blob/3a037aef3fd6e12355f377b37fa757b2d0440229/llm_whisper_api.py#L52-L53

If you omit the filename, the OpenAI API returns a 400.

If you leave off the .mp3 extension (I tried just audio_file.name = "audio") you also get a 400.

But... it turns out you can send .wav data or .mp3 data with the same audio.mp3 filename and both work just fine.
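A rough sketch of the filename trick described above, assuming the audio is held in an `io.BytesIO` and that the HTTP client takes the upload filename from the file object's `name` attribute (an assumption here, not confirmed by this thread). The helper name `named_audio_buffer` is hypothetical.

```python
import io


def named_audio_buffer(data: bytes, filename: str = "audio.mp3") -> io.BytesIO:
    """Wrap raw audio bytes in a buffer that carries a filename.

    Hypothetical helper for illustration. The default "audio.mp3" follows
    the observation above that the same name works for .wav and .mp3 data.
    """
    audio_file = io.BytesIO(data)
    audio_file.name = filename    # BytesIO accepts arbitrary attributes
    return audio_file
```

The point is only that the name is attached regardless of the real format of the bytes.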

simonw commented 2 hours ago

This works now:

curl -s 'https://static.simonwillison.net/static/2024/russian-pelican-in-spanish.mp3' \
  | llm whisper-api -

I need you to pretend to be a California brown pelican with a very thick Russian accent, but you talk to me exclusively in Spanish. How's your day today? And you, amigo, how was your day?

(Note that it auto-translated from Spanish to English even though I didn't ask it to.)

simonw / llm-whisper-api

Streaming to standard input fails with an error #2