transcription requests fail

eschmidbauer commented 6 months ago

Hello, I've installed and set up whisperd, but I cannot get any requests to work. Here is the log output:

May 15 16:50:12 whisperd[1158713]: NOTICE [src/whisperd-srvc-http.c:376]: added connector: 0.0.0.0:8088 (http)
May 15 16:50:12 whisperd[1158713]: NOTICE [src/whisperd-srvc-http.c:390]: added connector: 0.0.0.0:8443 (https)
May 15 16:50:12 whisperd[1158713]: NOTICE [src/whisperd.c:218]: whisperd-1.0 (a01) [pid:1158713 / uid:0 / gid:0] - started
May 15 16:50:12 whisperd[1158713]: NOTICE [src/whisperd.c:219]: hw-features: AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | COREML = 0 | OPENVINO = 0 |

Here is the config:

<?xml version="1.0"?>
<configuration version="1">

    <http-service enabled="true" >
        <settings>
            <param name="address" value="0.0.0.0"/>
            <param name="port" value="8088"/>
            <param name="ssl-port" value="8443"/>
            <param name="cert-file" value="server.pem"/>
            <param name="access-secret" value="secret123"/>
            <param name="max-threads" value="16"/>
            <param name="max-websoc-connections" value="16"/>
            <param name="max-content-length" value="10048576"/>
            <param name="min-content-length" value="0"/>
            <param name="idle-timeout" value="100000"/>
        </settings>
    </http-service>

    <cluster-service enabled="false">
        <settings>
            <param name="address" value="0.0.0.0"/>
            <param name="port" value="5532"/>
        </settings>
    </cluster-service>

    <models>
        <model name="base"  file="ggml-base.en.bin" alias="whisper-1"/>
    </models>

    <whisper-worker>
        <settings>
            <param name="max-threads" value="5"/>
            <param name="max-tokens" value="100"/>
            <param name="sim-enabled" value="false"/>
        </settings>
    </whisper-worker>
</configuration>
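
One thing worth ruling out with a config like this is the request-size limit. The sketch below reads `max-content-length` and `min-content-length` from a trimmed copy of the `http-service` block and compares them with a request size. It assumes the limits are byte counts and are checked inclusively, which the thread does not confirm; `http_params` and `fits_limits` are illustrative helper names, not part of whisperd.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the http-service block from the config above.
CONFIG_XML = """<?xml version="1.0"?>
<configuration version="1">
    <http-service enabled="true">
        <settings>
            <param name="port" value="8088"/>
            <param name="max-content-length" value="10048576"/>
            <param name="min-content-length" value="0"/>
        </settings>
    </http-service>
</configuration>
"""


def http_params(xml_text):
    """Return the http-service <param> entries as a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): p.get("value")
            for p in root.findall("./http-service/settings/param")}


def fits_limits(body_bytes, params):
    """True if body_bytes lies within the configured range.

    Assumption: whisperd treats both limits as inclusive byte counts.
    """
    lower = int(params.get("min-content-length", 0))
    upper = int(params["max-content-length"])
    return lower <= body_bytes <= upper


if __name__ == "__main__":
    params = http_params(CONFIG_XML)
    # 8193518 is the Content-Length curl reports for the request in this post.
    print(fits_limits(8193518, params))
```

With the values in this post the check passes (8193518 is below 10048576), so under these assumptions the size limit alone would not explain the connection reset.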

curl -H "Authorization: Bearer secret123" \
     -H "Content-Type: multipart/form-data" \
     -F model="whisper-1" \
     -F opts="{\"language\":\"en\", \"max-tokens\":128, \"translate\":false}" \
     -F file="@test.wav" \
     http://localhost:8088/v1/audio/transcriptions -v
*   Trying 127.0.0.1:8088...
* Connected to localhost (127.0.0.1) port 8088 (#0)
> POST /v1/audio/transcriptions HTTP/1.1
> Host: localhost:8088
> User-Agent: curl/7.81.0
> Accept: */*
> Authorization: Bearer secret123
> Content-Length: 8193518
> Content-Type: multipart/form-data; boundary=------------------------7ea3b949248bee8e
> Expect: 100-continue
>
* Done waiting for 100-continue
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
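
The reset arrives right after curl gives up waiting for `100-continue`. Whether that handshake is related to the failure is not established in this thread, but it is one variable that can be removed. The sketch below sends the same form fields with Python's standard `http.client`, which does not add an `Expect` header. The endpoint, secret, and field names are taken from the curl command above; `build_multipart` and `transcribe` are illustrative helper names.

```python
import http.client
import json
import os
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body, content_type).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + file_bytes + b'\r\n'
    )
    parts.append(f'--{boundary}--\r\n'.encode())
    return b''.join(parts), f'multipart/form-data; boundary={boundary}'


def transcribe(path, host="localhost", port=8088, secret="secret123"):
    """POST the audio file and return (status, raw response body)."""
    opts = json.dumps({"language": "en", "max-tokens": 128, "translate": False})
    with open(path, "rb") as fh:
        body, ctype = build_multipart(
            {"model": "whisper-1", "opts": opts},
            "file", os.path.basename(path), fh.read())
    conn = http.client.HTTPConnection(host, port, timeout=60)
    try:
        # http.client sets Content-Length itself and sends no Expect header.
        conn.request("POST", "/v1/audio/transcriptions", body=body,
                     headers={"Authorization": f"Bearer {secret}",
                              "Content-Type": ctype})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


if __name__ == "__main__":
    print(transcribe("test.wav"))
```

If this request is also reset, the `100-continue` handshake is probably not the cause and the problem lies further along in the handler.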

akscf commented 6 months ago

Hi, did it crash? Try adding more log_debug() calls to http_request_handler() in whisperd-srvc-http.c to figure out where it stops.

eschmidbauer commented 6 months ago

I tried adding more log lines, but it appears to crash without printing anything. Only the handler crashes, though; the app stays up and accepts further requests.

eschmidbauer commented 6 months ago

I'm wondering if it's because the whisper_cpp code has been updated. Do you have a working model you can share, so I can test it against a new GGML model?

eschmidbauer commented 6 months ago

Also, would you be able to share how to compile this with CUDA support?

akscf commented 6 months ago

I updated the repo, and I think I figured out why you didn't see the debug messages. It didn't crash: on *BSD systems LOG_DEBUG is disabled by default, so the log was empty. I replaced it with LOG_NOTICE, and that works well. Try the new version. The old models are here: http://akscf.org/files/models.tar.gz

akscf commented 6 months ago

> also - would you be able to share how to compile this with CUDA support?

To be honest, I haven't thought about CUDA at all. This was written because I needed a solution on hand for running Whisper separately from FreeSWITCH, with an API similar to OpenAI's.

Because of the rush, not all of the decisions there were good ;) I'm going to redo this, and I'm thinking about a more flexible version built on loadable modules. A CUDA solution might appear then.

eschmidbauer commented 6 months ago

Hmm, even with the models you shared and the new logging code, still nothing... I'm wondering if it's related to my build (M1).

akscf commented 6 months ago

Got it. Sadly, no ideas... Though one thing to try: replace 0.0.0.0 with 127.0.0.1 (or some exact IP).
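
After changing the `address` param, a quick way to confirm that the connectors still accept connections is a plain TCP probe. This is a minimal sketch using only the standard library; the ports are the ones from the config in this issue, and `can_connect` is an illustrative helper. It only tests the listener. In the curl log above the TCP connect already succeeded, so a passing probe says nothing about the request handler itself.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for host in ("127.0.0.1", "localhost"):
        for port in (8088, 8443):
            print(host, port, can_connect(host, port))
```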

akscf / whisperd

transcription requests fail #3