modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Question about WebSocket: I want to implement my own client in PHP that reads a WAV file and sends it to websocketmain. What data format needs to be transmitted? #625

Closed mdys closed 1 year ago

mdys commented 1 year ago
OS: CentOS 7.9
python==3.7.16
torch==1.11.0+cu102
modelscope-1.6.1
funasr-0.6.0
onnxruntime 1.14.0
g++ (GCC) 11.2.1
cmake version 3.26.3
websocket-cpp
Model: speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch

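Thanks in advance. I'm not strong technically and I don't know C++. My existing platform is built on php-swoole, and I want to run highly concurrent WebSocket ASR on it. I tried wss_client_asr.py and websocketclient (which tends to core dump), but neither gave satisfactory results. I'd like to implement a wss_client in PHP, but I don't know what data websocketmain needs to receive in order to recognize it correctly. Could you give an example of the expected payload? I wrote the PHP below by imitating the Python client, but I can't get it to produce the same result as the Python version, so websocketmain can't read what I send. I'm lost. Thanks for taking a look.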

use function Swoole\Coroutine\run;

run(function () {
    $client = new Swoole\Coroutine\Http\Client('127.0.0.1', 8889);
    $client->upgrade('/');
    echo "Connected to server\n";
    // Set binaryType to blob
    $client->set(['websocket_mask' => true]);
    // Read the WAV file data
    $fileData = file_get_contents('22.wav');
    // Size of the WAV header in bytes
    $headerSize = 44;
    // Sample rate
    $sampleRate = unpack('V', substr($fileData, 24, 4))[1];
    // Bytes per sample
    $bytesPerSample = unpack('v', substr($fileData, 34, 2))[1] / 8;
    // Samples per channel
    $samplesPerChannel = unpack('V', substr($fileData, 40, 4))[1] / $bytesPerSample;
    // Number of samples in each 480 ms chunk
    $samplesPer480ms = intval(0.48 * $sampleRate);
    // Number of bytes in each 480 ms chunk
    $bytesPer480ms = $samplesPer480ms * $bytesPerSample;
    // Number of chunks to send (round up so the final partial chunk is included)
    $totalFragments = (int) ceil((strlen($fileData) - $headerSize) / $bytesPer480ms);
    echo "$sampleRate\n";
    echo "$bytesPerSample\n";
    echo "$samplesPerChannel\n";
    echo "$samplesPer480ms\n";
    echo "$bytesPer480ms\n";
    echo "$totalFragments\n";
    // Split the data and send it
    for ($i = 0; $i < $totalFragments; $i++) {
        $offset = $headerSize + ($i * $bytesPer480ms);
        $fragment = substr($fileData, $offset, $bytesPer480ms);
        $utf8Data = mb_convert_encoding($fragment, 'UTF-8', 'binary');
        // Send binary data
        $client->push($utf8Data, true);
        //echo $fragment."\n";
    }
    // Receive data
    while (true) {
        $data = $client->recv();
        if ($data === false) {
            echo "Failed to receive data\n";
            break;
        }
        if ($data === '') {
            echo "Disconnected from server\n";
            break;
        }
        echo "Received data from server: $data\n";
    }
    // Close the connection
    $client->close();
});
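
For comparison, the 480 ms chunking can be sketched in Python, the language of the reference client wss_client_asr.py. This is a minimal sketch and not FunASR code: `split_pcm` and `make_test_wav` are hypothetical helper names. It uses the standard `wave` module to locate the audio data instead of assuming a fixed 44-byte header, because WAV files can carry extra chunks before the samples.

```python
import io
import wave
from typing import List


def split_pcm(wav_bytes: bytes, chunk_ms: int = 480) -> List[bytes]:
    """Return the raw PCM payload of a WAV file as chunk_ms-sized pieces."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # One frame holds one sample for every channel.
        frame_size = wav.getsampwidth() * wav.getnchannels()
        frames_per_chunk = wav.getframerate() * chunk_ms // 1000
        # readframes() returns sample data only; the header is already skipped.
        pcm = wav.readframes(wav.getnframes())
    step = frames_per_chunk * frame_size
    return [pcm[i:i + step] for i in range(0, len(pcm), step)]


def make_test_wav(seconds: float = 1.0, rate: int = 16000) -> bytes:
    """Build a silent 16-bit mono WAV in memory (illustrative input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


chunks = split_pcm(make_test_wav())
```

With one second of 16 kHz, 16-bit mono audio, each full chunk is 7680 samples (15360 bytes), and the last chunk holds whatever remains.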

websocketmain either returns "Invalid UTF8 encoding" or "Clients may not send unmasked frames", or it simply core dumps.
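
The "Invalid UTF8 encoding" error is consistent with audio arriving in a text frame: RFC 6455 requires the payload of a text frame to be valid UTF-8, and raw PCM generally is not. The sketch below illustrates both halves of the problem. The Latin-1 round trip is only a stand-in for a byte-to-UTF-8 conversion and is an assumption about how such a conversion behaves, not a statement about `mb_convert_encoding`.

```python
import struct

# Four 16-bit little-endian samples, as they would appear in a PCM chunk.
pcm = struct.pack("<4h", 1000, -1000, 32767, -32768)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data could legally be sent as a WebSocket text payload."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Raw PCM is rejected as text: a server validating UTF-8 would fail here.
raw_ok = is_valid_utf8(pcm)

# Assumed stand-in for a byte-to-UTF-8 conversion: every byte >= 0x80
# becomes two bytes, so the result is valid text but no longer the audio.
reencoded = pcm.decode("latin-1").encode("utf-8")
reencoded_ok = is_valid_utf8(reencoded)
```

Either way the server cannot recover the samples, which is why the chunks would need to go out unmodified in binary frames. If the second argument of Swoole's `push` is the opcode, as its documentation describes, passing `true` would likely be read as opcode 1 (text), which is worth checking.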

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --quantize true --vad-dir /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --vad-quant true --punc-dir /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch --punc-quant true --port 8889 --decoder_thread_num 4 --io_thread_num 1
I20230613 12:59:47.235261  2799 websocketmain.cpp:20] model-dir : /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
I20230613 12:59:47.235338  2799 websocketmain.cpp:20] quantize : true
I20230613 12:59:47.235368  2799 websocketmain.cpp:20] vad-dir : /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch
I20230613 12:59:47.235375  2799 websocketmain.cpp:20] vad-quant : true
I20230613 12:59:47.235380  2799 websocketmain.cpp:20] punc-dir : /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch
I20230613 12:59:47.235384  2799 websocketmain.cpp:20] punc-quant : true
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20230613 12:59:47.255167  2799 fsmn-vad.cpp:58] Successfully load model from /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/model_quant.onnx
model ready
asr model init finished. listen on port:8889
on_open, active connections: 1
[2023-06-13 12:59:52] [error] consume error: websocketpp.processor:12 (Clients may not send unmasked frames)
on_close, active connections: 0
on_open, active connections: 1
terminate called without an active exception
Aborted (core dumped)
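
The "Clients may not send unmasked frames" line in the log reflects a rule from RFC 6455: every frame a client sends must set the MASK bit and XOR its payload with a random 4-byte key. A WebSocket library normally does this for you, so the following is only a sketch of what should end up on the wire for one binary frame. `encode_client_frame` and `decode_masked_payload` are hypothetical names used for illustration.

```python
import os
import struct

OPCODE_BINARY = 0x2  # RFC 6455 opcode for binary frames (text is 0x1)


def encode_client_frame(payload: bytes, opcode: int = OPCODE_BINARY) -> bytes:
    """Encode one final, masked WebSocket frame as a client must send it."""
    header = bytes([0x80 | opcode])  # FIN=1, RSV1-3=0, opcode
    length = len(payload)
    if length < 126:
        header += bytes([0x80 | length])  # MASK bit + 7-bit length
    elif length < (1 << 16):
        header += bytes([0x80 | 126]) + struct.pack("!H", length)
    else:
        header += bytes([0x80 | 127]) + struct.pack("!Q", length)
    key = os.urandom(4)  # fresh masking key for every frame
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return header + key + masked


def decode_masked_payload(frame: bytes) -> bytes:
    """Recover the payload of a short (< 126 byte) masked frame, for checking."""
    key, body = frame[2:6], frame[6:]
    return bytes(b ^ key[i % 4] for i, b in enumerate(body))


small = encode_client_frame(b"\x01\x02\xff\x00")
chunk = encode_client_frame(b"\x00" * 15360)  # one 480 ms chunk at 16 kHz/16-bit
```

When debugging a client, capturing its traffic and checking that the second byte of each frame has its top bit set is a quick way to confirm that masking is actually applied.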

zhaomingwork commented 1 year ago

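@mdys You need to follow the Python client's send flow exactly. Roughly: step one, send JSON; step two, send the binary data; finally, receive messages through an event handler. I'm not very familiar with PHP, but from a quick look it seems that step one never sends the JSON, step two sends UTF-8-encoded data instead of the raw binary, and the final message handling should be a registered event handler rather than a receive loop.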
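
The send order described above can be sketched as a sequence of frames. This is a hedged illustration, not the authoritative protocol: the JSON field names (`mode`, `wav_name`, `is_speaking`) and the closing end-of-speech message are assumptions modelled on the Python client and should be checked against the websocket_protocol document for the installed version. `build_messages` is a hypothetical helper that only prepares the frames; it does not open a connection.

```python
import json
from typing import Iterator, Tuple


def build_messages(pcm: bytes, wav_name: str = "demo",
                   chunk_bytes: int = 15360) -> Iterator[Tuple[str, bytes]]:
    """Yield (frame_type, payload) pairs in the order a client would send them."""
    # Step 1: a text frame carrying the JSON configuration.
    # Field names are assumptions; verify them against websocket_protocol.
    start = {"mode": "offline", "wav_name": wav_name, "is_speaking": True}
    yield "text", json.dumps(start).encode("utf-8")

    # Step 2: the audio as raw, unmodified bytes in binary frames.
    for i in range(0, len(pcm), chunk_bytes):
        yield "binary", pcm[i:i + chunk_bytes]

    # Assumed end-of-speech marker so the server knows the audio is complete.
    yield "text", json.dumps({"is_speaking": False}).encode("utf-8")


pcm = b"\x00" * 32000  # one second of silent 16 kHz, 16-bit mono audio
messages = list(build_messages(pcm))
```

Receiving is kept out of this sketch on purpose: results can arrive while audio is still being sent, so they are better handled in a separate callback or coroutine than in a loop that runs only after all sends finish.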

mdys commented 1 year ago

OK, I'll try again. I've been staring at this all night and my head is spinning.

LauraGPT commented 1 year ago

Please refer to websocket_protocol.