alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

The pretrained Chinese model cannot process audio files with a 48 kHz sample rate #242

Open KawhiZhao opened 4 years ago

KawhiZhao commented 4 years ago

Here is the worker.log

2020-08-03 06:30:54 - INFO: __main__: Opening websocket connection to master server
2020-08-03 06:30:54 - INFO: __main__: Opened websocket connection to server
2020-08-03 06:30:54 - DEBUG: __main__: <undefined>: Got message from server of type <class 'ws4py.messaging.TextMessage'>
2020-08-03 06:30:54 - INFO: decoder2: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Initializing request
2020-08-03 06:30:54 - INFO: __main__: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Started timeout guard
2020-08-03 06:30:54 - DEBUG: __main__: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Checking that decoder hasn't been silent for more than 10 seconds
2020-08-03 06:30:54 - INFO: __main__: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Initialized request
2020-08-03 06:30:54 - DEBUG: __main__: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Got message from server of type <class 'ws4py.messaging.BinaryMessage'>
2020-08-03 06:30:54 - DEBUG: decoder2: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Pushing buffer of size 8000 to pipeline
2020-08-03 06:30:54 - DEBUG: decoder2: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Pushing buffer done
2020-08-03 06:30:54 - ERROR: decoder2: (GLib.Error('GStreamer encountered a general stream error.', 'gst-stream-error-quark', 1), 'gstwavparse.c(1684): gst_wavparse_stream_headers (): /GstPipeline:pipeline0/GstDecodeBin:decodebin/GstWavParse:wavparse0:\nStream claims av_bsp = 192000, which is more than 96000 - invalid data')
2020-08-03 06:30:54 - INFO: decoder2: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Resetting decoder state
2020-08-03 06:30:54 - DEBUG: ws4py: Closing message received (1000) ''
2020-08-03 06:30:54 - DEBUG: __main__: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Websocket closed() called
2020-08-03 06:30:54 - DEBUG: __main__: 4e4cad05-9440-4aaf-9bc7-27ed656d5a7b: Websocket closed() finished
DEBUG 2020-08-03 06:31:01,885 Starting up worker

Do I need to resample the audio file to 16 kHz?
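For anyone hitting the same error, it may help to look at what the WAV header itself declares. The error line reports a byte rate (`av_bsp`) of 192000 against a limit of 96000. One reading, which is an assumption and not confirmed anywhere in this thread, is that the limit is the sample rate multiplied by the block alignment. The sketch below only reads the `fmt ` chunk so those fields can be compared by hand; `read_wav_fmt` is a made-up helper name, and it assumes a plain RIFF/WAVE file.

```python
import struct


def read_wav_fmt(data: bytes) -> dict:
    """Return the fields of the 'fmt ' chunk from the first bytes of a WAV file.

    Hypothetical helper for illustration; it assumes a plain RIFF/WAVE layout.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            (fmt_tag, channels, sample_rate,
             byte_rate, block_align, bits) = struct.unpack_from("<HHIIHH", data, pos + 8)
            return {
                "format_tag": fmt_tag,
                "channels": channels,
                "sample_rate": sample_rate,
                "byte_rate": byte_rate,      # average bytes per second
                "block_align": block_align,  # bytes per frame (all channels)
                "bits_per_sample": bits,
            }
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no 'fmt ' chunk found")


# Usage (the file name is a placeholder):
# with open("audio.wav", "rb") as f:
#     info = read_wav_fmt(f.read(4096))
# print(info, info["sample_rate"] * info["block_align"])
```

If `byte_rate` is larger than `sample_rate * block_align`, the header is internally inconsistent, and rewriting the file with a proper audio tool may be needed in addition to resampling.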

naxingyu commented 4 years ago

Yes, you do.
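As a rough illustration of that resampling step, here is a minimal sketch that converts a 48 kHz, 16-bit PCM WAV file to 16 kHz using only the Python standard library. It is not taken from the project. The function name `downsample_48k_to_16k` is made up, the mixdown to mono is an extra assumption about what the model expects, and averaging every three samples is only a crude low-pass filter. Dedicated resamplers such as SoX or FFmpeg use much better filters and are probably the safer choice for real data.

```python
import array
import sys
import wave


def downsample_48k_to_16k(src_path: str, dst_path: str) -> None:
    """Crudely convert a 48 kHz 16-bit PCM WAV file to 16 kHz mono.

    Hypothetical helper for illustration only.
    """
    with wave.open(src_path, "rb") as src:
        if src.getframerate() != 48000 or src.getsampwidth() != 2:
            raise ValueError("expected 48 kHz, 16-bit PCM input")
        channels = src.getnchannels()
        samples = array.array("h", src.readframes(src.getnframes()))

    # WAV data is little-endian; array('h') uses the native byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    # Assumption: the recognizer wants mono, so average the channels per frame.
    mono = [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]

    # 48000 / 16000 = 3: average each group of three samples (a crude
    # low-pass filter) and keep one output sample per group.
    out = array.array(
        "h", (sum(mono[i:i + 3]) // 3 for i in range(0, len(mono) - 2, 3))
    )
    if sys.byteorder == "big":
        out.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(16000)
        dst.writeframes(out.tobytes())


# Usage (file names are placeholders):
# downsample_48k_to_16k("input_48k.wav", "output_16k.wav")
```

The resulting file can then be sent to the server in place of the original 48 kHz recording.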