-
At the moment, the only time you can pass a prompt is while configuring the builder.
This prevents passing a prompt per infer call (see the sketch after this list), which is needed for:
- coherent continuation (e.g.: real-time speech with persistin…
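
To make the request concrete, here is a toy sketch of the API shape being asked for. `RecognizerBuilder`, `with_prompt`, and `infer` are made-up placeholder names, not the project's actual interface:

```python
class Recognizer:
    def __init__(self, prompt):
        self._builder_prompt = prompt

    def infer(self, audio_chunk, prompt=None):
        # Requested behaviour: a prompt supplied per call takes precedence
        # over whatever was configured on the builder.
        effective = prompt if prompt is not None else self._builder_prompt
        return f"<decoded {len(audio_chunk)} samples, prompt={effective!r}>"


class RecognizerBuilder:
    def __init__(self):
        self._prompt = ""

    def with_prompt(self, prompt):   # today: the only place a prompt can go
        self._prompt = prompt
        return self

    def build(self):
        return Recognizer(self._prompt)


rec = RecognizerBuilder().with_prompt("Names: Alice, Bob.").build()

# Per-infer prompting would let a real-time stream feed its own running
# transcript back in, keeping the continuation coherent across chunks.
previous_text = ""
for chunk in ([0.0] * 1600, [0.0] * 1600):
    previous_text = rec.infer(chunk, prompt=previous_text)
print(previous_text)
```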
-
I trained a streaming model on my own data, and recognition quality when decoding is poor.
There are two obvious problems: one is that words at the end get deleted, and the other is that it inserts multi…
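
One way to quantify those two symptoms before changing anything is to align each reference/hypothesis pair with edit distance and count the error types. A toolkit-independent sketch (whole-utterance counts only; isolating end-of-utterance deletions would need timestamps):

```python
def error_counts(ref: str, hyp: str) -> dict:
    """Count deletions, insertions and substitutions between a reference
    transcript and a hypothesis via a standard Levenshtein alignment."""
    r, h = ref.split(), hyp.split()
    n, m = len(r), len(h)
    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + (r[i - 1] != h[j - 1]),  # match / substitution
                dp[i - 1][j] + 1,                           # deletion
                dp[i][j - 1] + 1,                           # insertion
            )
    # Backtrack and classify each edit operation.
    i, j, dels, ins, subs = n, m, 0, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (r[i - 1] != h[j - 1]):
            subs += r[i - 1] != h[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            dels += 1   # reference word missing from the hypothesis
            i -= 1
        else:
            ins += 1    # extra word in the hypothesis
            j -= 1
    return {"deletions": dels, "insertions": ins, "substitutions": subs}


print(error_counts("turn the lights off please", "turn the the lights off"))
# {'deletions': 1, 'insertions': 1, 'substitutions': 0}
```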
-
Check this jsfiddle https://jsfiddle.net/dotpao/njjgpa5h/5/ in iOS Safari.
I've noticed that when you load the same audio file twice and then pause it before it plays, the onfinish call…
-
Sentences that have audio don’t show the speaker icon that indicates audio availability.
A search for sentences with ‘audio=yes’ returns:
• An Internal Error Has Occurred
-
Hi, thanks for this collection of scripts!
I've been trying to run your [`run_flax_speech_recognition_ctc.py`](https://github.com/sanchit-gandhi/seq2seq-speech/blob/main/run_flax_speech_recognitio…
-
@csukuangfj
The code for this client is in
https://github.com/k2-fsa/sherpa/tree/master/triton/client/client.py
In my testing, if you do not use multi-process mode, that is, if you send data to the …
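
For context, a stripped-down sketch of what "multi-process mode" means on the client side: each worker process sends its own shard of wavs concurrently. `decode_wav` here is only a placeholder for what `client.py` actually does per request (the Triton model and tensor names depend on the deployed model, so they are not shown):

```python
import multiprocessing as mp


def decode_wav(wav_path):
    # Placeholder: the real client builds a Triton inference request
    # for wav_path and returns the decoded text.
    return f"<transcript of {wav_path}>"


def worker(wav_paths):
    # One process handles one shard of the wav list.
    return [decode_wav(p) for p in wav_paths]


if __name__ == "__main__":
    wavs = [f"test_{i}.wav" for i in range(32)]
    num_procs = 4
    shards = [wavs[i::num_procs] for i in range(num_procs)]
    with mp.Pool(num_procs) as pool:
        results = pool.map(worker, shards)
    print(sum(len(r) for r in results), "files decoded")
```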
-
Hello. Some time has passed and the program has become quite confusing. Before, you just selected a resolution and it mostly worked. Not long ago I downloaded some videos and they had no audio by default.
Fo…
-
Task:
Create an offline alternative to Google's [Read Along app](https://readalong.google.com/) in Hindi. It should be able to show a set of words and determine if you have spoken the …
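
The task doesn't name an engine, but as one possible offline route, a recognizer such as Vosk (which ships Hindi models) can transcribe the learner's recording and the output can be matched against the displayed words. The model directory and file names below are assumptions, and a real read-along would need per-word timestamps and fuzzier matching than this:

```python
import json
import wave

from vosk import KaldiRecognizer, Model

# Assumed model directory; substitute whichever Hindi model you download.
model = Model("vosk-model-small-hi-0.22")

wf = wave.open("attempt.wav", "rb")          # mono 16-bit PCM recording of the learner
rec = KaldiRecognizer(model, wf.getframerate())

heard = []
while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        heard += json.loads(rec.Result()).get("text", "").split()
heard += json.loads(rec.FinalResult()).get("text", "").split()

# Compare what was recognised against the words shown on screen.
target_words = "नमस्ते यह एक किताब है".split()
for word in target_words:
    print(word, "spoken" if word in heard else "missed")
```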
-
# Task Name
[Task name]: Target Speaker ASR
[Description]: Given a multispeaker speech utterance, decode the text corresponding to the specified speaker.
## Task Objective
Multispeaker ASR i…
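
A bare-bones PyTorch sketch of the usual recipe for this task: embed an enrollment utterance of the target speaker, condition the mixture encoder on that embedding, and decode the target speaker's text (here with a CTC head). All module choices and sizes are illustrative, not a reference system:

```python
import torch
import torch.nn as nn


class TargetSpeakerASR(nn.Module):
    def __init__(self, feat_dim=80, spk_dim=192, hidden=256, vocab=500):
        super().__init__()
        # Stand-in for a speaker encoder (x-vector / ECAPA in practice).
        self.spk_encoder = nn.Sequential(nn.Linear(feat_dim, spk_dim), nn.ReLU())
        self.encoder = nn.LSTM(feat_dim + spk_dim, hidden, batch_first=True)
        self.ctc_head = nn.Linear(hidden, vocab)

    def forward(self, mixture, enrollment):
        # mixture:    (B, T, feat_dim) features of the multi-speaker audio
        # enrollment: (B, T_enr, feat_dim) features of the target speaker alone
        spk = self.spk_encoder(enrollment).mean(dim=1)           # (B, spk_dim)
        spk = spk.unsqueeze(1).expand(-1, mixture.size(1), -1)   # broadcast over time
        enc, _ = self.encoder(torch.cat([mixture, spk], dim=-1))
        return self.ctc_head(enc)   # per-frame logits for the target speaker's text


model = TargetSpeakerASR()
logits = model(torch.randn(2, 300, 80), torch.randn(2, 200, 80))
print(logits.shape)  # torch.Size([2, 300, 500])
```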
-
Recently made public:
https://openai.com/blog/whisper/
https://github.com/openai/whisper
Interesting, they have some multilingual models that can be used for multiple languages without fine tunin…
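
For anyone skimming, basic usage from the repo's README is very short ("base" and "audio.mp3" below are just example choices); `transcribe()` detects the language automatically with the multilingual checkpoints:

```python
import whisper

model = whisper.load_model("base")        # multilingual checkpoint
result = model.transcribe("audio.mp3")    # language is auto-detected
print(result["text"])

# Lower-level language identification, also from the README:
audio = whisper.pad_or_trim(whisper.load_audio("audio.mp3"))
mel = whisper.log_mel_spectrogram(audio).to(model.device)
_, probs = model.detect_language(mel)
print(max(probs, key=probs.get))
```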