-
Bad documentation; the error messages are not very long.
Detecting toxicity in outputs generated by Large Language Models (LLMs) is crucial for ensuring that these models produce safe, respectful, and appropriate con…
-
How we record & transcribe now (a rough sketch follows the list):
1. record a 30 s chunk of audio on each device
2. use a local voice activity detection (VAD) model to extract speech frames; if there aren't enough, skip transcription
3. transcrib…
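A rough sketch of steps 2-3, assuming webrtcvad for the local VAD model and faster-whisper for the transcription step; the model size, aggressiveness level, and the "enough speech" threshold are assumptions, not necessarily what the project actually uses:

```python
import numpy as np
import webrtcvad
from faster_whisper import WhisperModel

SAMPLE_RATE = 16000            # webrtcvad needs 8/16/32/48 kHz, 16-bit mono PCM
FRAME_MS = 30                  # webrtcvad accepts 10/20/30 ms frames
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2
MIN_SPEECH_FRAMES = 10         # assumed "enough speech" threshold

vad = webrtcvad.Vad(2)         # aggressiveness 0-3 (assumed setting)
model = WhisperModel("base")   # assumed transcription backend and model size

def process_chunk(pcm: bytes):
    """Handle one ~30 s chunk of raw 16-bit mono PCM (steps 2 and 3)."""
    frames = [pcm[i:i + FRAME_BYTES]
              for i in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES)]
    speech = [f for f in frames if vad.is_speech(f, SAMPLE_RATE)]
    if len(speech) < MIN_SPEECH_FRAMES:
        return None            # not enough speech: skip transcription
    # Convert the kept PCM frames to float32 audio and transcribe them.
    audio = np.frombuffer(b"".join(speech), dtype=np.int16).astype(np.float32) / 32768.0
    segments, _ = model.transcribe(audio)
    return " ".join(seg.text for seg in segments)
```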
-
This interface works fine for the onset of valid speech, but it has a delay of about 100 ms for the offset of speech.
For the same frame, the function `vad.is_speech` returns different decisions when …
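A minimal sketch of how one might probe this, assuming webrtcvad's Python API; the frame contents and aggressiveness level are placeholders, since the exact condition in the report is truncated:

```python
import webrtcvad

SAMPLE_RATE = 16000
FRAME_BYTES = SAMPLE_RATE * 30 // 1000 * 2   # one 30 ms frame of 16-bit mono PCM

vad = webrtcvad.Vad(3)

# Placeholder frame; in a real check this would be the recorded frame for
# which is_speech reportedly flips its decision.
frame = b"\x00\x00" * (FRAME_BYTES // 2)

# Feed the identical frame at different points in the stream and compare.
first = vad.is_speech(frame, SAMPLE_RATE)
for _ in range(10):                          # intervening placeholder frames
    vad.is_speech(frame, SAMPLE_RATE)
second = vad.is_speech(frame, SAMPLE_RATE)
print(first, second)                         # a mismatch would show context-dependent output
```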
-
* maybe in two years :D
* should be local
-
https://github.com/matatonic/openedai-speech/blob/09b1c051e10cfbbbb48bdbff09ffa71536c2c8d4/docker-compose.yml#L13
Hi, just adding `runtime: nvidia` under this line fixed my NVIDIA GPU detection iss…
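For reference, a sketch of what the resulting compose entry might look like; the service and image names below are placeholders, not the actual contents of the linked docker-compose.yml:

```yaml
services:
  openedai-speech:                # placeholder service name
    image: openedai-speech        # placeholder image name
    runtime: nvidia               # the line the commenter added to fix GPU detection
```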
-
With a 4080 GPU, the speed is maybe less than 1% of what it used to be, comparable to running on the CPU. But GPU utilization shows it is fully loaded, and I can't find the cause. Could it be that the bundled PyTorch and TensorFlow are not being called correctly?
fasterwhispergui.log is as follows:
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and …
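That warning means neither framework could be imported in the running environment. One generic way to check (a sketch, not specific to this GUI; run it with the same interpreter the packaged app uses):

```python
# Quick sanity check: can this Python environment see PyTorch/TensorFlow and the GPU?
try:
    import torch
    print("torch", torch.__version__, "CUDA available:", torch.cuda.is_available())
except ImportError as exc:
    print("PyTorch not importable:", exc)

try:
    import tensorflow as tf
    print("tensorflow", tf.__version__)
except ImportError as exc:
    print("TensorFlow not importable:", exc)
```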
-
The last updates to speech recognizers in ART were made in Nov 2022. Since then, changes to ART have made them incompatible with many of their downstream tasks.
The proposed solution: update or rew…
-
When I use the pipeline, I get an error: KeyError: "Unknown task depth-estimation, available tasks are ['audio-classification', 'automatic-speech-recognition', 'conversational', 'feature-extrac…
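A minimal sketch of the call that raises this KeyError, assuming the Hugging Face transformers pipeline API; the task list in the error suggests the installed transformers version predates the addition of the depth-estimation task, and the image path below is a placeholder:

```python
from transformers import pipeline

# On a transformers release that does not know the "depth-estimation" task,
# this raises KeyError("Unknown task depth-estimation, available tasks are ...").
depth = pipeline("depth-estimation")     # model selection left to library defaults
result = depth("path/to/image.jpg")      # placeholder image path
print(result)
```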
-
Integrate [speech_to_text.py](https://github.com/kinshukgoel4/Metadata-Engine/blob/main/speech_to_text.py) into the [main.py](https://github.com/kinshukgoel4/Metadata-Engine/blob/main/main.py) file.