-
If I try to OCR images that are kinda like this.
![0_00_16_916__0_00_20_486_3000000000640010606400120](https://github.com/cloudy-sfu/GUI-for-tesseract-OCR/assets/129892077/c1e21939-50ef-4109-b511-667…
-
**Describe the bug**
A call to `SpeechSynthesizer.StopSpeakingAsync()` does not stop synthesis for a very long time, up to 30 seconds. The log file is here: [speech.log](https://github.com/Azure-Sa…
-
1. I converted a 2-hour-long Japanese movie to an MP3 file
2. I launched the app with model 'ggml-medium.bin' on GPU
3. I set the output and selected 'transcribe'
4. It works well but in the middle of processing all dial…
-
@ankanbhunia
While you mention:
"You can train the model in any custom dataset other than IAM and CVL. The process involves creating a dataset_name.pickle file and placing it inside files folder.…
-
Steps to reproduce
------------------
Windows.
Downloaded the latest release, already have ffmpeg installed.
Transcription Language: Swedish
Audio source: file (file.mkv)
Transcription metho…
-
You can easily test out sentences like the following:
`The name of that celebrity is 王菲`
everything will be classified as English (you can try any Chinese name or any English prefix sentence, it w…
-
Great job on this toolkit.
I'm attempting to merge two models with differing `vocab_size`: `augmxnt/shisa-7b-v1` (base) and `teknium/OpenHermes-2.5-Mistral-7B`. The `augmxnt/shisa-7b-v1` model ha…
-
Hey there Konstantin
Currently I use a branch of Whisper that uses a VAD, which produces great results with Japanese.
I'm really impressed with your program here and the ability to use …
-
Hi! Thank you for the awesome model!
We are very interested in your project and are trying to use SEW for Japanese.
When we train the model, should we use these scripts? Thanks!
https://…
-
Dear All,
I would like to recognize Taiwanese Hakka speech using a fine-tuned Whisper model. However, Hakka is not supported by WhisperTokenizer. Any ideas?
Here is my code and log:
```
ngpu=10 # nu…