julius-speech / julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine
BSD 3-Clause "New" or "Revised" License

palles77 need your help #149

Open Valery813 opened 4 years ago

Valery813 commented 4 years ago

Dear palles77, you have trained language models for Polish and English. Could you share documentation, or the sequence of steps, for doing this myself, written so that it is clear to a novice? What are the *.deduped files, and what should I do with them? I want to end up with a ready-made model for Julius or Kaldi.

Valery813 commented 4 years ago

Maybe there is a training video?

palles77 commented 4 years ago

Hi Valery. I appreciate your enthusiasm for learning how to create language models. It is a fairly involved process. I don't have any video or documentation, for that matter, as everything is in my head. You should probably start by buying some books and learning about language models. There is quite a lot of information about that on the Internet.

Valery813 commented 4 years ago

I understand. I will search for information. Thanks.

palles77 commented 3 years ago

I will soon publish these in a separate repo. Stay tuned.

Valery813 commented 3 years ago

This is excellent

Valery813 commented 3 years ago

Hello, have you done any transcription with Kaldi? I tried to build the SWE model, but there were a lot of errors in the scripts.

palles77 commented 3 years ago

No. Its learning curve is too steep. My training scripts for Julius allow state-of-the-art results. They are probably the best training scripts out there. I will post them soon in a separate repo.

Valery813 commented 3 years ago

Hello, which languages will the models be for?

palles77 commented 3 years ago

It will be for English, but I am planning to start doing releases for other languages.

Valery813 commented 3 years ago

Hi, I'm waiting for the release of your model. Please let me know when you publish it, OK? I built Swedish and Danish models for Kaldi, but the transcription results are disappointing (%WER = ~5-10%).

palles77 commented 3 years ago

I am slowly starting to prepare the material. The initial repo is https://github.com/palles77/htk-cuda. More to follow soon.

Valery813 commented 3 years ago

Hi, how are you doing with the English model? I tested the aspire model and the alphacep Vosk model. The results are not very good: WER is about 50%.

palles77 commented 3 years ago

I am making progress on this work. I will soon be releasing an upgraded version of Julius. Then I will focus on releasing training procedures for the English language. You can always contact me by private email: silesiaresearch at gmail dot com

Valery813 commented 3 years ago

How are you doing? Do you have anything to test? :)