csukuangfj opened this issue 1 year ago
@csukuangfj thanks for your work.
To evaluate models, it would be great to implement the Android TTS engine API.
I would do it myself after I finish all my work-related things, but I am mentioning it in case someone can do it before then.
@csukuangfj Sorry for my noob question. When I convert a fine-tuned Coqui TTS vits-ljs model to ONNX using the code here, I only get the ONNX export. The Android code expects tokens.txt and a lexicon. How can I get those too? And will this work with Coqui TTS models?
@anita-smith1
Could you point me to the code that converts a word to phonemes for VITS models from Coqui?
If you can provide that, I can provide scripts to generate lexicon.txt and tokens.txt and also model.onnx from coqui.
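For context on what those two files contain: both are plain text. The sketch below uses made-up symbols and words (it is not the actual generation script) to show the expected layouts, assuming the sherpa-onnx convention of one "symbol id" pair per line in tokens.txt and one "word phoneme phoneme ..." entry per line in lexicon.txt:

```python
# Sketch with placeholder data: write tokens.txt ("symbol id" per line)
# and lexicon.txt ("word ph1 ph2 ..." per line) in the layout sherpa-onnx
# expects. Real scripts would derive these from the model's tokenizer.
symbols = ["_", "a", "b", "ʌ"]  # placeholder symbol table
with open("tokens.txt", "w", encoding="utf-8") as f:
    for idx, sym in enumerate(symbols):
        f.write(f"{sym} {idx}\n")

lexicon = {"aba": ["a", "b", "a"]}  # placeholder pronunciations
with open("lexicon.txt", "w", encoding="utf-8") as f:
    for word, phones in lexicon.items():
        f.write(f"{word} {' '.join(phones)}\n")
```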
@anita-smith1
I just managed to convert vits models from coqui to sherpa-onnx.
Will post a colab notebook to show you how to do that.
You can use the converted model in sherpa-onnx, e.g., build an Android App with sherpa-onnx.
@anita-smith1
@nanaghartey
I just created a colab notebook to show how to export the VITS models from https://github.com/coqui-ai/TTS to onnx and how to generate tokens.txt and lexicon.txt so that you can use the exported model with sherpa-onnx.
Please see https://colab.research.google.com/drive/1cI9VzlimS51uAw4uCR-OBeSXRPBc4KoK?usp=sharing
For those of you who are interested in converting piper models to sherpa-onnx, please have a look at the following colab notebook:
https://colab.research.google.com/drive/1PScLJV3sbUUAOiptLO7Ixlzh9XnWWoYZ?usp=sharing
@csukuangfj Thanks for sharing how to export. I ran your colab notebook without using my model. Everything is generated, but when I use it in the sherpa-onnx TTS Android demo app, it crashes. The app loads the model, lexicon, and tokens all right, but when you tap "Generate" the app crashes with:
2023-11-10 17:39:10.825 18430-18430 sherpa-onnx com.k2fsa.sherpa.onnx W string is: how are you doing
2023-11-10 17:39:10.825 18430-18430 sherpa-onnx com.k2fsa.sherpa.onnx W Raw text: how are you doing
2023-11-10 17:39:10.826 18430-18430 libc com.k2fsa.sherpa.onnx A Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x38333501 in tid 18430 (fsa.sherpa.onnx), pid 18430 (fsa.sherpa.onnx)
2023-11-10 17:39:11.148 18526-18526 DEBUG pid-18526 A Cmdline: com.k2fsa.sherpa.onnx
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A pid: 18430, tid: 18430, name: fsa.sherpa.onnx >>> com.k2fsa.sherpa.onnx <<<
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #00 pc 00000000003d65f8 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libonnxruntime.so (BuildId: 5bc662f139575789b099c8c4823a8ed1565a0d4c)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #01 pc 00000000003a9404 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libonnxruntime.so (BuildId: 5bc662f139575789b099c8c4823a8ed1565a0d4c)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #02 pc 000000000009f914 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (Ort::detail::SessionImpl<OrtSession>::Run(Ort::RunOptions const&, char const* const*, Ort::Value const*, unsigned long, char const* const*, unsigned long)+204) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #03 pc 0000000000124ba4 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (sherpa_onnx::OfflineTtsVitsModel::Impl::RunVits(Ort::Value, long, float)+1128) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #04 pc 0000000000123f9c /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (sherpa_onnx::OfflineTtsVitsModel::Impl::Run(Ort::Value, long, float)+96) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #05 pc 0000000000123ec0 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (sherpa_onnx::OfflineTtsVitsModel::Run(Ort::Value, long, float)+48) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #06 pc 00000000001220a4 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (sherpa_onnx::OfflineTtsVitsImpl::Generate(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, long, float) const+1128) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #07 pc 000000000000a1a8 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-jni.so (Java_com_k2fsa_sherpa_onnx_OfflineTts_generateImpl+316) (BuildId: 7c35f6abaaa0600b2d69ed7bf0aa62b69c017daa)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #14 pc 0000000000002278 [anon:dalvik-classes3.dex extracted in memory from /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/base.apk!classes3.dex] (com.k2fsa.sherpa.onnx.OfflineTts.generate+0)
2023-11-10 17:39:11.150 18526-18526 DEBUG pid-18526 A #19 pc 00000000000012a0 [anon:dalvik-classes3.dex extracted in memory from /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/base.apk!classes3.dex] (com.k2fsa.sherpa.onnx.MainActivity.onClickGenerate+0)
2023-11-10 17:39:11.150 18526-18526 DEBUG pid-18526 A #24 pc 0000000000001544 [anon:dalvik-classes3.dex extracted in memory from /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/base.apk!classes3.dex] (com.k2fsa.sherpa.onnx.MainActivity.onCreate$lambda$0+0)
2023-11-10 17:39:11.150 18526-18526 DEBUG pid-18526 A #29 pc 0000000000001200 [anon:dalvik-classes3.dex extracted in memory from /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/base.apk!classes3.dex] (com.k2fsa.sherpa.onnx.MainActivity.$r8$lambda$OIkLpaHjEAmudVQGZZp-NNJ9rrA+0)
2023-11-10 17:39:11.150 18526-18526 DEBUG pid-18526 A #34 pc 00000000000011ac [anon:dalvik-classes3.dex extracted in memory from /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/base.apk!classes3.dex] (com.k2fsa.sherpa.onnx.MainActivity$$ExternalSyntheticLambda0.onClick+0)
2023-11-10 17:39:11.150 18526-18526 DEBUG pid-18526 A #49 pc 00000000003074bc [anon:dalvik-classes.dex extracted in memory from /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/base.apk] (com.google.android.material.button.MaterialButton.performClick+0)
---------------------------- PROCESS ENDED (18430) for package com.k2fsa.sherpa.onnx ----------------------------
2023-11-10 17:39:11.408 1190-1226 WindowManager pid-1190 E win=Window{a478533 u0 com.k2fsa.sherpa.onnx/com.k2fsa.sherpa.onnx.MainActivity EXITING} destroySurfaces: appStopped=false cleanupOnResume=false win.mWindowRemovalAllowed=true win.mRemoveOnExit=true win.mViewVisibility=0 caller=com.android.server.wm.ActivityRecord.destroySurfaces:6536 com.android.server.wm.ActivityRecord.destroySurfaces:6517 com.android.server.wm.WindowState.onExitAnimationDone:5966 com.android.server.wm.ActivityRecord$$ExternalSyntheticLambda10.accept:2 java.util.ArrayList.forEach:1528 com.android.server.wm.ActivityRecord.onAnimationFinished:8605 com.android.server.wm.ActivityRecord.postApplyAnimation:6250
This is the model.onnx, lexicon.txt and token.txt generated from your notebook - https://drive.google.com/file/d/1ndZ5MSyS8482Eht6IP1jqxsMwIvB9Ln6/view?usp=sharing
When running your notebook, this was the only error I encountered, but I'm not sure whether it matters:
%%shell
pip install -q TTS
Preparing metadata (setup.py) ... done
......
Building wheel for encodec (setup.py) ... done
Building wheel for umap-learn (setup.py) ... done
Building wheel for bnnumerizer (setup.py) ... done
Building wheel for bnunicodenormalizer (setup.py) ... done
Building wheel for docopt (setup.py) ... done
Building wheel for gruut-ipa (setup.py) ... done
Building wheel for gruut_lang_de (setup.py) ... done
Building wheel for gruut_lang_en (setup.py) ... done
Building wheel for gruut_lang_es (setup.py) ... done
Building wheel for gruut_lang_fr (setup.py) ... done
Building wheel for pynndescent (setup.py) ... done
Building wheel for gruut (setup.py) ... done
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
plotnine 0.12.4 requires numpy>=1.23.0, but you have numpy 1.22.0 which is incompatible.
tensorflow 2.14.0 requires numpy>=1.23.5, but you have numpy 1.22.0 which is incompatible.
This Android crash also occurs when I export and use my fine-tuned Coqui TTS model. The app loads the model, lexicon, and tokens all right, but when you tap "Generate" it crashes in the same way.
@anita-smith1
Could you show more logs, something like below:
13:48.972 19945-19945 sherpa-onnx com.k2fsa.sherpa.onnx I Start to initialize TTS
2023-11-11 11:13:49.063 19945-19945 sherpa-onnx com.k2fsa.sherpa.onnx W config:
OfflineTtsConfig(model=OfflineTtsModelConfig(vits=OfflineTtsVitsModelConfig(model="vits-coqui-en-vctk/model.onnx", lexicon="vits-coqui-en-vctk/lexicon.txt", tokens="vits-coqui-en-vctk/tokens.txt", noise_scale=0.667, noise_scale_w=0.8, length_scale=1), num_threads=2, debug=True, provider="cpu"), rule_fsts="")
2023-11-11 11:14:07.089 19945-19945 sherpa-onnx com.k2fsa.sherpa.onnx W ---vits model---
punctuation=; : , . ! ? ¡ ¿ — … " « » “ ”
add_blank=1
sample_rate=22050
language=English
n_speakers=109
comment=coqui
model_type=vits
I need to see the model metadata output.
@csukuangfj Superr!
For converting piper models to sherpa-onnx using the provided colab notebook, I can confirm that it works on SherpaOnnxTTS,
but for converting Coqui TTS VITS models to sherpa-onnx, using the same Android code, the app crashes on all test devices.
@nanaghartey
Could you post some logcat output for the crash?
@anita-smith1
From your error log,
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #01 pc 00000000003a9404 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libonnxruntime.so (BuildId: 5bc662f139575789b099c8c4823a8ed1565a0d4c)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #02 pc 000000000009f914 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (Ort::detail::SessionImpl<OrtSession>::Run(Ort::RunOptions const&, char const* const*, Ort::Value const*, unsigned long, char const* const*, unsigned long)+204) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #03 pc 0000000000124ba4 /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (sherpa_onnx::OfflineTtsVitsModel::Impl::RunVits(Ort::Value, long, float)+1128) (BuildId: c9b089160ce1cb938a4794e9c94d3a1636656712)
2023-11-10 17:39:11.149 18526-18526 DEBUG pid-18526 A #04 pc 0000000000123f9c /data/app/~~VjPnUWg1_JmIAZyAeNh2fA==/com.k2fsa.sherpa.onnx-kZTE0P4iaYzmG76-wTSUZQ==/lib/arm64/libsherpa-onnx-core.so (sherpa_onnx::OfflineTtsVitsModel::Impl::Run(Ort::Value, long, float)+96) (BuildId:
it crashes in the function sherpa_onnx::OfflineTtsVitsModel::Impl::RunVits, which does not look right. I think there is something wrong in the comment field of your model metadata. It should be coqui or piper for models from coqui and piper, respectively.
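The idea can be pictured with a small Python sketch. This is only an illustration (the real logic is C++ inside sherpa-onnx's OfflineTtsVitsModel, and the exact branching shown here is an assumption), but it shows why a bad comment value can send the tensors down the wrong path:

```python
# Illustrative sketch only, not the actual sherpa-onnx C++ code:
# the VITS runner chooses an input layout based on the "comment"
# metadata field baked into the exported onnx model.
def select_path(meta: dict) -> str:
    comment = meta.get("comment", "")
    if comment == "coqui":
        return "coqui-style inputs"
    if comment == "piper":
        return "piper-style inputs"
    raise ValueError(f"unexpected comment field: {comment!r}")

print(select_path({"comment": "coqui"}))  # coqui-style inputs
```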
@csukuangfj
The app loads the model, lexicon and tokens all right but when you tap "Generate" the app crashes with:
@anita-smith1
Could you show more logs, something like below:
13:48.972 19945-19945 sherpa-onnx com.k2fsa.sherpa.onnx I Start to initialize TTS
2023-11-11 11:13:49.063 19945-19945 sherpa-onnx com.k2fsa.sherpa.onnx W config:
OfflineTtsConfig(model=OfflineTtsModelConfig(vits=OfflineTtsVitsModelConfig(model="vits-coqui-en-vctk/model.onnx", lexicon="vits-coqui-en-vctk/lexicon.txt", tokens="vits-coqui-en-vctk/tokens.txt", noise_scale=0.667, noise_scale_w=0.8, length_scale=1), num_threads=2, debug=True, provider="cpu"), rule_fsts="")
2023-11-11 11:14:07.089 19945-19945 sherpa-onnx com.k2fsa.sherpa.onnx W ---vits model---
punctuation=; : , . ! ? ¡ ¿ — … " « » “ ”
add_blank=1
sample_rate=22050
language=English
n_speakers=109
comment=coqui
model_type=vits
I need to see the model meta data output.
This is my log. The comment field says coqui, but when I compare with yours I can see that n_speakers is 0 for mine:
2023-11-11 04:17:49.611 14902-14902 sherpa-onnx com.k2fsa.sherpa.onnx I Start to initialize TTS
2023-11-11 04:17:49.701 14902-14902 sherpa-onnx com.k2fsa.sherpa.onnx W config:
OfflineTtsConfig(model=OfflineTtsModelConfig(vits=OfflineTtsVitsModelConfig(model="original_colab_model/model.onnx", lexicon="original_colab_model/lexicon.txt", tokens="original_colab_model/tokens.txt", noise_scale=0.667, noise_scale_w=0.8, length_scale=1), num_threads=2, debug=True, provider="cpu"), rule_fsts="")
2023-11-11 04:17:51.634 14902-14902 libc com.k2fsa.sherpa.onnx W Access denied finding property "ro.mediatek.platform"
2023-11-11 04:18:00.766 14902-14902 sherpa-onnx com.k2fsa.sherpa.onnx W ---vits model---
punctuation=; : , . ! ? ¡ ¿ — … " « » “ ”
add_blank=1
sample_rate=22050
language=English
n_speakers=0
comment=coqui
model_type=vits
2023-11-11 04:18:06.575 14902-14902 sherpa-onnx com.k2fsa.sherpa.onnx I Finish initializing TTS
2023-11-11 04:18:06.657 14902-14902 MSHandlerLifeCycle com.k2fsa.sherpa.onnx I check: return. pkg=com.k2fsa.sherpa.onnx parent=null callers=com.android.internal.policy.DecorView.setVisibility:4411 android.app.ActivityThread.handleResumeActivity:5476 android.app.servertransaction.ResumeActivityItem.execute:54 android.app.servertransaction.ActivityTransactionItem.execute:45 android.app.servertransaction.TransactionExecutor.executeLifecycleState:176
2023-11-11 04:18:06.657 14902-14902 MSHandlerLifeCycle com.k2fsa.sherpa.onnx I removeMultiSplitHandler: no exist. decor=DecorView@c4a4467[]
2023-11-11 04:18:06.684 14902-14945 NativeCust...ncyManager com.k2fsa.sherpa.onnx D [NativeCFMS] BpCustomFrequencyManager::BpCustomFrequencyManager()
2023-11-11 04:18:06.716 14902-14902 InsetsController com.k2fsa.sherpa.onnx D onStateChanged: InsetsState: {mDisplayFrame=Rect(0, 0 - 1920, 1200),
@anita-smith1
Are you using the latest master of sherpa-onnx or the version >= v1.8.9?
@anita-smith1
Are you using the latest master of sherpa-onnx or the version >= v1.8.9?
I am using version 1.8.7, the same version used in the APKs you originally shared.
@anita-smith1
Are you using the latest master of sherpa-onnx or the version >= v1.8.9?
I just tried v1.8.9 and it worked :)
@anita-smith1 Are you using the latest master of sherpa-onnx or the version >= v1.8.9?
I just tried v1.8.9 and it worked :)
Glad to hear that it works for you.
By the way, we have pre-built Android APKs for the VITS English models from Coqui.
@anita-smith1 Are you using the latest master of sherpa-onnx or the version >= v1.8.9?
I just tried v1.8.9 and it worked :)
Glad to hear that it works for you.
By the way, we have pre-built Android APKs for the VITS English models from Coqui.
Thank you very much for your amazing support. It's incredible!
Now the Android demo works with my fine-tuned Coqui models too. By the way, in case I want to use Java instead of Kotlin for Android, would I need to build from source? There seems to be only Kotlin support for Android at the moment.
Also, are there plans to release the same on-device TTS for iOS too?
By the way, in case I want to use Java instead of kotlin for android, would I need to build from source?
Sorry, we only have a Kotlin demo for Android, but the JNI interface can also be used from Java. You can reuse the .so files for Java.
You can build sherpa-onnx from source for Android by following https://k2-fsa.github.io/sherpa/onnx/android/index.html if you want to use the latest master of sherpa-onnx.
Also are there plans to release same on-device tts for iOS too?
Yes, it is in the plan. We have already supported speech-to-text on iOS. Adding text-to-speech to iOS is very easy with our current code in sherpa-onnx. We will do that in the coming week.
@csukuangfj That's great to hear!
I have another noob question:
I have a fine-tuned Coqui TTS VITS model that contains non-English words. I use a custom CMU.in.IPA.txt and a custom all-english-words.txt file for the non-English words (plus a few English words). When I synthesize using the cell that contains this code:
......
def main():
    model = OnnxModel("./model.onnx")
    text = "xse wo atua de a fa"
    x = vits.tokenizer.text_to_ids(text, vits.tokenizer)
    x = torch.tensor(x, dtype=torch.int64)
    y = model(x)
    print(y.shape)
    soundfile.write("test.wav", y.numpy(), model.sample_rate)

main()
Everything works. The pronunciation is good. However when I use:
sherpa-onnx-offline-tts \
--vits-model=./model.onnx \
--vits-lexicon=./lexicon.txt \
--vits-tokens=./tokens.txt \
--output-filename=./test.wav \
"xse wo atua de a fa"
I get unexpected results: the pronunciations are wrong, even for the few English words.
This is a sample of my ipa.txt file, which contains both non-English and a few English words:
a, ʌ
atua, ejtujʌ
call, kɔˈl
de, ðʌ
din, diˈn
edin, ejdiˈn
fa, fʌ
frq, fɹɛ
line, lajˈn
mma, mɑˈ
mobile, mowˈbʌl
na, nɑˈ
naa, nɑˈɑˈ
ndrope, ɪŋdɹɑˌpi
ne, nɪ
no, nu
nreflecte, ɪŋɹʌflɛˈktɪ
nxma, nɔmbɚ
all-english-words.txt contains all the words. I followed the same format used in the original English list, and I used a phoneme-to-IPA converter.
What do I need to do to make this work? Thank you
I used a phoneme to IPA converter.
Please add your words to all-english-words.txt and let the code generate the pronunciations for you.
The pronunciations in CMU.in.IPA.txt are discarded and never used; only the first column, i.e., the words, is used from CMU.in.IPA.txt.
Please don't use an IPA converter to generate pronunciations for your new words.
The code uses get_phones, in one cell of the colab, to generate pronunciations for a given word.
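In other words, the workflow is: append the new words to all-english-words.txt and rerun the lexicon cell so every pronunciation comes from the same place. A minimal sketch of that loop, where get_phones is only a stand-in for the colab's helper (the real one derives pronunciations from the model's own tokenizer, which is what keeps sherpa-onnx consistent with the original Coqui model):

```python
# Stand-in for the colab's get_phones helper; the real implementation
# uses the model's tokenizer. Shown only to illustrate the regeneration
# loop, not the actual phonemization.
def get_phones(word: str) -> list[str]:
    return list(word)  # placeholder: one "phone" per character

words = ["xse", "wo", "atua", "de", "a", "fa"]  # new words to add
with open("lexicon.txt", "w", encoding="utf-8") as f:
    for w in words:
        f.write(f"{w} {' '.join(get_phones(w))}\n")
```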
@csukuangfj you are right! It's better now, though there is still some difference (a slight loss in pronunciation quality compared to the original Coqui model).
Is the order of the words I add important? I added them at the bottom of the all-english-words.txt file. Also, is there anything else I can do to improve the pronunciations on sherpa-onnx?
Hi @csukuangfj, why not make an Android port of piper_phonemize and use it in next-gen TTS instead of a lexicon? These voices could be used in a screen reader in the future, and there will be many words it will try to read that may not be in that lexicon.
Why not make an Android port of piper_phonemize and use it in next gen TTS instead of a lexicon?
We try to support as many models as possible, including models not from piper, which may not use piper_phonemize.
there will be many words it will try to read that may not be in that lexicon
We can add OOV words to the lexicon.txt manually. By the way, the lexicon.txt already covers lots of words.
Is the order of the words I add important?
The order does not matter as long as you don't add duplicate words.
If you add duplicate words, then the first one has a higher priority and later ones are ignored.
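That first-one-wins rule is easy to mimic when sanity-checking a lexicon file. A minimal sketch (not the actual sherpa-onnx loader, just the same resolution rule):

```python
# Sketch: resolve duplicate words in lexicon.txt first-one-wins,
# mirroring the priority rule described above.
def load_lexicon(lines):
    lexicon = {}
    for line in lines:
        word, *phones = line.split()
        if word not in lexicon:  # keep the first entry, ignore later ones
            lexicon[word] = phones
    return lexicon

entries = ["de ðʌ", "de d ɛ"]  # "de" appears twice
print(load_lexicon(entries)["de"])  # the first pronunciation wins: ['ðʌ']
```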
is anything else I can do to improve the pronunciations on sherpa-onnx?
I realized that a word that appears in a sentence with other words can have a different pronunciation from when it appears standalone. I am afraid it is hard, if not impossible, to improve the pronunciations with the current approach.
I am not sure you can cover everything.
It might be better to add a condition and use piper_phonemize for piper models.
Yes, this will mean adding espeak-ng, but it will be much better than adding words manually.
Maybe a better approach is to change the modeling unit from phones to other units that can be entirely derived from words.
@csukuangfj true, single words seem to have poorer pronunciations than the same words in phrases. However, the fact that this TTS solution works offline is amazing. By the way, can inference be done on the GPU? And importantly, has the iOS version been released?
And importantly, has the iOS version been released
I am writing the code now. Please wait a day or two.
@csukuangfj wow can't wait :) your work is amazing
@anita-smith1
The iOS demo is ready now.
You can run text-to-speech with Next-gen Kaldi on iOS using https://github.com/k2-fsa/sherpa-onnx/pull/443
I have recorded a video showing how to use it. Please see https://www.youtube.com/watch?v=MvePdkuMNJk
single words seem to have poor pronunciations compared to same words in phrases
Don't worry. I will try to fix it.
@csukuangfj incredible, that was quick! The video demo is fantastic, great job! But it seems I'd have to wait, since my project does not use SwiftUI; it only uses UIKit. I can see you have documentation to help us build the Android version from source. Can you add documentation for iOS too, so we can build for iOS from source?
Please see https://k2-fsa.github.io/sherpa/onnx/ios/build-sherpa-onnx-swift.html#build-sherpa-onnx-in-commandline-c-part for doc.
Thank you
Is it possible to add my custom model? Here is the link: /ivona_hq.tar.lzma
If you want to check it out, it can be unpacked with:
tar --lzma --extract --file ivona_hq.tar.lzma
I've been wanting to use this voice, but the original application is 32-bit and nearly 15 years old at this point; it's being held together by threads.
Is it a VITS model? Could you show the extracted files?
Is it possible to add my custom model? Here is the link: /ivona_hq.tar.lzma
If you want to check it out, it can be unpacked with:
tar --lzma --extract --file ivona_hq.tar.lzma
I've been wanting to use this voice, but the original application is 32-bit and nearly 15 years old at this point; it's being held together by threads.
@sweetbbak
I just looked at the model and found that it is piper-based, so it is definitely supported by sherpa-onnx.
I have converted the model. You can find it at https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
I have also added it to https://huggingface.co/spaces/k2-fsa/text-to-speech
For the following text:
“Today as always, men fall into two
groups: slaves and free men. Whoever does not
have two-thirds of his day for himself, is a slave, whatever he
may be: a statesman, a businessman, an official, or a scholar.”
It generates the following audio:
https://github.com/rhasspy/piper/assets/5284924/9a2d2869-5272-4eb0-8f28-c705fbee4c05
By the way, you can also find the exported model at https://huggingface.co/csukuangfj/vits-piper-en_US-sweetbbak-amy/tree/main
The above repo also contains the script for exporting.
@sweetbbak
The audio sounds like British English, but the JSON config file says the voice is en-us. Is there something wrong?
Thank you so much! That's my mistake, more than likely. I believe I used en_US-amy as a base for fine-tuning because it sounded better, so I was unsure whether it should be marked as en_US or en_GB. It's definitely a British voice.
Thank you so much! That's my mistake, more than likely. I believe I used en_US-amy as a base for fine-tuning because it sounded better, so I was unsure whether it should be marked as en_US or en_GB. It's definitely a British voice.
I am changing it to en_GB.
I appreciate it.
FYI:
No lexicon.txt is required any longer. We are also using piper-phonemize in sherpa-onnx.
You can find all models from piper in sherpa-onnx at https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
@beqabeqa473
@csukuangfj thanks for your work.
To evaluate models, it would be great to implement the Android TTS engine API.
I'm very new to working with Android, let alone with TTS APIs. Would it be possible to point me in the right direction for developing an Android TTS engine API? I have found some examples of offline TTS, though I'm trying to figure out how to convert this into an engine API. I was thinking some of these would be useful (TTS service wrapper, TTS-engine); I think I just need some guidance, as I am lost in a sea of knowledge. Let me know what you think.
TTS-engine
Please wait a moment. I have managed to create a TTS engine service that you can use to replace the system TTS engine.
I will create a PR soon.
I will create a PR soon. @csukuangfj Woah, that is legendary. I'd love to explore what you did; I'm interested in learning. I'll hold tight. Thanks! Sorry for my naivety, what does PR stand for?
I am also working on that.
I already have a working prototype of a piper TTS engine for Android, with downloading and installing of voices.
I will make the repo public soon.
I will create a PR soon. @csukuangfj Woah, that is legendary. I'd love to explore what you did; I'm interested in learning. I'll hold tight. Thanks! Sorry for my naivety, what does PR stand for?
@LuekWasHere
PR is short for pull request.
I just created one at https://github.com/k2-fsa/sherpa-onnx/pull/508
You can find a YouTube video at https://www.youtube.com/watch?v=33QYuVzDORA
By the way, you can download pre-built text-to-speech engine APKs at https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html
@csukuangfj Awesome repo, Fangjun Kuang! Do you know if anybody is working on a Dart package for Next-gen Kaldi to be used on Android and iOS?
@Web3Kev
Yes, please see https://github.com/k2-fsa/sherpa-onnx/issues/379
thanks !
Now you can try piper models on your Android phones.
The following languages are supported:
Community help is appreciated to convert more models from piper to sherpa-onnx.
Please see https://k2-fsa.github.io/sherpa/onnx/tts/apk.html
Note:
You can try the models in your browser by visiting the following huggingface space https://huggingface.co/spaces/k2-fsa/text-to-speech