New features & voices

lainedfles commented 2 months ago

Hi, thanks for this app!

I've written a Python script that parses VOICES.md and outputs the format required by lib/data.dart. It seems to work quite well, and I'd imagine it wouldn't need to run often. This PR includes the latest output of the script.

The script can be executed as follows and will parse the voices and URLs from STDIN or a file. The requests library is used to connect to HuggingFace and pull the JSON configuration. Presently the configuration is only used to determine the sample rate.

$ time python ./md-voices-to-map.py VOICES.md
<SNIP>
Map<String, String> languageCodes = {
  "ar_JO": "Arabic (\u0627\u0644\u0639\u0631\u0628\u064a\u0629)",
  "ca_ES": "Catalan (Catal\u00e0)",
  "cs_CZ": "Czech (\u010ce\u0161tina)",
  "cy_GB": "Welsh (Cymraeg)",
  "da_DK": "Danish (Dansk)",
  "de_DE": "German (Deutsch)",
  "el_GR": "Greek (\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac)",
  "en_GB": "English (GB)",
  "en_US": "English (US)",
  "es_ES": "Spanish (Espa\u00f1ol)",
  "es_MX": "Spanish (Espa\u00f1ol)",
  "fa_IR": "Farsi (\u0641\u0627\u0631\u0633\u06cc)",
  "fi_FI": "Finnish (Suomi)",
  "fr_FR": "French (Fran\u00e7ais)",
  "hu_HU": "Hungarian (Magyar)",
  "is_IS": "Icelandic (\u00edslenska)",
  "it_IT": "Italian (Italiano)",
  "ka_GE": "Georgian (\u10e5\u10d0\u10e0\u10d7\u10e3\u10da\u10d8 \u10d4\u10dc\u10d0)",
  "kk_KZ": "Kazakh (\u049b\u0430\u0437\u0430\u049b\u0448\u0430)",
  "lb_LU": "Luxembourgish (L\u00ebtzebuergesch)",
  "ne_NP": "Nepali (\u0928\u0947\u092a\u093e\u0932\u0940)",
  "nl_BE": "Dutch (Nederlands)",
  "nl_NL": "Dutch (Nederlands)",
  "no_NO": "Norwegian (Norsk)",
  "pl_PL": "Polish (Polski)",
  "pt_BR": "Portuguese (Portugu\u00eas)",
  "pt_PT": "Portuguese (Portugu\u00eas)",
  "ro_RO": "Romanian (Rom\u00e2n\u0103)",
  "ru_RU": "Russian (\u0420\u0443\u0441\u0441\u043a\u0438\u0439)",
  "sk_SK": "Slovak (Sloven\u010dina)",
  "sl_SI": "Slovenian (Sloven\u0161\u010dina)",
  "sr_RS": "Serbian (srpski)",
  "sv_SE": "Swedish (Svenska)",
  "sw_CD": "Swahili (Kiswahili)",
  "tr_TR": "Turkish (T\u00fcrk\u00e7e)",
  "uk_UA": "Ukrainian (\u0443\u043a\u0440\u0430\u0457\u0301\u043d\u0441\u044c\u043a\u0430 \u043c\u043e\u0301\u0432\u0430)",
  "vi_VN": "Vietnamese (Ti\u1ebfng Vi\u1ec7t)",
  "zh_CN": "Chinese (\u7b80\u4f53\u4e2d\u6587)"
} ;

real    0m13.277s
user    0m0.293s
sys     0m0.053s
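
For readers who want a feel for the approach described above, here is a minimal sketch of the two input paths (file or STDIN) and the sample-rate lookup. It is not the code from md-voices-to-map.py. The helper names are hypothetical, and two things are assumptions rather than facts from this thread: that config links can be recognised by ".json" in the URL, and that the sample rate sits under config["audio"]["sample_rate"].

```python
import re
import sys
from typing import List, Optional

# Matches inline markdown links of the form [text](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def read_input(path: Optional[str] = None) -> str:
    """Return markdown text from a file, or from STDIN when no path is given."""
    if path is None or path == "-":
        return sys.stdin.read()
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def config_urls(markdown: str) -> List[str]:
    """Collect link targets that look like JSON configs.

    Assumption: a config link is any URL containing '.json'.
    """
    return [url for _text, url in LINK_RE.findall(markdown) if ".json" in url]


def sample_rate(config: dict) -> Optional[int]:
    """Read the sample rate from an already-parsed config.

    Assumption: the value lives at config["audio"]["sample_rate"].
    """
    value = config.get("audio", {}).get("sample_rate")
    return int(value) if value is not None else None


def fetch_config(url: str) -> dict:
    """Download and decode one JSON config over HTTP."""
    # Third-party dependency; imported lazily so the helpers above work without it.
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def main(argv: List[str]) -> None:
    """Print each config URL with its sample rate (one HTTP request per URL)."""
    text = read_input(argv[1] if len(argv) > 1 else None)
    for url in config_urls(text):
        print(url, sample_rate(fetch_config(url)))
```

Wired up behind the usual `if __name__ == "__main__": main(sys.argv)` guard, this would accept either a path argument or piped input. Since it makes one HTTP request per config, most of the wall-clock time would go to waiting on the network, which fits the gap between the real and user times above.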

The script explicitly enables installation of each quality offering by concatenating the voice name and the quality, joined by " - ". The result looks like:

lainedfles commented 2 months ago

Added capitalization:

Elleo commented 2 months ago

Hi @lainedfles,

This is brilliant, thanks!

I'm going to be making some changes to the voice data file soon to support some new features, and I'll update your script as I go to save me from having to manually update all the voices. This is very helpful!

Cheers, Mike

Elleo / pied

New features & voices #20