RageAgainstThePixel / com.rest.elevenlabs

A non-official Eleven Labs voice synthesis client for Unity (UPM)
https://elevenlabs.io/?from=partnerbrown9849
MIT License

Arabic Encoding #47

Closed aynader closed 8 months ago

aynader commented 9 months ago

It appears that the API doesn't encode Arabic correctly. Arabic text generated through the API is 'distorted', while Arabic text generated on the website works as intended.

StephenHodgson commented 9 months ago

@aynader can you provide me with some example text please?

aynader commented 9 months ago

sure!

"أنا مش عارفني، أنا تهت مني، أنا مش أنا"

"لا دي ملامحي ولا شكلي، شكلي ولا دا أنا"

"أبص لروحي فجأة لقيتني"

"لقيتني كبرت، فجأة كبرت"

"تعبت من المفاجأة ونزلت دمعتي"

"قولي لي إيه يا مرايتي"

"قولي لي إيه حكايتي تكونشي"

"تكونشي دي نهايتي، وآخر قصتي"

aynader commented 9 months ago

I just figured it out. You just need to change this line in TextToSpeechEndpoint.cs:

var request = new TextToSpeechRequest(text, model, defaultVoiceSettings);

to:

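// Requires: using System.Text;
// Round-trip the input text through UTF-8 before building the request.
byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
string encodedText = Encoding.UTF8.GetString(utf8Bytes);
var request = new TextToSpeechRequest(encodedText, model, defaultVoiceSettings);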

I tested it and it works flawlessly.

StephenHodgson commented 9 months ago

I wouldn't necessarily consider this a bug with the library itself, but it's good to know that text may need to be re-encoded before sending.
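
For anyone who wants to check the encoding outside the library, here is a minimal sketch, assuming the distortion comes from the request body being written or read in a non-UTF-8 encoding (the thread does not confirm this). It uses only System.Text and none of the com.rest.elevenlabs API. `ToUtf8Body` is a hypothetical helper name, and the hand-built JSON is for illustration only.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of com.rest.elevenlabs): turn a JSON payload
// into explicit UTF-8 bytes (no BOM) so the encoding on the wire is unambiguous.
static byte[] ToUtf8Body(string json) =>
    new UTF8Encoding(encoderShouldEmitUTF8Identifier: false).GetBytes(json);

var text = "أنا مش أنا";

// Hand-built JSON for illustration only; a real client would use a JSON serializer.
var json = "{\"text\":\"" + text + "\"}";
byte[] body = ToUtf8Body(json);

// Decoding the bytes as UTF-8 gives back the original payload.
var roundTripped = Encoding.UTF8.GetString(body);
Console.WriteLine(roundTripped == json);

// Decoding the same bytes as ISO-8859-1 yields mojibake, which is one way
// Arabic text can end up "distorted" when the two sides disagree on encoding.
var garbled = Encoding.GetEncoding("iso-8859-1").GetString(body);
Console.WriteLine(garbled == json);
```

If the body is sent as raw bytes like this, declaring `Content-Type: application/json; charset=utf-8` on the request makes the encoding explicit to the server.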

StephenHodgson commented 9 months ago

I'll keep this open a bit longer as I do more research on what to do for next steps.

StephenHodgson commented 8 months ago

The latest version gives you more control over the encoding if needed. I tested this and it seemed to work OK, but I would like validation from a native speaker.