Closed sovcik closed 4 years ago
Hi sovcik, glad you find it useful.
For now, the command line doesn't support unicode characters. It is the next "big thing" I have to fix. In v1.3, I added unicode character support for text files.
Are you using text from a text file or typing it directly in the command line?
edit: Also, are you using a Slovak voice to read the text?
Oh, I didn't notice that. Only now do I see that comment in your source code :-) https://github.com/p-groarke/wsay/blob/master/src_cmd/main.cpp#L21 I tried both the command line and a UTF-8 encoded text file. Attaching the file I used for reference.
Jozef.
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
http://www.w3.org/TR/speech-synthesis11/synthesis.xsd"
xml:lang="sk-SK">
<s xml:lang="sk-SK">
<voice name="Filip">
Toto je po Slovensky: nán
</voice>
</s>
</speak>
Yes, I was using -v 12, which is voice Filip in Windows 10.
Btw, what environment/stack are you using to develop this project? I was considering helping you with that UTF-8 support, but this does not look like a Visual Studio project :-) I have quite extensive C++ experience, but for microcontrollers, so I was curious what you use for Windows.
Haha you found the comment :)
I think I should be able to support the unicode space from utf pretty easily without having to rewrite my argument parsing lib (which I really don't want to do). What is strange is it should work with text file input, so I'll investigate what is going on there.
Thanks for offering help, I use cmake to generate the VS solution stuff. I wouldn't waste any time on this though, I'll fix it soon enough.
I'll let you know when I have a tentative fix with a build.
Cheers
Thanks!
Found this https://github.com/huangqinjin/wmain From what I read it should just "decode" UTF8 command line and pass it to the original main.c as wstring, so no arg parsing changes needed :-) Maybe it will be of some help.
Yep, I was looking at that as well 😊 If I use that, I have to adapt all the argument parsing to template out char and wchar_t.
Another option I’ve used in the past is setting a UTF-8 mode in the command tool. Since UTF-8 is represented as multiple chars, I could also leave everything unchanged and just use a UTF-8 parsing lib or function to convert that to proper Unicode wchar_t.
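For reference, the "utf8 parsing function" idea can be sketched roughly as below. This is only an illustration, not wsay's code: the name `decode_utf8` is hypothetical, and it decodes to `char32_t` code points rather than `wchar_t`, since on Windows `wchar_t` is 16 bits and code points above U+FFFF would additionally need UTF-16 surrogate pairs.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a UTF-8 byte string into Unicode code points.
// Throws std::invalid_argument on malformed input.
std::u32string decode_utf8(const std::string& bytes) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t len = 0;

        // The lead byte tells us how many bytes the sequence has.
        if (lead < 0x80)              { cp = lead;        len = 1; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; len = 2; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; len = 3; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > bytes.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte is 10xxxxxx and contributes 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }

        // Reject overlong encodings, surrogates, and out-of-range values.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        out.push_back(cp);
        i += len;
    }
    return out;
}
```

With something like this, the bytes `C4 BE` decode to U+013E (ľ), and a four-byte sequence such as `F0 9F 98 80` decodes to U+1F600.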
TBD what I do. I’m guessing there is another issue somewhere since the text file isn’t working either.
Alright @sovcik , I have a test build for you if you wish. No pressure of course :)
This is a "quick fix" so you can at least continue using the tool.
áàéè etc.
std::wcin reasons ;)
I hope this is enough in the short term for your use case. Full UTF-8 support will come, but it will take more time, as I have to refactor a lot of things.
Let me know if you have any issues with the build and thank you for taking the time to report this!
Wow! Thanks! Works much better now. It is absolutely enough for what I need.
I tested it a bit and you are right, some accents work while others do not. Examples that do not work: ľ, ť, ň.
But as said earlier - works for me for now! Thanks a lot!
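One possible reading of this split, not confirmed anywhere in the thread: the accents that work (á, à, é, è) all have code points at or below U+00FF, so they fit in a single Latin-1 byte, while ľ, ť and ň sit above that range. A minimal sketch of that check, where `fits_in_latin1` is a hypothetical helper and not part of wsay:

```cpp
// Hypothetical helper: true when a code point fits in one ISO-8859-1 byte.
bool fits_in_latin1(char32_t cp) {
    return cp <= 0xFF;
}

// Code points from the thread, written as escapes to avoid depending on
// the source file's encoding.
const char32_t working_accents[] = {U'\u00E1', U'\u00E0', U'\u00E9', U'\u00E8'}; // á à é è
const char32_t failing_accents[] = {U'\u013E', U'\u0165', U'\u0148'};            // ľ ť ň
```

Every entry in `working_accents` passes the check and every entry in `failing_accents` fails it, which would be consistent with a narrow-character conversion somewhere in the quick fix, though that is only a guess.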
:) I'll use those characters as a unit test for the full utf8 support I'm working on. I'll keep this ticket open until that is ready. cheers
Thanks
I released 1.4 with much better utf character support. Let me know if you find any issues!
Funny side effect, text input supports emojis lol.
@p-groarke Works like a charm! Thanks!
Hi, I recently found your nice tool and tried to use it for my mini project. It works in general, but ignores some characters. E.g. the character "á", which should sound like "aa" in the word "naan", is completely ignored. I tried it via the command line and also using your GUI tool.
I assume it might be a character encoding issue, but I couldn't figure it out. When I used MS Word to read the text aloud, this character was pronounced properly.
I even tried to pass it as speech XML, but that ignored language tags completely.