GHPS / txt2pho

A TTS frontend for the German inventories of the MBROLA project (Official Repository)
GNU Affero General Public License v3.0

Bug: txt2pho crashes without output on non-Intel platforms #4

Closed StephanRichter closed 2 years ago

StephanRichter commented 2 years ago

On my system (Linux Mint 20.2 Cinnamon, Kernel 5.4.0-80-generic), I don't get any output from the current (5e3cc538b08423530089783b433942d1691baa77) txt2pho:

git clone https://github.com/GHPS/txt2pho.git
cd txt2pho
make clean
make all

throws no error, and the binary txt2pho is assembled.

Also I can start it:

$ ./txt2pho -v
txt2pho 0.96

but running your example yields nothing:

$ echo "Hallo Welt"|./txt2pho -m
$
$ echo "Äpfel"| iconv -cs -f UTF-8 -t ISO-8859-1|./txt2pho -m
$
GHPS commented 2 years ago

Thanks for reporting the issue!

You can turn on debugging and error messages with the -d option:

echo "Hallo Welt"|./txt2pho -m -d 11

This will generate two files in the /tmp directory.

The error log will most likely give you some hint as to what went wrong during the test run.

If I had to guess, I would say that txt2pho can't find its data files. Do you see any output if you specify the path directly:

echo "Hallo Welt"|./txt2pho -m -p data/

fquirin commented 2 years ago

Same problem here.

echo "Hallo Welt"|./txt2pho -m -d 11

This doesn't even create a log file :disappointed_relieved:

echo "Hallo Welt"|./txt2pho -m -p data/

No change. It just hangs as if it's waiting for more input.

Compiled on Raspberry Pi Bullseye (Debian 11) 32bit if that helps.

GHPS commented 2 years ago

No change. It just hangs as if it's waiting for more input.

Strange - if you start typing, does it convert this text into phonemes?

Or even more basic, does

txt2pho -h

print any help text?

fquirin commented 2 years ago

Strange - if you start typing, does it convert this text into phonemes?

nope, nothing seems to do anything. You can only abort with CTRL+C.

does txt2pho -h print any help text?

yes, I get the v0.96 info and all the arguments help

I tried to compile with gcc-8 as well, no difference. preproc works fine btw and I don't see any errors during build.

GHPS commented 2 years ago

I tried to compile with gcc-8 as well, no difference.

Yes - the code does not require any specific compiler version. Any version between gcc-8 and gcc-11 should work fine.

nope, nothing seems to do anything. You can only abort with CTRL+C.

OK, then it is time to take a bigger hammer.

Please generate a short test file 'HalloWelt.txt' with just 'Hallo Welt' as the only content.

And ask strace what goes wrong:

strace -e trace=%file ./txt2pho -m -i HalloWelt.txt

You should immediately see from the output where txt2pho goes off the rails.

If not, please post the results of the strace run...

fquirin commented 2 years ago

Here is the result:

execve("./txt2pho", ["./txt2pho", "-m", "-i", "HalloWelt.txt"], 0xbef83664 /* 23 vars */) = 0
access("/etc/ld.so.preload", R_OK)      = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
readlink("/proc/self/exe", "/home/pi/txt2pho/txt2pho", 4096) = 24
openat(AT_FDCWD, "/usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libstdc++.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libm.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libgcc_s.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3

:thinking:

GHPS commented 2 years ago

Here is the result:

Thanks for the log.

I assume that this is the full output - which means that txt2pho is not even trying to load the config or data files. Does trace=all give any clue about the reason for this strange behaviour?

fquirin commented 2 years ago

Here is the full trace=all output:

pi@rpi4b:~/txt2pho $ strace -e trace=all ./txt2pho -m -i HalloWelt.txt
execve("./txt2pho", ["./txt2pho", "-m", "-i", "HalloWelt.txt"], 0xbedcc664 /* 23 vars */) = 0
brk(NULL)                               = 0x1da7000
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f40000
access("/etc/ld.so.preload", R_OK)      = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=54, ...}) = 0
mmap2(NULL, 54, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0xb6f3f000
close(3)                                = 0
readlink("/proc/self/exe", "/home/pi/txt2pho/txt2pho", 4096) = 24
openat(AT_FDCWD, "/usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\254\3\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=17708, ...}) = 0
mmap2(NULL, 81964, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6efb000
mprotect(0xb6eff000, 61440, PROT_NONE)  = 0
mmap2(0xb6f0e000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0xb6f0e000
close(3)                                = 0
munmap(0xb6f3f000, 54)                  = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=47271, ...}) = 0
mmap2(NULL, 47271, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f34000
close(3)                                = 0
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libstdc++.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0X\216\7\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=1532776, ...}) = 0
mmap2(NULL, 1605012, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6d73000
mprotect(0xb6ee2000, 65536, PROT_NONE)  = 0
mmap2(0xb6ef2000, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16f000) = 0xb6ef2000
mmap2(0xb6ef9000, 7572, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6ef9000
close(3)                                = 0
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libm.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\20\222\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=386572, ...}) = 0
mmap2(NULL, 450684, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6d04000
mprotect(0xb6d62000, 61440, PROT_NONE)  = 0
mmap2(0xb6d71000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x5d000) = 0xb6d71000
close(3)                                = 0
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libgcc_s.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\270\321\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=116324, ...}) = 0
mmap2(NULL, 180532, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6cd7000
mprotect(0xb6cf3000, 61440, PROT_NONE)  = 0
mmap2(0xb6d02000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b000) = 0xb6d02000
close(3)                                = 0
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\240\255\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1321488, ...}) = 0
mmap2(NULL, 1390760, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6b83000
mprotect(0xb6cc2000, 61440, PROT_NONE)  = 0
mmap2(0xb6cd1000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13e000) = 0xb6cd1000
mmap2(0xb6cd5000, 6312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6cd5000
close(3)                                = 0
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f32000
set_tls(0xb6f32500)                     = 0
mprotect(0xb6cd1000, 8192, PROT_READ)   = 0
mprotect(0xb6d02000, 4096, PROT_READ)   = 0
mprotect(0xb6d71000, 4096, PROT_READ)   = 0
mprotect(0xb6ef2000, 20480, PROT_READ)  = 0
mprotect(0xb6f0e000, 4096, PROT_READ)   = 0
mprotect(0x80000, 4096, PROT_READ)      = 0
mprotect(0xb6f42000, 4096, PROT_READ)   = 0
munmap(0xb6f34000, 47271)               = 0
brk(NULL)                               = 0x1da7000
brk(0x1dc8000)                          = 0x1dc8000
getpid()                                = 22977
GHPS commented 2 years ago

Here is the full trace=all output:

Thanks for the log!

The problem gets more interesting the closer we look at it...

As far as I can see the program runs normally - loads the necessary shared libraries and allocates the required memory. And then it dies without any obvious reason.

The very next step in the strace log would be loading the config info from ~/.config/txt2phorc followed by the lexicon files in data/. But this doesn't happen.

I'll have to get my RaspPi out to take a closer look.

fquirin commented 2 years ago

I'll have to get my RaspPi out to take a closer look.

That'd be great. I fear I cannot help anymore at this point :-/ When I've time I'll try to build on my x86_64bit Linux Machine just to make sure it runs there.

GHPS commented 2 years ago

When I've time I'll try to build on my x86_64bit Linux Machine just to make sure it runs there.

No need to change the platform - I can reproduce the problem on my RaspPi 1 with gcc 8.3.0 (Raspbian 8.3.0). All versions of txt2pho back to 0.95 are affected.

The problem seems to be caused by signed vs. unsigned chars on Intel vs. ARM.

Forcing the compiler to use signed chars makes txt2pho usable on the RaspPi:

make all CFLAGS="-g -ansi -fsigned-char"

Please try building txt2pho with this line.

fquirin commented 2 years ago

It works :partying_face: , great!

I've been testing it with MBROLA 'de3' and it makes this voice actually usable compared to espeak-ng ^^. I'll probably add this as an additional TTS option to SEPIA Open Assistant. Switching between "male" and "female" settings seems fun too :grin: .

Too bad it only works for German. Have there ever been some versions for other languages? I guess it requires adaptation to different phonemes and probably new rules?

GHPS commented 2 years ago

It works :partying_face: , great!

Yes - that is good news. After some further testing I'll release a new version of txt2pho to make it usable on the RaspPi (and other non-Intel platforms) out of the box.

The original, first issue still needs to be tested as well. @StephanRichter please see if this approach solves your problem.

I've been testing it with MBROLA 'de3' and it makes this voice actually usable compared to espeak-ng ^^. I'll probably add this as an additional TTS option to SEPIA Open Assistant.

Yes - that would be awesome!

BTW for my daily work I prefer the de2 voice. Within the last months I pretty much ironed out all mispronunciations I came across so the speech output is in most cases flawless.

Too bad it only works for German. Have there ever been some versions for other languages? I guess it requires adaptation to different phonemes and probably new rules?

The approach of txt2pho is twofold: It has a database with more than 52k entries of German words and their correct pronunciation. Words not directly found in this database are decomposed into their component words to get all parts of a compound right.

The whole approach is based on an enormous amount of work done at the Uni Bonn in the 90s. I'm therefore quite happy that the team there, foremost the lead developer Thomas Portele, released this work as open source.

And, yes, there is zero chance to port txt2pho to other, non-German languages.

My long-term plan is to port it to Python, which would make the program much more maintainable and future-proof.

fquirin commented 2 years ago

BTW for my daily work I prefer the de2 voice. Within the last months I pretty much ironed out all mispronunciations I came across so the speech output is in most cases flawless.

Did you push these changes to the official MBROLA voices repository or is this a txt2pho rule set? In other words ... how do I get these optimizations? ^^ (if I don't have them already)

And, yes, there is zero chance to port txt2pho to other, non-German languages.

Too bad, but at least we can still profit from the Uni Bonn work for German :-). Most of the modern open-source Deep-Learning frameworks have great voices but they are very resource demanding and there is not much effort put into optimizing individual pronunciations and input text. That's why I've been using older systems quite a lot like Mary-TTS.

My long-term plan is to port it to Python, which would make the program much more maintainable and future-proof.

I'm not a big fan of converting everything to Python but it would certainly help to maintain it, I agree.

GHPS commented 2 years ago

Did you push these changes to the official MBROLA voices repository or is this a txt2pho rule set? In other words ... how do I get these optimizations? ^^ (if I don't have them already)

All the changes are already released in this repo (see the commit history for details). mbrola's task is to convert the phonemes generated by txt2pho to an audio file.

clort81 commented 2 years ago

Thank you for this issue and the fix.

To complete the resolution, please incorporate proper build flags for ARM into the makefile, or at the very least add build instructions to the readme.

GHPS commented 2 years ago

To complete the resolution, please incorporate proper build flags for ARM into the makefile, or at the very least add build instructions to the readme.

Thanks for the hint.

The docs and makefile will be updated shortly - after some further testing...

GHPS commented 2 years ago

Final remark: The project should now build without problems on any architecture without additional modifications.