bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

Punctuation fixes #112

Closed agonzalezd closed 2 years ago

agonzalezd commented 2 years ago

This branch fixes #108, recursion errors aligning long texts

agonzalezd commented 2 years ago

I am having some issues transcribing a specific file: the output is not the same if I run it more than once. The last three sentences sometimes come out corrupted, and sometimes the run ends with a segmentation fault.

This is the content of the file in question:

Guerre en Ukraine : de l’offensive ratée au carnage, un mois de guerre de l’armée russe.
Les fronts sont figés, quatre semaines après l’invasion lancée par Moscou le vingt-quatre février.
Revers tactiques et pauses volontaires des troupes s’entremêlent et l’hypothèse d’un échec militaire devient envisageable.
La machine offensive s’est bloquée.
Lancée le vingt-quatre février, la fulgurante guerre d’annihilation de l’Ukraine voulue par Vladimir Poutine connaît depuis trois semaines un ralentissement brutal.
Un enlisement réel, masqué par l’orage de feu projeté sur les civils, dans les hôpitaux de Tchernihiv, les banlieues résidentielles de Kiev, le théâtre de Marioupol.
Il est trop tôt pour solder les comptes d’une opération d’invasion qui, en plus de la Crimée et des régions séparatistes du Donbass prises en deux-mille-quatorze, a déjà conquis quarante-neuf-mille kilomètres carrés supplémentaires de territoire ukrainien, autant que le Danemark.
Mais, après un mois de guerre, revers tactiques et pauses volontaires des troupes s’entremêlent, pour dessiner un échec possible de l’armée russe.
Quand François Hollande a jailli de sa voiture, après quatre heures trente de route, il avait un sourire grand comme ça et des fourmis plein les jambes.
Plus de dix ans qu’il n’avait pas prononcé de discours lors d’une campagne présidentielle.
Mardi vingt-deux mars, Anne Hidalgo, la candidate socialiste, donnait un meeting à Limoges, et elle avait invité le dernier président socialiste en date à venir la soutenir.
Un moment qui aurait dû être tout à fait banal, à trois semaines du premier tour, pour essayer une ultime fois de réanimer une campagne à l’agonie.
Les talibans ont ordonné, mercredi vingt-trois mars, la fermeture des collèges et lycées pour les filles en Afghanistan, quelques heures seulement après leur réouverture, a confirmé un responsable taliban.
La prévention s’impose donc à tout citoyen mais aussi à la collectivité, car cette pandémie a démontré la fragilité des systèmes de soins.
En plus de la lutte contre l’alcool et le tabac, la prévention s’appuie sur une meilleure alimentation et une vie plus active et moins sédentaire.
Bien que non satisfaisantes, les prévalences de l’alcoolisme et du tabagisme baissent en France, alors que l’inactivité physique et la sédentarité des Français s’installent.
Près de quarante-deux pour cent des adultes jeunes entre dix-huit et quarante-quatre ans sont sédentaires, c’est-à-dire assis plus de huit heures par jour, avec soixante-dix-neuf pour cent de ce temps passés devant un écran de loisir.
Que deviendront les commandes chinoises, qui représentent un quart des ventes mondiales du dernier-né des sept-cent-trente-sept, estimées à cinq cents appareils cette année?
Qu’en sera-t-il de la conduite de l’enquête sur l’accident?
Qui conseillerait aujourd’hui à un jeune étudiant en droit d’abandonner ses études pour suivre la stratégie d’ascension sociale suggérée par Vautrin?
qu’est-ce qui nous pousse à voter?
Voterait-on dans la seule optique de maintenir la continuité des institutions politiques?

For example:

ki kɔ̃sɛjʁɛt oʒuʁdyi iksy a œ̃ ʒøn etydjɑ̃ ɑ̃ dʁwa dabɑ̃dɔne sez etyd puʁ syivʁ la stʁateʒi dasɑ̃sjɔ̃ sosjal syɡʒeʁe paʁ votʁɛ̃ 
kɛs ki nu pus a vote 
votʁɛtɔ̃ dɑ̃ la sœl ɔptik də mɛ̃tniʁ la kɔ̃tinyite dez ɛ̃stitysjɔ̃ politik 

That "iksy" is not in the original text.

Or with a "4++" (katʁ plysplys) after aujourd’hui (oʒuʁdyi):

ki kɔ̃sɛjʁɛt oʒuʁdyi katʁ plysplys a œ̃ ʒøn etydjɑ̃ ɑ̃ dʁwa dabɑ̃dɔne sez etyd puʁ syivʁ la stʁateʒi dasɑ̃sjɔ̃ sosjal syɡʒeʁe paʁ votʁɛ̃ 

Or even weirder outputs:

ki ktl#tl#sɒjʀɒt oʒuʀdʌi ʒeve(kl) ɛ a ɜə ʒən etʌdiy yə dʀwa dabydəne səə etʌdə puʀ səivʀ lə stʀateʒi dasysitl# sosial sʌɡʒeʀe paʀ votʀœ 
kɒəs ki nuə pus a vote 
votʀɒətl#ə dyə lə stsl əptik də mœtniʀ lə ktl#tinəite dəə œstitʌsitl#ə politikə
ki kɔ̃sɛjʁɛt oʒuʁdyi aʁobazld() y ə əə əəə əəəəəə əə əəəə dəəəəəəə əəə əəəəə əəə əəəəə əə əəəəəəəə dəəəəəə əəəəəə əəəəəəə əəə əəəəə?
kəəə əə əəə əəə ə əəəə?
əəəəəəəəə əəə əə əəə əəəəə əə əəəəəəə əə əəəəəəəəə əəə əəəəəəəəəə əəəəəəəə?

Maybe a memory leak or something related to the modified code? I don't know if any of this rings a bell...

I checked, and I am not having this issue on the main branch.

jncasey commented 2 years ago

Hi, I made this branch to solve a problem I was having on a particular project (as I described in #108). But I haven't tested it extensively, and I've only used it on English text.

That said, I don't seem to be having the problem you're describing on my local machine using this branch.

phonemize --preserve-punctuation -l fr-fr --language-switch remove-flags phonemizing_test.txt 

returns a consistent output, and the last three lines are always

ki kɔ̃sɛjʁɛt oʒuʁdyi a œ̃ ʒøn etydjɑ̃ ɑ̃ dʁwa dabɑ̃dɔne sez etyd puʁ syivʁ la stʁateʒi dasɑ̃sjɔ̃ sosjal syɡʒeʁe paʁ votʁɛ̃? 
kɛs ki nu pus a vote? 
votʁɛtɔ̃ dɑ̃ la sœl ɔptik də mɛ̃tniʁ la kɔ̃tinyite dez ɛ̃stitysjɔ̃ politik? 

...which matches up with what little French I remember from high school and college.
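For anyone trying to reproduce the run-to-run variation described above, one option is to run the same command several times and count the distinct results. This is only a minimal sketch: `run_repeatedly` and `PHONEMIZE_CMD` are hypothetical names introduced here, and it assumes the `phonemize` command and the test file are available locally.

```python
import subprocess
from collections import Counter

# The command quoted in this thread, split into arguments.
PHONEMIZE_CMD = [
    "phonemize", "--preserve-punctuation", "-l", "fr-fr",
    "--language-switch", "remove-flags", "phonemizing_test.txt",
]


def run_repeatedly(cmd, runs=5):
    """Run `cmd` `runs` times and count each distinct (exit code, stdout) pair."""
    results = Counter()
    for _ in range(runs):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # On POSIX a negative return code means the process was killed by a
        # signal, which is how a segmentation fault would show up here.
        results[(proc.returncode, proc.stdout)] += 1
    return results
```

If `run_repeatedly(PHONEMIZE_CMD)` returns a counter with a single key, every run produced the same exit code and output; more than one key points to the kind of inconsistency reported here.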

agonzalezd commented 2 years ago

Probably due to different versions? I am currently using the following espeak-ng version:

eSpeak NG text-to-speech: 1.49.2  Data at: /usr/lib/x86_64-linux-gnu/espeak-ng-data

A new version 1.50 seems to be available but I couldn't install it with apt-get.

The English texts I tried didn't throw any errors at all. I haven't checked any other languages either.

I am also interested in using phonemizer for long texts, or at least a large quantity of them, which seems a bit complicated, as you have experienced. But if I am getting inconsistencies and wrong transcriptions, I cannot use this tool. This is why I am trying to revive your branch here :)

Thanks by the way for your quick response

jncasey commented 2 years ago

I ran the recent test on my macOS laptop, where I had to compile espeak-ng from source, so my version here is

eSpeak NG text-to-speech: 1.51-dev  Data at: /usr/local/share/espeak-ng-data

I haven't noticed any issues on my Ubuntu workstation, which has espeak-ng installed via apt. But, like I said, I've mainly been working with English text.

agonzalezd commented 2 years ago

I tried with the compiled version of espeak-ng

eSpeak NG text-to-speech: 1.51-dev  Data at: /usr/share/espeak-ng-data

and I am no longer getting this inconsistent output. So I will reopen the merge request and hope some maintainer tries it soon.

Thanks for your help!
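Since the inconsistent output appeared with 1.49.2 and went away with 1.51-dev, a script could check the installed version before phonemizing. The sketch below parses a version banner in the format quoted in this thread. The names `parse_espeak_version`, `is_known_good` and `KNOWN_GOOD` are hypothetical, and the threshold is an assumption: 1.50 was not tested here.

```python
import re

# Assumption: the only data points in this thread are 1.49.2 (inconsistent
# output) and 1.51-dev (consistent output); 1.50 was not tested.
KNOWN_GOOD = (1, 51)


def parse_espeak_version(banner):
    """Extract the numeric version tuple from an espeak-ng version banner."""
    match = re.search(r"text-to-speech:\s*(\d+(?:\.\d+)*)", banner)
    if match is None:
        raise ValueError(f"unrecognised version banner: {banner!r}")
    # A suffix such as "-dev" is ignored; only the dotted numbers are kept.
    return tuple(int(part) for part in match.group(1).split("."))


def is_known_good(banner):
    """Return True if the banner reports a version at or above KNOWN_GOOD."""
    return parse_espeak_version(banner) >= KNOWN_GOOD
```

The banner string itself can be obtained by running `espeak-ng --version` and passing its output to `is_known_good`.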

hadware commented 2 years ago

Hey, so if I read everything right, the error you got only occurs on espeak-ng 1.49 (the one available in Ubuntu's 18.04 repo), and no longer occurs on 1.51 (the espeak-ng repo version).

It seems that starting with 20.04, they only distribute espeak-ng 1.50:

[bionic (18.04LTS)](https://packages.ubuntu.com/bionic/espeak-ng) (sound): Multi-lingual software speech synthesizer [universe]
1.49.2+dfsg-1: amd64 arm64 armhf i386 ppc64el s390x
[focal (20.04LTS)](https://packages.ubuntu.com/focal/espeak-ng) (sound): Multi-lingual software speech synthesizer [universe]
1.50+dfsg-6: amd64 arm64 armhf i386 ppc64el s390x
[hirsute (21.04)](https://packages.ubuntu.com/hirsute/espeak-ng) (sound): Multi-lingual software speech synthesizer [universe]
1.50+dfsg-7build1: amd64 arm64 armhf i386 ppc64el s390x
[impish (21.10)](https://packages.ubuntu.com/impish/espeak-ng) (sound): Multi-lingual software speech synthesizer [universe]
1.50+dfsg-7build2: amd64 arm64 armhf i386 ppc64el s390x
[jammy](https://packages.ubuntu.com/jammy/espeak-ng) (sound): Multi-lingual software speech synthesizer [universe]
1.50+dfsg-10: amd64 arm64 armhf i386 ppc64el s390x

I'll look a bit more into your PR after I've merged my own PR (#111), and see what should be done. I'm wondering if we shouldn't subclass the EspeakBackend class to something like LegacyEspeakBackend, and add your changes (and any future quirks from older espeak-ng versions) into that class (although I'm not yet sure about that).
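To make the subclassing idea concrete, here is a rough sketch of what version-based backend selection might look like. Everything in it is hypothetical: the two classes are stand-ins and not phonemizer's real `EspeakBackend`, `postprocess` is an invented method, and the 1.50 cutoff is an assumption that this thread does not confirm.

```python
class EspeakBackendStub:
    """Stand-in for phonemizer's EspeakBackend; not the real class."""

    def postprocess(self, line):
        # Default behaviour in this sketch: trim surrounding whitespace.
        return line.strip()


class LegacyEspeakBackendStub(EspeakBackendStub):
    """Hypothetical subclass collecting workarounds for old espeak-ng."""

    def postprocess(self, line):
        # Placeholder: version-specific quirk handling would go here.
        return super().postprocess(line)


def select_backend(version, legacy_below=(1, 50)):
    """Pick the legacy class for versions older than `legacy_below`."""
    if version < legacy_below:
        return LegacyEspeakBackendStub
    return EspeakBackendStub
```

Keeping the workarounds in a separate subclass would leave the main code path untouched for current espeak-ng releases, which seems to be the point of the suggestion.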

jncasey commented 2 years ago

Hi, thanks for writing back. Let me explain what's going on here, just so there's no confusion.

This branch is the version I've been using locally, which solves some specific issues I was having on a project. It includes the --preserve-empty-lines feature I added in PR #103, along with some other changes I needed. @agonzalezd found my branch and opened this PR, because it also fixes problems they were having.

I added tests and feel good about the new feature in PR #103. The rest of the changes work for all of my use cases, but I haven't added any new tests or been all that thorough. In fact, I think the tests might need to be updated to deal with the separator/strip changes I made. I'm not sure how the espeak version issues @agonzalezd was having are related to the changes I made either.

The changes that are present in this branch are:

I would have submitted each fix separately if I had been the one to open the PRs, but hopefully everything together isn't overwhelming!

hadware commented 2 years ago

Oh, sorry for not being more attentive. I haven't been maintaining this package properly so far (I'm the new maintainer btw, @mmmaat gave me the keys to his magnum opus).

I'll look into your other PR ASAP, do what needs to be done to merge it, and once that's done, go back to this one. I think that's what you would have expected me to do, right?

jncasey commented 2 years ago

No worries at all! I'm just grateful that this library exists in the first place. I'm no real developer, just a hobbyist with some free time for a personal project.

And yeah, if you want to review the first PR, that'd be great. Once it's good to go, I'm happy to try implementing the other changes in my local branch in a more organized way. Or even to discuss with you if my suggested changes are wanted in the first place.

jncasey commented 2 years ago

@agonzalezd All the changes from this old testing branch have been reimplemented and merged, so you should be able to get what you need from the current master (and you can close this PR).

agonzalezd commented 2 years ago

Perfect! Thanks!