steveway / papagayo-ng

Papagayo is a lip-syncing program designed to help you line up phonemes (mouth shapes) with the actual recorded sound of actors speaking. Papagayo makes it easy to lip sync animated characters by making the process very simple - just type in the words being spoken (or copy/paste them from the animation's script), then drag the words on top of the sound's waveform until they line up with the proper sounds.
http://steveway.github.io/papagayo-ng/
Crash on usage of Allosaurus model eng2102 #35

T-oasterO-ven closed this issue 2 years ago

T-oasterO-ven commented 2 years ago

Upon initiation of Allosaurus voice recognition model "eng2102" I get this dialogue pop-up:

[Image: Screenshot (750)]

Then the program crashes to the desktop.

Note: I fully redownloaded and reinstalled the newest version with no other settings changed. All download extensions were also reinstalled properly, and this test was run after the usual FFmpeg restart.

steveway commented 2 years ago

Ah yes, Allosaurus spit out a phoneme which is not in the ipa_cmu.json conversion. But the cause of the crash is that Qt doesn't like that I try to show a message box at that point, because we are in another thread. I'll have to show the information about the missing phoneme another way. For now, if you start papagayo-ng from a command line, it should also print which phoneme it had problems with. You could then add a conversion for it to the ipa_cmu.json file. Please also tell me here which one it is, so we can add it for everyone in the future.
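For anyone who would rather script the workaround than edit the file by hand, here is a minimal sketch. It assumes ipa_cmu.json is a single flat JSON object that maps IPA symbols to CMU phoneme names, which may not match the real file layout. The function name `add_conversion` is made up for this example and is not part of Papagayo-NG.

```python
import json
from pathlib import Path


def add_conversion(mapping_path, ipa_symbol, cmu_phoneme):
    """Add one IPA -> CMU entry to a flat JSON mapping file and save it.

    Hypothetical helper: assumes the file is one flat JSON object.
    """
    path = Path(mapping_path)
    mapping = json.loads(path.read_text(encoding="utf-8"))
    mapping[ipa_symbol] = cmu_phoneme
    # ensure_ascii=False keeps IPA characters readable instead of \u escapes.
    path.write_text(
        json.dumps(mapping, ensure_ascii=False, indent=4),
        encoding="utf-8",
    )
    return mapping
```

Reading and writing with an explicit UTF-8 encoding matters here, because on Windows the default encoding can mangle IPA characters.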

T-oasterO-ven commented 2 years ago

C:\Program Files (x86)\Papagayo-NG>papagayo-ng.exe
Misc
Misc
Voice
Changed!
2209
valid sound
self.sound.Duration(): 2
frameRate: 24
soundDuration1: 53
soundDuration2: 54
Missing conversion for: ɹ̩

C:\Program Files (x86)\Papagayo-NG>

After adding the conversion, the 3-second audio file processed nicely. However, after running "The Missile Knows Where It Is..." through Papagayo-NG, there were certainly more missing conversions. I compared "stan1293.txt" (what the eng2102 model uses) to the "ipa_cmu.json" list. I believe this is what is missing (check for yourself just in case):

"ɹ̩": "ER", "t͡": "CH", "d͡ʒ": "JH",

stan1293.txt
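The comparison described above can also be done mechanically. The following is a rough sketch, not the method actually used in this thread. It assumes the inventory can be read as one phoneme per line and that the conversion file loads into a plain dict; both are assumptions about the file formats. `missing_conversions` is a hypothetical name.

```python
def missing_conversions(inventory_lines, mapping):
    """Return inventory phonemes that have no key in the mapping.

    Keeps input order and skips blank lines and repeats.
    """
    missing = []
    for line in inventory_lines:
        phoneme = line.strip()
        if phoneme and phoneme not in mapping and phoneme not in missing:
            missing.append(phoneme)
    return missing


# In-memory example; real use would read the lines and the dict from the files.
inventory = ["a", "ɹ̩", "d͡ʒ", ""]
mapping = {"a": "AA"}
print(missing_conversions(inventory, mapping))
```

The check is an exact string comparison, so two symbols that look identical but are encoded with different combining characters would still be reported as missing.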

steveway commented 2 years ago

Alright, I went through that list and added the missing phonemes from there. I also changed that part so that instead of trying to display a message box it will log that message. And all other parts where we spit information out via print are now replaced by logging. So in your Appdata folder for Papagayo-NG there should now be an errorlog.log file after you run it. Actually, I will likely rename it, since we are logging not only errors but information in general.
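To illustrate the idea of logging instead of opening a dialog, here is a minimal sketch using Python's standard `logging` module, which is safe to call from worker threads. This is not the actual Papagayo-NG code. The logger name, the function names, and the choice to skip unmapped phonemes are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("papagayo_sketch")  # hypothetical logger name


def configure_logging(log_path):
    """Send log records to a UTF-8 file so IPA symbols are written intact."""
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


def to_cmu(ipa_phonemes, mapping):
    """Convert IPA symbols to CMU phonemes, logging any without a mapping."""
    converted = []
    for symbol in ipa_phonemes:
        cmu = mapping.get(symbol)
        if cmu is None:
            # Logging works from any thread; a GUI dialog would not.
            logger.warning("Missing conversion for: %s", symbol)
            continue  # assumption: unmapped phonemes are simply skipped
        converted.append(cmu)
    return converted
```

With this approach a missing phoneme leaves a line in the log file and processing continues, rather than the worker thread touching the GUI.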

T-oasterO-ven commented 2 years ago

Thanks for the fixes. Quick update: I checked the new ipa_cmu.json list, and I'm pretty sure ɹ̩ is the "ER" sound, or at least that is what these sources state:

University of Manitoba, TeflPedia

steveway commented 2 years ago

Ah yes, you are right. "ER" seems to be a closer fit than just "R", so I fixed that now. I also renamed the log file so it doesn't imply that it contains only errors.