morevnaproject-org / papagayo-ng

Papagayo is a lip-syncing program designed to help you line up phonemes (mouth shapes) with the actual recorded sound of actors speaking. Papagayo makes it easy to lip sync animated characters by making the process very simple - just type in the words being spoken (or copy/paste them from the animation's script), then drag the words on top of the sound's waveform until they line up with the proper sounds.

Feature Request - Speech / Phonetics automatic generation/ alignment #49

Open merlin2v opened 6 years ago

merlin2v commented 6 years ago

I've been wondering why the text file has to be used. Couldn't you separate the sound by its phonetics instead? That would be better, as it would translate things more accurately than the text alone. Take the following example:

I do like

This could be said as:

adɪ̈ lik _(IPA)_

vs. someone using the pronunciation:

ai dɵ lik _(IPA)_

These two pronunciations use different mouth movements, so working from the text alone can leave some of the mouth movements off.

Hunanbean commented 3 years ago

I put in a request over on the MakeHuman forums to see if someone would update that plugin for us, but no responses. I took a look at the code, but i am no programmer.

I think it has to do with ` Function Arguments

bpy.types.UILayout.column (align, heading, heading_ctxt, translate), was (align)

bpy.types.UILayout.row (align, heading, heading_ctxt, translate), was (align)

bpy.types.UILayout.template_shaderfx (), was (data)

` But again, not sure. I will take another look today, more out of curiosity than of an expectation that i can fix it.

Hunanbean commented 3 years ago

Nope. Nothing i can do at this point. Please disregard what i thought it was. EDIT: Never mind, i fixed it. On lines 491, 496, 503 and 509, just change the `=` before `EnumProperty` to a `:`

Here is the fixed version: Papagayo-NGLipsyncImporterForBlender Updated for Blender 2.93
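For readers applying the same fix by hand, here is a minimal sketch of the edit described above, written as a plain text transformation so it runs without Blender. It assumes the affected lines look like `name = EnumProperty(...)`; the property name `phoneme_set` is made up for illustration and is not taken from the add-on. The background, as far as I know, is that Blender 2.8 and later expect class properties to be declared as annotations (`name: EnumProperty(...)`) rather than assignments.

```python
import re

# Matches a class-level assignment such as `    name = EnumProperty(`.
_ASSIGNMENT = re.compile(r"^(\s*\w+)\s*=\s*(EnumProperty\()")


def to_annotation(line: str) -> str:
    """Rewrite `name = EnumProperty(...)` as `name: EnumProperty(...)`.

    Lines that do not match the pattern are returned unchanged.
    """
    return _ASSIGNMENT.sub(r"\1: \2", line)


# Hypothetical line; the real add-on uses its own property names.
old_line = "    phoneme_set = EnumProperty(name='Phoneme set', items=[])"
new_line = to_annotation(old_line)
print(new_line)
```

Only the four `EnumProperty` lines needed the change in this case, so editing them by hand is just as reasonable as scripting it.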

Hunanbean commented 3 years ago

Sorry for this spam post. With all the hassle i had editing the previous message, i did not want @aziagiles to miss the fact it is fixed due to seeing my previous 'failure to do so' message.

steveway commented 3 years ago

That is pretty nice. Yeah we need more up to date plugins. I have an Issue here for that: https://github.com/morevnaproject-org/papagayo-ng/issues/61 We should modify this script too to use the .json export data. The .json files can save some more information, like the tags and it even includes a list of the used phonemes. So before you walk through all the phonemes you can use that information to show the phonemes which will occur. That way you don't necessarily need to create shapes/mouths for all phonemes. I should also try to get the greasepencil importer there into a more usable state.
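A rough sketch of the idea above: read the list of used phonemes from the .json export before walking the data, so an importer can show only the mouths that actually need drawing. The key names `voices` and `used_phonemes` and the sample content are assumptions for illustration; check a real export for the actual layout.

```python
import json
from typing import List

# Hypothetical excerpt of an exported file; the key names are assumptions.
example_export = """
{
  "voices": [
    {"name": "Narrator", "used_phonemes": ["AI", "E", "MBP", "rest"]},
    {"name": "Robot", "used_phonemes": ["O", "E", "rest"]}
  ]
}
"""


def phonemes_to_draw(export_text: str) -> List[str]:
    """Collect every phoneme that occurs in any voice, sorted for stable display."""
    data = json.loads(export_text)
    used = set()
    for voice in data.get("voices", []):
        used.update(voice.get("used_phonemes", []))
    return sorted(used)


print(phonemes_to_draw(example_export))
```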

aziagiles commented 3 years ago

@Hunanbean Thank you. it works great.

Hunanbean commented 3 years ago

@aziagiles Glad i could help! Thanks for letting me know it is working for you. Be well!

aziagiles commented 3 years ago

@steveway I just downloaded the current master branch of Papagayo-NG Allosaurus github version but it failed to load. I guess there is a bug.

steveway commented 3 years ago

Mhh, that should work. I'm using it right now. I just merged a few little improvements from my branch. Can you update and try again? And if it fails to load, can you post the output? Best to start from a command line for that.

aziagiles commented 3 years ago

@steveway Ok. It works now.

steveway commented 3 years ago

Alright, since we now have a few plugins which can import Papagayo-NG lip sync files into other programs (Blender and Krita for now), I made a little helper tool. While testing I had to re-draw some mouths to see if the results work, and especially at the beginning I had to look up what the phonemes were supposed to look like. For that purpose I made a little tool, based on the Pronunciation Dialog, that shows you how the mouths look. Just set the mouth image set and the phoneme set, then click on the button for the phoneme you wish to know: [image: phoneme_visualizer] You can find the code here: https://github.com/steveway/papagayo-ng/blob/master/PhonemeVisualizer.py I guess we will then include this with the installer as its own .exe file for Windows. For other operating systems it has to be its own separate program: an AppImage for Linux and whatever it is macOS uses.

steveway commented 3 years ago

Apparently PyInstaller has something called "Multipackage Bundles". https://pyinstaller.readthedocs.io/en/latest/spec-files.html#multipackage-bundles With some modification to the .spec file we should be able to create multiple exe files for the main Papagayo-NG program, this Visualizer tool and maybe the Phoneme Conversion Helper Tool I made too.

aziagiles commented 3 years ago

@steveway Hello. I really like the progress you have been making on the Papagayo-NG program. But this morning, when I did an automatic breakdown of my audio using the Allosaurus breakdown, the last phoneme of each word was held through the gap between words. I know this will be helpful to lots of people, but I prefer it when the mouth goes back to the rest position in the spaces between words, just as in Papagayo-NG version 1.6.3. I thought unchecking the 'Hold Phonemes during playback' option in the Preferences/Settings window would solve the problem, but it didn't.

steveway commented 3 years ago

@aziagiles Yes, I think it makes more sense during playback to only insert rest phonemes if we are not currently inside a word. I've changed the logic a bit so it will behave like that during playback. Note that this only affects playback in Papagayo-NG; it does not change the data. In my quick test it looks quite good, even with the automatic recognition by Allosaurus. My Grease Pencil and Krita plugins have a checkbox for the rest frames after words and sentences.

aziagiles commented 3 years ago

@steveway You are right, it makes a lot of sense the way it is already. My request is rather to fix the 'Hold Phonemes during playback' option, so users can choose. To be honest with you, both work well in different cases. As of now, whether one ticks the 'Hold Phonemes during playback' option or not, it still holds the phonemes during playback.

steveway commented 3 years ago

I see, the setting was sometimes not loading correctly because QSettings does not handle bools correctly when loading from .ini files. I fixed that. I also split this one setting into two: one will display the rest frame between every phoneme, and the other will only display rest frames between words. [image: rest_frames_settings] The descriptions of the settings are also a bit clearer this way.
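To illustrate the pitfall mentioned above: an .ini file stores values as text, so a saved `False` can come back as the string `"false"`, which Python treats as truthy. The helper below is a minimal sketch of one way to coerce such values; `to_bool` is a hypothetical name and not necessarily how Papagayo-NG fixed it.

```python
def to_bool(value, default=False):
    """Coerce a value read back from an .ini file to a real bool."""
    if isinstance(value, bool):
        return value
    if value is None:
        return default
    if isinstance(value, str):
        # Text values need an explicit comparison; bool("false") is True.
        return value.strip().lower() in ("true", "1", "yes", "on")
    return bool(value)


# The naive conversion gets the saved "false" wrong, the helper does not.
print(bool("false"), to_bool("false"))
```

Wrapping every boolean read in a helper like this keeps the behaviour the same whether the settings backend returns real bools or strings.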

aziagiles commented 3 years ago

@steveway Thank you very much for the prompt fix. Hope you're having a nice day. I have a last recommendation for the Papagayo-NG software. It's still related to Holding back phonemes. I was wondering if the code could be modified in such a way that, a user can choose whether the phonemes should be held or not when exporting the file in .dat format or other formats. In the animation I'm currently working on, there are many cases I'll need them not to be held back. I know in your Grease Pencil importer addon, there is a place to check that, but for us using the Lip Sync Importer addon, we can't modify at that stage.

aziagiles commented 3 years ago

@steveway Just downloaded the recent master branch and thanks so much for considering the above recommendation. I believe the ''Show Rest Frames after Words'' function together with the ''Apply Rest Frame settings on Export'' function were actually it. But I think the ''Show Rest Frames after Phonemes'' function can be taken off as I don't believe anyone will ever use it.

aziagiles commented 3 years ago

@steveway Just tested the software again, and it seems like the ''Apply Rest Frame settings on Export'' function didn't work. I think it needs fixing.

steveway commented 3 years ago

Yes, it's not yet doing anything. 3dcc3b7e7c625f70cd5050d9956d1713a2a6cf6a It's a little bit more difficult to add this to the export methods. But I already have an idea how to add it without changing the code too much.

steveway commented 3 years ago

Alright, I've added some functionality. For now it only adds rest frames between words on export to MOHO. And it will only add the rest frame if there is space free in between. The whole thing was a bit more confusing than it needed to be because it seems that MOHO starts at 1 while most other software starts at 0. I'm not sure if we want to add this functionality to the other export options. It's better to add some logic during import and keep the data unchanged. Of course for MOHO that change makes sense since we can't update the importer into their software. Well, we could ask them if they want to support the new JSON format for input and while they are at it they can add the logic to insert rest frames during import depending on user choice like my Blender and Krita Plugins do. Maybe they will be responsive to this since Mike Clifton is back at the wheel of MOHO.
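The export behaviour described above could be sketched roughly like this. It is an illustration under stated assumptions, not the actual Papagayo-NG code: the tuple layout of `words`, the function name `export_switch_lines` and the sample data are hypothetical, frames are 0-based internally, and the `MohoSwitch1` header with 1-based `frame phoneme` lines reflects my understanding of the .dat format.

```python
def export_switch_lines(words, frame_offset=1):
    """Return Moho-style switch lines, adding 'rest' in free gaps between words.

    `words` is a list of (start_frame, end_frame, phonemes) tuples with 0-based
    frames, where `end_frame` is the first frame after the word and `phonemes`
    is a list of (frame, name) pairs.
    """
    lines = ["MohoSwitch1"]
    for index, (start, end, phonemes) in enumerate(words):
        for frame, name in phonemes:
            # Shift from 0-based internal frames to the 1-based export.
            lines.append(f"{frame + frame_offset} {name}")
        is_last = index == len(words) - 1
        # Only insert a rest frame when the next word does not begin right away.
        if not is_last and words[index + 1][0] > end:
            lines.append(f"{end + frame_offset} rest")
    return lines


words = [
    (0, 6, [(0, "AI"), (3, "E")]),
    (6, 10, [(6, "O")]),      # starts immediately, so no rest before it
    (14, 18, [(14, "MBP")]),  # four free frames before this word
]
print("\n".join(export_switch_lines(words)))
```

Keeping the offset as a parameter makes the off-by-one explicit in a single place, which is where the confusion mentioned above tends to creep in.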

aziagiles commented 3 years ago

@steveway Hello Steve. I'm very happy with your last comment because it matches my line of thought. I believe 'Show Rest Frames after phonemes' should be taken off and replaced with something like 'Show Rest Frames after Sentences': while the words within a sentence are being said, phonemes are held, but when there is a silence in the dialogue of, say, 8 frames or more (1/3 of a second at 24 fps), the rest frame appears. And 'Show Rest Frames after Words' should just be the same type of phoneme breakdown as in Papagayo-NG version 1.6.3 and lower, where phonemes are not held. Below is a test video of a lip sync exercise I did, illustrating the concern.

https://user-images.githubusercontent.com/9162114/133477410-284d9e1c-23cb-4064-8528-3af45088b218.mp4
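The proposed 'Show Rest Frames after Sentences' rule can be sketched as a simple gap threshold. This is only an illustration of the suggestion, assuming words are available as sorted (start_frame, end_frame) pairs where `end_frame` is the first frame after the word; the function name and sample data are hypothetical.

```python
def rest_frames_after_silence(words, min_gap=8):
    """Return the frames at which a rest mouth should appear.

    A rest is placed at the end of a word only when the silence before the
    next word lasts at least `min_gap` frames (8 frames is 1/3 s at 24 fps).
    """
    rests = []
    for (_, end), (next_start, _) in zip(words, words[1:]):
        if next_start - end >= min_gap:
            rests.append(end)
    return rests


# Gaps between consecutive words: 2, 8 and 1 frames.
words = [(0, 10), (12, 20), (28, 40), (41, 50)]
print(rest_frames_after_silence(words))
```

With `min_gap=1` the same function places a rest after every word that is followed by any gap at all, which is close to the older per-word behaviour.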

steveway commented 3 years ago

@aziagiles That sounds like a useful feature. There is silence detection in Pydub, I've tried before to split the sounds into words based on that. But it's a bit fiddly, so I didn't get a good result yet. I'll have to do some testing to see how we can combine the information we have from Allosaurus and Pydub to get something usable, that might take some time.

aziagiles commented 3 years ago

@steveway OK

aziagiles commented 3 years ago

@steveway I just downloaded the current master branch of Papagayo-NG Allosaurus github version but it failed to load. I guess there is a bug.

steveway commented 3 years ago

If you downloaded from my fork then it's best to use the master branch. That is the most complete one.

aziagiles commented 3 years ago

@steveway Ok. Let me try once again.

steveway commented 3 years ago

Something more related to the original topic: I just found Vosk, which also seems to have some support for getting phonemes out of its recognition results. https://github.com/alphacep/vosk-api/pull/528 While the results from Allosaurus are very good, this might be an interesting alternative.

aziagiles commented 3 years ago

@steveway Adding it in Papagayo-NG alongside Allosaurus and Rhubarb will be a great idea. That will be awesome.

aziagiles commented 1 year ago

@Hunanbean Hello bro. I just downloaded the "Papagayo-NG Lipsync Importer For Blender" addon from your GitHub page and noticed that it doesn't work in Blender 3.5.x. Could it please be updated?

aziagiles commented 1 year ago

@Hunanbean I believe the problem comes from modifications made to the Pose Library function in recent versions of Blender, so the addon script will need some retouching in order to function properly with them.

Hunanbean commented 1 year ago

@aziagiles Howdy! Shoot. Ok, i just tested it and am experiencing the same thing. I will talk to CGPT4 about it, because i am still no programmer :) I'll see what we can do

aziagiles commented 1 year ago

@Hunanbean Ok. I get your point. Best wishes as you fix the issue. I believe you can do this.

Hunanbean commented 1 year ago

@aziagiles Ok, i think i've got it worked out. I will post the fix to the Git as long as it works for you too. To fix it real quick, replace `from bpy.props import *` with `from bpy.props import EnumProperty, FloatProperty, StringProperty, PointerProperty, IntProperty`

aziagiles commented 1 year ago

@Hunanbean Ok. will do just that and give you feedback.

Hunanbean commented 1 year ago

> @Hunanbean Ok. will do just that and give you feedback.

In case it is easier, i've gone ahead and updated the git at https://github.com/Hunanbean/Papagayo-NGLipsyncImporterForBlender
I will try again if that does not work.

aziagiles commented 1 year ago

@Hunanbean What should I fill in for the pose library field, or should I leave it empty? The default would be to put "Current File", as that's where my mouth shapes are, but when I do, it doesn't work.

[image: update]

Hunanbean commented 1 year ago

@aziagiles Ok, the biggest issue is that i use Shape Keys, which got fixed by the change in that line. I am not set up to even try Pose Libraries. Is there a way you can send me a generic file with a pose library i can use to test with?

aziagiles commented 1 year ago

@Hunanbean I'm receiving error messages. I'm on a Windows 64-bit computer. [image: update]

Hunanbean commented 1 year ago

@aziagiles If you have some generic .Blend with a configured Pose library and a test model, i will keep trying to fix the code, but i do not even know how to make pose libraries yet. I am not sure if uploads can be done here, so you could send it to hunanbean.learning@gmail.com if you have such a file

aziagiles commented 1 year ago

> @aziagiles Ok, the biggest issue is that i use Shape Keys, which got fixed by the change in that line. I am not set up to even try Pose Libraries. Is there a way you can send me a generic file with a pose library i can use to test with?

Ok. Let me prepare a file and send it. I use Grease Pencil, and my phonemes are controlled by a bone.

aziagiles commented 1 year ago

> hunanbean.learning@gmail.com

I've just replied to your email, with the blend file attached.

Hunanbean commented 1 year ago

> I've just replied to your email, with the blend file attached.

Ok, i just got it, thanks. I will see what i can do. This is going to take some time though, so probably will not have much until tomorrow or the day after as they are working on the power here, and it is scheduled to be down for several hours :/

aziagiles commented 1 year ago

> > I've just replied to your email, with the blend file attached.
>
> Ok, i just got it, thanks. I will see what i can do. This is going to take some time though, so probably will not have much until tomorrow or the day after as they are working on the power here, and it is scheduled to be down for several hours :/

Ok. Best of luck.

aziagiles commented 1 year ago

@Hunanbean Hello Hunanbean. I just sent you 2 emails. The second one has another blend file attached, containing mouth shapes I made in Blender 3.51 for testing the lipsync addon. Please check.