Closed Technologicat closed 11 months ago
A quick look at the source code shows that this comes from the loaders in `talkinghead/tha3/poser/modes/`.
The call sequence is as follows:
`talkinghead/tha3/app/manual_poser.py` calls `tha3.poser.modes.load_poser` with `"separable_float"`, so this dispatches to `tha3.poser.modes.separable_float.create_poser`. The function is called with no arguments other than `device`, so it uses its default configuration.
The `create_poser` function has the following default directory that is used in its file paths:
dir = "talkinghead/tha3/models/separable_float"
This will trigger an exception much later, when trying to use the model and the file is not found at the expected path.
But since the live mode is working fine, and it runs with a different cwd (the top level of SillyTavern-extras), I'd wager the default directory should not be changed.
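To illustrate why the same relative default works for the live mode but not for the manual poser, here is a minimal sketch. It only relies on the default directory quoted above; the temporary directory tree and the helper names (`working_directory`, `default_dir_visible_from`, `demo`) are made up for this illustration and are not part of talkinghead.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path

# Default directory as quoted above. It is relative, so it resolves against the cwd.
DEFAULT_DIR = "talkinghead/tha3/models/separable_float"


@contextmanager
def working_directory(path):
    """Temporarily change the cwd, restoring the previous one afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def default_dir_visible_from(cwd) -> bool:
    """Return True if DEFAULT_DIR exists when the process runs in `cwd`."""
    with working_directory(cwd):
        return Path(DEFAULT_DIR).is_dir()


def demo():
    """Build a stand-in directory tree and probe it from two working directories."""
    with tempfile.TemporaryDirectory() as tmp:
        extras_root = Path(tmp) / "SillyTavern-extras"  # stand-in for the real checkout
        (extras_root / DEFAULT_DIR).mkdir(parents=True)
        from_top_level = default_dir_visible_from(extras_root)
        from_talkinghead = default_dir_visible_from(extras_root / "talkinghead")
    return from_top_level, from_talkinghead
```

Calling `demo()` probes the path once from the stand-in top-level folder (the live mode's cwd) and once from the `talkinghead` subfolder (where the manual poser has to be started for its imports to work).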
Maybe the easiest solution is to just adapt the imports in the manual poser, so that we could run both apps from the top level of SillyTavern-extras?
EDIT: Meh, just changing the imports in `manual_poser.py` to `from talkinghead.tha3...` doesn't work - then `talkinghead/tha3/poser/modes/load_poser.py` (which I suppose is shared between the live mode and the manual poser app, so we can't make any changes there that would break live mode) can't find `tha3.poser.modes.separable_float`.
The `create_poser` function takes a `module_file_names` dict, which can take a path for each model part, but the caller `load_poser` doesn't take such an argument from its caller, and in any case, using it would break the division of responsibilities (since it's each model's job to know its filenames).
Probably the best solution would be to allow providing an override for the default directory (while keeping the original filenames), and use that in the manual poser?
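As a rough sketch of what such an override could look like: the function name `default_module_file_names`, the `models_dir` parameter, and the `"eyebrow_decomposer"` dict key are all hypothetical and are not taken from the talkinghead source. Only the default directory and the filename `eyebrow_decomposer.pt` appear in this thread, so the filename table is deliberately left incomplete.

```python
from pathlib import Path
from typing import Dict, Optional

# Default directory quoted earlier in this thread (relative to the cwd).
DEFAULT_MODELS_DIR = "talkinghead/tha3/models/separable_float"

# The model stays responsible for its own filenames.
# Only this one filename is mentioned in the thread; the key name is an assumption.
MODEL_FILENAMES = {
    "eyebrow_decomposer": "eyebrow_decomposer.pt",
}


def default_module_file_names(models_dir: Optional[str] = None) -> Dict[str, str]:
    """Map each model part to a path, using `models_dir` instead of the default if given."""
    base = Path(DEFAULT_MODELS_DIR if models_dir is None else models_dir)
    # as_posix() keeps the output identical across platforms for this illustration.
    return {part: (base / filename).as_posix()
            for part, filename in MODEL_FILENAMES.items()}
```

With this shape, the live mode would keep calling the function with no argument, while the manual poser could pass a directory that is correct for its own cwd, for example `default_module_file_names("tha3/models/separable_float")`.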
Maybe I'll just wait to see if anyone has better ideas. :)
I have no idea, sorry, I didn't develop it. Honestly, this module feels like an unnecessary burden to support. The results are uncanny and it's very unoptimized. Any reason you're specifically looking into it and not a real live2d plugin?
Ok.
Maybe I should explain my use case. It's not AItubing. :)
At the moment, I'm looking into LLMs mainly for two target applications: retrieval-powered question answering for serious scientific use (to be able to skim papers faster), and AI-powered storywriting for personal use. I'm also interested in an AIDungeon-like interactive text adventure/roleplay use case, but haven't had the time to properly set up and play an adventure yet.
The thing is, I find that interfacing with a faceless LLM feels cold. So I want the system to present itself as a virtual anime character.
Of course, I'm aware that an LLM is a simulator, not a character (a good writeup of the discussion surrounding this is Shanahan et al., 2023); but presenting itself as a specific character yields a nice user interface. Original character, mind you - I'm not interested in making existing anime or video game characters answer my questions about numerical methods and such.
So, I was intrigued by the prospect of an automatic tool to generate character expressions. If I only need to provide one static image per character, that's much more manageable than 28 (when using `classify` powered by DistilBERT). This would allow agile editing of the character's visual appearance.
Instead of a whole evening of inpainting in Stable Diffusion, creating a new character (or modifying the look of an existing one) would only take 30 minutes at most, including everything: playing around with the Stable Diffusion prompt, rerolling txt2img to get the perfect shot, and finally automatically removing the background with rembg.
As for why, agile is not just a software development methodology, it's a lifestyle.
Generating a set of static images by an offline (batch) process would be fine, which is why I was looking into the manual poser.
The live mode is just a nice bonus. It would make the character feel more alive, and as I said, it works fine, but as you said, it needs optimization.
As for live2d, it's proprietary so I'd rather avoid it, and I'm not even sure if it runs on Linux. Their download page didn't say.
Anyway, while I don't have a lot of time for extra software development projects, I might take a look if I can get the manual poser working.
In any case I think `talkinghead` is a really cool technology demo, so if possible, I'd prefer keeping it around. This and `websearch` are why I installed `extras` in the first place.
I got the manual poser working!
Only needed to pass the correct model paths without breaking the `app.py` use case.
Small PR to follow.
Obviously, a single static image has its limitations, but in a reasonable parameter range, at least for the example character... wow. Just, wow.
This is exactly what I was looking for.
Posted the PR. See #203.
Closing this since the app works now. Let's track the ongoing changes in the PR ticket.
I'd like to produce static expression images using this, but I'm having trouble running the manual poser app.
`extras` conda venv activated when I run this.
The manual poser seems to expect a `live2d` folder; is live2d a dependency, or is a blank folder enough? By a quick look in the source code, this doesn't seem to be actually used except as a target directory. So, I created a blank `talkinghead/live2d` folder.
Running `python -m tha3.app.manual_poser`, in a terminal at the `talkinghead` top-level folder, the manual poser application starts up.
Note that the manual says to instead use `python tha3/app/manual_poser.py`, but this doesn't work. The app crashes upon startup, because `sys.path` differs from what imports in the app expect. When running with `-m`, the cwd is added to `sys.path`, so that looking up modules like `tha3.poser` or `tha3.util` works (when we invoke the `python -m ...` command in the folder that contains the `tha3` folder).
But with the latter option (running a `.py` file, not a module), it is the script's containing folder, i.e. the `tha3/app` folder, that is added to `sys.path`. Then trying to look up modules in a module path beginning with `tha3` will not work, since the top level is already at `tha3.app`. (Relative imports wouldn't work, either, because in this mode the main script is not treated as a module - yeah, Python's import system can be confusing.)
Anyway, this way I got the app to start, and the window appeared, but I still couldn't get it to do anything.
Loading an image to test with (e.g. `talkinghead/tha3/images/example.png`), the app raises the following exception:
Note the path.
The file `tha3/models/separable_float/eyebrow_decomposer.pt` does exist (I've actually installed the full model package mentioned in the manual), but we're already running this inside `talkinghead`.
Trying to run the manual poser from the top-level SillyTavern-extras folder as `python -m talkinghead.tha3.app.manual_poser` does not work either, because then the module paths will be wrong again.
Any ideas?
EDIT: Note that the live mode of talkinghead is working fine; it's only the manual poser app I'm having trouble with.