Maki9009 opened this issue 1 year ago
Unfortunately the automatic1111 webui is a nightmare to integrate with; between modifications that have to be made to the model, sampling process, and attention implementation, I suspect I'd end up wanting to claw my eyes out even more than I usually do when working with that codebase.
PRs are welcome (if they have decent code quality or people don't mind me rewriting half of it before merging) but I am absolutely not going to integrate with auto1111 myself.
That said, I am going to add a little webui to this, and I have someone lined up to make a colab notebook for it once that's done 🙂
Well, not using Automatic's webui... but using the webui from the same original repo.
The original repo has a webui? That's news to me...
Oh hey, so it does! Neat. Well, that was already on my list of things to do, so I'll try to get to that in the next day or two depending on how busy I am 🙂
Thank you :), because not gonna lie, I reached the final part of the installation and generating... and I didn't know how to do that. I kept pointing to the yaml file and nothing was happening... I'm pretty sure I was doing something wrong, but a webUI would be dope.
Ah, yeah, this CLI takes .json files rather than .yaml (I'm not a huge fan of YAML for this stuff); there are some examples in `config/prompts`, and they're pretty close to the original YAML format but not exactly the same.
I'll probably be able to throw something together in Gradio tomorrow (I'm in Australia so it's pretty late here 🙂) but no promises!
I mean .json file, yeah. I got to this point in the installation process, `animatediff --help`... tried to point to my json file... couldn't figure it out.
```
animatediff generate -c 'config/prompts/json-name.json' -W 576 -H 576 -L 16
```

should work. Oh, one thing I should mention: the path to the checkpoint in the JSON is relative to the `data/` directory, so if your checkpoint is in (say) `data/models/sd/SomethingV2_2.safetensors`, then you'd put something like this:
```json
{
  "name": "miku",
  "base": "",
  "path": "models/sd/SomethingV2_2.safetensors",
  "motion_module": "models/motion-module/mm_sd_v14.ckpt",
  "seed": [-1, -1],
  "scheduler": "ddim",
  "steps": 20,
  "guidance_scale": 8.25,
  "prompt": [
    "hatsune miku, 1girl, upper body, clouds, twintails, happy, smile, hand on own cheek, :d, looking at viewer, from side, best quality",
    "hatsune miku, 1girl, full body, blue hair, twintails, happy, smile, looking at viewer, dramatic, masterpiece, best quality"
  ],
  "n_prompt": [
    "simple colors, simple background, low quality, worst quality, bad quality"
  ]
}
```
If you save that as `config/prompts/something.json`, then you'd run:

```
animatediff generate -c 'config/prompts/something.json' -W 576 -H 576 -L 16
```

and that should Just Work™ (you'll also need to have downloaded the motion modules from the original repo's gdrive links and put them in `data/models/motion-module/`).
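If the paths don't seem to line up, here's a quick sanity check you could run from the repo root; it's just a sketch, and it assumes the `something.json` filename and the two model files from the example above:

```python
import json
from pathlib import Path

# Paths inside the JSON config are resolved relative to the data/ directory
data_dir = Path("data")
config = json.loads(Path("config/prompts/something.json").read_text())

for key in ("path", "motion_module"):
    target = data_dir / config[key]
    status = "found" if target.exists() else "MISSING"
    print(f"{key}: {target} -> {status}")
```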
[edit]: Side note: Use PowerShell, not cmd.exe :)
Alright, well, idk what's wrong atm... The first time, the installation process worked; now `python3.10 -m venv .venv` and `source .venv/bin/activate` aren't working. I think `py -m venv .venv` does the job, because it doesn't throw an error that python isn't recognized, but `source` doesn't work; it isn't recognized... Not sure what went wrong, because it worked the last time I tried to install it.
I tried both PowerShell and cmd.
I've been working on integrating it into my GUI for Stable Diffusion Deluxe and have almost got it all working. I started with the original AnimateDiff, but quickly realized that all the dependencies are "ancient" and just wouldn't work with the rest of my project, which requires the latest Diffusers and Torch... So I appreciate you refactoring things. Here's what it looks like so far... It almost works now, but I'm still hitting an error:
```
Traceback (most recent call last)

/content/animatediff-cli/src/animatediff/cli.py:294 in generate

    291     model_name_or_path = get_base_model(model_name_or_path, local_dir=
    292
    293     # Ensure we have the motion modules
  ❱ 294     get_motion_modules()
    295
    296     # get a timestamp for the output directory
    297     time_str = datetime.now().strftime("%Y-%m-%dT%H-%M-%S")

/content/animatediff-cli/src/animatediff/utils/model.py:185 in get_motion_modules

    182                 local_dir_use_symlinks=False,
    183                 resume_download=True,
    184             )
  ❱ 185             logger.debug(f"Downloaded {path_from_cwd(result)}")
    186
    187
    188 def get_base_model(model_name_or_path: str, local_dir: Path, force: bo

/content/animatediff-cli/src/animatediff/utils/util.py:43 in path_from_cwd

    40
    41
    42 def path_from_cwd(path: Path) -> str:
  ❱ 43     return str(path.absolute().relative_to(Path.cwd()))
    44

AttributeError: 'str' object has no attribute 'absolute'
```
I was originally using an absolute path to point to the motion_module and lora because I wanted it working with Windows paths too, but I modified it to a relative path like the example and it still gives that error. So close, any suggestions? Thanks.
I now have the same error as you ^
@Skquark I got it to work. Basically I threw the error message at Claude AI, and the solution is simple, just follow this:

"It looks like the issue is still occurring where a string is being passed to path_from_cwd() instead of a Path object.

Based on the traceback, in src/animatediff/utils/model.py, this line:

```python
logger.debug(f"Downloaded {path_from_cwd(result)}")
```

is still passing result as a string rather than a Path.

To fix, you'll need to update that section like we discussed earlier:

```python
result = get_motion_modules()
path = Path(result)
logger.debug(f"Downloaded {path_from_cwd(path)}")
```

The key change being wrapping result in Path() before passing to path_from_cwd.

Let me know if this helps resolve the issue! We just need to make sure a Path object is being passed rather than the raw string result."

and that made it work.
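For anyone else hitting this, here's a minimal standalone repro of the error and the fix; the path_from_cwd helper is copied from the traceback above, and the example path is just an assumption:

```python
from pathlib import Path


def path_from_cwd(path: Path) -> str:
    # Same helper as in src/animatediff/utils/util.py
    return str(path.absolute().relative_to(Path.cwd()))


# The download helper in utils/model.py returns a plain str, e.g.:
result = "data/models/motion-module/mm_sd_v14.ckpt"

# path_from_cwd(result)             # AttributeError: 'str' object has no attribute 'absolute'
print(path_from_cwd(Path(result)))  # wrapping the str in a Path first fixes it
```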
Good stuff, easy enough fix; I've been meaning to try Claude AI for those extra-long prompt requests. Got my UI working now, loving it... Submitted PR #11 with that little update, plus downgraded the minimum versions of a few requirements so it runs smoothly on Colab. My GUI is at DiffusionDeluxe.com for Google Colab or local desktop, but I haven't tested whether AnimateDiff works locally since my GPU isn't good enough, so if anyone wants to try that I'd appreciate it. Still working on refining the implementation to perfect all the features, open to all contributions.
A minor feature request so it works better in the GUI: save the image frames to the output folder as they're being generated, rather than at the end of the run, so you can preview the progress as they're being created.
Might need someone to confirm it, but I believe the frames are generated in parallel. It may be possible to display it live by running a separate process on the CPU to create a GIF of the animation while it is in progress.
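As a rough sketch of that idea (the output folder, the PNG naming, and Pillow being available are all assumptions), something like this could be run alongside the generation:

```python
import time
from pathlib import Path

from PIL import Image  # Pillow

frames_dir = Path("output/latest-run")    # assumed: wherever the CLI writes its frames
preview_path = frames_dir / "preview.gif"


def build_preview() -> None:
    # Collect whatever frames exist so far and write them out as a looping GIF
    frames = sorted(frames_dir.glob("*.png"))
    if len(frames) < 2:
        return
    images = [Image.open(f) for f in frames]
    images[0].save(
        preview_path,
        save_all=True,
        append_images=images[1:],
        duration=125,  # ~8 fps
        loop=0,
    )


if __name__ == "__main__":
    # Poll every few seconds while the render runs in another process
    while True:
        build_preview()
        time.sleep(5)
```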
BTW, that's a boatload of features you've got in your UI haha. How long did it take you to implement them all?
I'm kind of surprised there isn't more interest in AnimateDiff, though I think the lack of control over the output is keeping people away.
You might be right about them being generated in parallel, at least up to the number of context frames to condition on. No biggie if that can't be done. I've been working on developing this UI for about 10 months now... I haven't made it too public of a release yet since I wanted to make sure it's 100% functional first, but it's pretty much there. I got addicted to all the new features I'm constantly adding, and AI developments have been moving fast. I wasn't a big fan of the Auto1111 Gradio WebUI, so I had to create an alternative; glad you like it. I'm enjoying AnimateDiff so far, not perfect but pretty coherent compared to the alternatives. Especially now that you've made it unlimited length, I've got many test prompts I can't wait to run. I was also liking Pika Labs' text2video results, but it's closed-source and limited to 3 seconds. Anyway, please let me know if there are any suggestions to improve upon the implementation, like additional good LoRA safetensors to add to the list; the code is open...
A part of me wonders if training an LLM to use traditional CGI software to generate movies would allow better control and coherency over the final output than using tools like AnimateDiff and Pika Labs.
These text2video models aren't able to store the physical information of the pixels they're generating. They work well for generating a scene from the POV of a car moving forward, but once the car reverses it'll be very difficult for these models to generate images that are consistent with what the viewer has previously seen.
I think these models would be very useful for generating proofs of concept/trailers, though.
However, in hindsight, once LLMs can do that there's probably not much left that we humans actually need to do.
Great, looking forward to using it in the webui.
We have AnimateDiff working on twisty.ai as the default generator with 4 of the style models. It's the original distribution with minor tweaks, running on Replicate. Currently we are moving this package over to AWS so we can keep it warm and render more quickly. We would like to integrate your version for performance and longer video run length. Any suggestions on how we support your version? Do we just start from scratch and wrap your system in Replicate's COG wrapper? We use a gem economy for rendering, currently free. Eventually it will be priced to support the rendering expense and provide something back to the model, LoRA, and generator creators.
pls