BrokenSource / DepthFlow

🌊 Image to → 2.5D Parallax Effect Video. High quality, user first. Free and Open Source Leiapix alternative
https://brokensrc.dev
GNU Affero General Public License v3.0

Awesome project #5

Closed: davidmartinrius closed this issue 1 week ago

davidmartinrius commented 5 months ago

This project is absolutely awesome. The inference works so fast. I am thinking of integrating it into the Stable Diffusion UI. There is currently a similar plugin for the SD UI called https://github.com/thygate/stable-diffusion-webui-depthmap-script

But that one is painfully slow: it first creates a depth map, then a mesh, and then creates a video from the mesh. The mesh generation alone can take 2-3 minutes and only works on the CPU (and once the mesh is generated, you have to wait again for the video generation). The end result is almost the same as your project's, with the clear difference that your project's results are much better in quality and speed, and it also uses CUDA.

So, congratulations on your project!

David Martin Rius

Tremeschin commented 5 months ago

Thanks a lot & I love this idea ❤️

With some mechanism to define presets for the camera movements and effect intensities, and maybe also post FX, this will be super

I never studied the Stable Diffusion world deeply enough; I'll keep this idea in mind. Not much bandwidth for now though

ps. I have a private shader with some post FX, so I've seen things 😂

HamletEagle commented 3 months ago

Hello @Tremeschin,

This project is amazing. Any plans to add the ability to define some movement (left/right/circular/zoom/dolly zoom) options from the command line so that we can animate the resulting video? It would make your work even more amazing than it already is.

You can take a look at https://www.leiapix.com/ to see what the end result would look like after having the option to define movement/zoom.

Thank you!

Tremeschin commented 3 months ago

Thanks @HamletEagle for the kind words and the resource on possible presets! And for one more $competitor to brand this as an alternative to, heheh

Well, you can already change anything manually; it's just a matter of editing the .update function on the main class. But that requires some practice in shaping animations with math functions, and I'm aware it isn't trivial, accessible or convenient. For a rough idea, see the sketch below.
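
The class and attribute names here (ParallaxScene, offset_x, zoom) are just placeholders for illustration, not exactly how the code looks:

```python
import math

class ParallaxScene:
    """Hypothetical stand-in for the main scene class, for illustration only."""

    def update(self, time: float) -> None:
        # Horizontal sway: a sine wave gives a smooth left/right camera motion
        self.offset_x = 0.2 * math.sin(2 * math.pi * 0.25 * time)  # 0.25 Hz, ±0.2 amplitude
        # Slow push-in: zoom ramps from 1.0 to 1.1 over the first 10 seconds
        self.zoom = 1.0 + 0.1 * min(time / 10.0, 1.0)

scene = ParallaxScene()
scene.update(1.0)  # evaluates the motion at t = 1 s
```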

I just haven't tried to mock up a presets system for the scenes yet, as I've been focusing on the raw backend lately. I'll prioritize this, as it's a recurrent request on this project and on two other "spin-offs" of ShaderFlow (Pianola and SpectroNote)

(note to self) I should stop trying to solve everything on the first iteration and just make an MVP :)

Tremeschin commented 3 months ago

(off topic (?))

Wait, what 😳, they charge 5 dollars for just 10x 4K videos?? (couldn't find the length, likely looped animations < 10 sec?)

I can generate those 10x 4K60 10-second videos here in about 90 seconds
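
In frame terms, rough arithmetic on those numbers (assuming 10-second clips):

```python
# Back-of-envelope: 10 videos x 10 s x 60 fps rendered in ~90 s of wall time
videos, seconds_each, fps, wall_time = 10, 10, 60, 90
total_frames = videos * seconds_each * fps      # 6000 frames
print(total_frames / wall_time)                 # ~66.7 rendered 4K frames per second
```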

I did some math previously for DepthFlow, and it's so fast and cheap to run locally that my system alone could probably serve something like 30k monthly active users, each generating 400x 1080p videos per month (or 100x 4K).

If it was 720p, and considering that many users barely use the service, we're talking 1M+ users

And the main bottleneck isn't even the GPU but the CPU encoding time for the video, which could be done on a second GPU in the system dedicated to NVENC (even better now that NVIDIA recently increased the number of simultaneous encoding streams). A sketch of that idea is below.
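
Something along these lines with FFmpeg; the flags and pipeline here are illustrative, not the exact encoder invocation used internally:

```python
# Sketch: encode a rendered frame sequence with NVENC on a second GPU
# instead of libx264 on the CPU ("-gpu 1" selects the second NVENC-capable GPU).
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-framerate", "60",
    "-i", "frames/%05d.png",   # hypothetical rendered frame sequence
    "-c:v", "h264_nvenc",      # hardware encoder instead of CPU libx264
    "-gpu", "1",               # pin encoding to the dedicated second GPU
    "-b:v", "20M",
    "output.mp4",
], check=True)
```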

Their depth maps are really better than the one currently used in DepthFlow (ZoeDepth). I'm not into ML enough to develop my own, but absolutely be sure to link better new OSS ones!

davidmartinrius commented 3 months ago

Actually, you could implement serverless instances on runpod.io for each inference/user execution, so you would not need to maintain any infrastructure or queues. Or, if you wanted to use your own GPU, you could leverage the NVIDIA Triton Inference Server: it manages queues and balances VRAM usage for simultaneous inferences.

And instead of using ZoeDepth you could use https://github.com/zhyever/PatchFusion?tab=readme-ov-file

It is much better than ZoeDepth

For the frontend you could use React/React Native, a backend with Django or Laravel, and a payment gateway with Stripe or a crypto wallet

Tremeschin commented 3 months ago

Oh, nice! Definitely will educate myself on those

I'd probably need to containerize the code and make proper APIs for it; that would be a fun journey

Gradio now supports OAuth and it's pure Python, so making a UI with it and integrating Stripe seems the easiest route for me, but that's more of a "have your own private or public instance" approach, which is why I hinted at local infra (+ reverse proxy). Something like the sketch below.
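
Roughly what I have in mind; render_parallax is a made-up wrapper here, not the real call:

```python
import gradio as gr

def render_parallax(image_path: str, intensity: float) -> str:
    # ...call the renderer here and return the path of the resulting video...
    return "output.mp4"

demo = gr.Interface(
    fn=render_parallax,
    inputs=[
        gr.Image(type="filepath", label="Input image"),
        gr.Slider(0.0, 1.0, value=0.5, label="Effect intensity"),
    ],
    outputs=gr.Video(label="Parallax video"),
)

# Basic auth for a private instance; OAuth/Stripe would sit on top of this
demo.launch(auth=("admin", "change-me"), server_port=7860)
```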

Too bad I have an innate fear of databases and user authentication; I feel like I'll mess it up, not if but when 🙈

I could try spinning up some service out of it, yeah. I'm betting hard on the OSS projects and "ignoring" the traditional path (yet to graduate, also), but I feel like building a community first, since you can't buy that, and then doing it together with someone more capable than me in web dev and cloud..


..or just don't, and live off sponsorships and donations. I don't know if I'm doing the right thing; it's a difficult transition in life

I have some support, but my family just sees it as "nice to see on social media but useless 3D animations", "silly piano keys being pressed on the screen" and "but why did you do electrical engineering?" 😆