Hello, really nice little tool! Would it be possible to render the video directly from the CLI, using an image and a depth map?
Yes!
broken depthflow parallax --image /path/to/background.jpg --depth /path/to/depth.jpg main -r -o ./output.mp4
or just -i image.ext -d depth.ext
You can see other rendering options with depthflow --help
I recommend using -q ultra --ssaa 1.5 for better quality, but beware it's heavy on the GPU.
One minor inconvenience is that, for now, the output video's resolution doesn't follow the input image's, so you also have to pass the target resolution, e.g. -w 1080 -h 1920 for a 9:16 short.
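Putting the flags above together, a full command might look like this (paths and resolution are placeholders):

broken depthflow parallax -i /path/to/image.jpg -d /path/to/depth.jpg main -r -w 1080 -h 1920 -q ultra --ssaa 1.5 -o ./output.mp4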
very nice, thank you!
Now it renders the video, but I'd like to achieve a zoom effect, or even better a dolly zoom; right now it just moves a bit from side to side...
Hmm, interesting. I first tried the recent Marigold depth estimation (https://marigoldmonodepth.github.io/) because it produces better depth maps than ZoeDepth, which you use in your repo.
But when I use a Marigold depth map there is almost no motion in the video, in contrast to when I don't pass a depth map and let ZoeDepth compute it.
Is there some reason it should be like this?
You can change the animation parameters in the .update() method in the file DepthFlow/DepthFlow.py
There's a zoom-out effect commented out at the end of the function; you can change the dolly zoom formula at the top, as well as self.parallax_height = (number 0 to 1) for the intensity of the effect.
There's no real presets mechanism yet; for now, graphing tools like Desmos are our best friend for defining the animations.
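For a concrete (hypothetical) example of what such an edit could look like, the body of update() in DepthFlow/DepthFlow.py might be changed to something like this, reusing the preset formulas that are quoted later in this thread:

    import math  # at the top of the file, if not already there

    def update(self):
        # Intensity of the parallax effect (0 to 1)
        self.parallax_height = 0.3
        # In-and-out dolly zoom (one of the commented preset formulas)
        self.parallax_dolly = 0.5 * (1 + math.cos(self.time))
        # Slow zoom-out on the start (another commented preset formula)
        self.parallax_zoom = 0.6 + 0.4 * (2/math.pi) * math.atan(3*self.time)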
Thank you for helping me out. It turns out I used a PNG image as the depth input; when converted to JPG it works better, but it produces many more artifacts than ZoeDepth.
You can change the animation parameters in the .update() method in the file DepthFlow/DepthFlow.py
I will try it!
Just one last thing: is there any chance to input more than one image / a whole folder?
This Marigold link is interesting, I'll take a look; it could potentially become the default one, and I like that it's built on diffusers.
My first guess for why there was almost no motion is that the output maybe wasn't normalized to 0 and 1, or the "wrong" data type was being read?
When you open the image, is it black and white, or does it have that color palette from their showcase? DepthFlow expects a black and white image.
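For reference, a minimal sketch of turning an arbitrary single-channel depth output into that kind of normalized black-and-white image (file names are placeholders; this is not DepthFlow's own loader):

    import numpy as np
    from PIL import Image

    # Assumes a single-channel depth map (any bit depth / value range)
    depth = np.asarray(Image.open("depth_raw.png"), dtype=np.float32)

    # Normalize to 0..1 so the full near-to-far range is used
    depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)

    # Save as an 8-bit black and white image for the --depth input
    Image.fromarray((depth * 255).astype(np.uint8), mode="L").save("depth_bw.jpg")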
This Marigold link is interesting, I'll take a look; it could potentially become the default one, and I like that it's built on diffusers.
It's very GPU heavy; if I remember correctly, more than 12GB for one 1280x720 image. But it's probably the best depth estimation out there.
My first guess for why there was almost no motion is that the output maybe wasn't normalized to 0 and 1, or the "wrong" data type was being read?
When you open the image, is it black and white, or does it have that color palette from their showcase? DepthFlow expects a black and white image.
I used the BW depth estimation output, but needed to convert the PNG to JPG.
Just one last thing: is there any chance to input more than one image / a whole folder?
It's not that hard to automate; I've done it in some proprietary projects. You could, for example, create your own class in DepthFlow's __main__.py file:
from pathlib import Path  # (if not already imported in the file)
import PIL.Image

class CustomDepthFlow(DepthFlowScene):
    def pipeline(self):
        ...
    # other functions
and then automate it with
for image in Path("/folder/of/images").glob("*"):
    # Skip non images
    if image.suffix not in (".jpg", ".png"):
        continue
    # Create scene and export to video
    depthflow = CustomDepthFlow()
    depthflow.parallax(image=image)
    width, height = PIL.Image.open(image).size
    depthflow.main(
        ssaa=1.25,
        width=width,
        height=height,
        quality="ultra",
        output=image.with_suffix(".mp4"),
    )
(Ideally do this in some other file and import the DepthFlow package like I did; the above is a hacky solution.)
12GB for one 1280x720 image.
Ouch 😅, maybe half precision could help, thanks for letting me know of its existence btw
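For reference, half precision in diffusers is usually just a torch_dtype argument at load time; a minimal sketch (the checkpoint id, and whether Marigold's pipeline accepts being loaded this way, are assumptions on my part):

    import torch
    from diffusers import DiffusionPipeline

    # Hypothetical: load the depth pipeline in fp16 to roughly halve VRAM usage
    pipe = DiffusionPipeline.from_pretrained(
        "prs-eth/marigold-v1-0",    # checkpoint name assumed, check the Marigold page
        torch_dtype=torch.float16,  # half precision weights
    ).to("cuda")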
Maybe one last thing - when I run broken depthflow parallax --image keywords_2.jpg --depth keywords_2_pred.jpg main -r -o /mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/DepthFlow/output.mp4 -w 1280 -h 768, does it use CUDA by default, given that I installed the repo using broken depthflow poe cuda?
12GB for one 1280x720 image.
Ouch 😅, maybe half precision could help, thanks for letting me know of its existence btw
Sure, another recent one that is far superior according to metrics is https://zhyever.github.io/patchfusion/ but I haven't tried it
does it use CUDA by default, given that I installed the repo using broken depthflow poe cuda?
Yes, whenever CUDA is available, as seen in this line, it gets used automatically.
The first time you run broken depthflow it defaults to CPU mode (I wonder if I should make CUDA the default?); then running broken depthflow poe cuda once uninstalls the CPU PyTorch and installs the CUDA one in the virtual environment (a hacky way, torch packaging is really painful: same package name, multiple "versions").
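I don't have the referenced line at hand, but the usual PyTorch pattern for that kind of automatic selection is roughly this (a sketch, not necessarily DepthFlow's exact code):

    import torch

    # Use the GPU whenever a CUDA build of torch is installed and a device is visible,
    # otherwise fall back to CPU (which is what the default install gives you)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Depth estimation will run on: {device}")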
Great. Well, I prefer a standard conda env file (installing through a yaml file) over using poetry for CUDA-based projects... all the handling of torch CUDA dependencies is much easier that way.
Thank you for this nice implementation and the help!
Sure, another recent one that is far superior according to metrics is https://zhyever.github.io/patchfusion/ but I haven't tried it
Just tried it, and LOL, there go 11 GB of my 3060 for that default 4k mountains image and about a minute to process, nearly running out of memory; the depth result follows.
I don't know why it's grainy, I used full precision on torch... probably the image is too big? who knows heh
Given the VRAM usage and time, I think ZoeDepth is the most practical one to be the default; I'll look forward to having options for alternative depth estimators though.
Great. Well, I prefer a standard conda env file (installing through a yaml file) over using poetry for CUDA-based projects... all the handling of torch CUDA dependencies is much easier that way.
Thanks for the feedback about handling the dependencies!
I don't know enough about conda to implement it or to know its benefits, tbh. I remember having more luck with it on Windows long ago (early Python 3 era, mostly NumPy); nowadays stuff just works, and binaries exist on PyPI for most of the blob packages.
If there's demand for it or it proves useful enough I can try integrating it as an alternative; beware that I'll need path dependencies for the monorepo, and I don't know if that's supported in the yaml spec file.
Just talking to myself about what I said earlier: "I recommend using -q ultra --ssaa 1.5 for better quality, beware it's heavy on the GPU."
I hope that with the new texture class rewrite I'll be able to do multi-passes and store previous computations of the parallax displacement to be reused, potentially saving a lot of GPU compute!
# In and out dolly zoom
# self.parallax_dolly = 0.5*(1 + math.cos(self.time))
# self.parallax_dolly = 0.9*(1 + math.cos(self.time))
# # Infinite 8 loop shift
# self.parallax_x = 0.1 * math.sin( self.time)
# self.parallax_y = 0.1 * math.sin(2*self.time)
# # Oscillating rotation
# self.camera.rotate(
# direction=self.camera.base_z,
# angle=math.cos(self.time)*self.dt*0.4
# )
# # Zoom out on the start
# self.parallax_zoom = 0.6 + 0.4*(2/math.pi)*math.atan(3*self.time)
I have tried commenting out all of that code, but the default movement path is still present... isn't poetry copying the file into the env to use it, or something?
If I understand it correctly, by changing these parameters I should be able to control the strength of the zoom, etc., while the equations in update control the movement:
# Parallax parameters
parallax_fixed = field(default=True)
parallax_height = field(default=0.2)
parallax_focus = field(default=1.0)
parallax_zoom = field(default=0.5)
parallax_isometric = field(default=0.0)
parallax_dolly = field(default=0.0)
parallax_x = field(default=0.0)
parallax_y = field(default=0.0)
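As a hypothetical illustration of that distinction (strength parameters vs. motion equations), an update() override with constant values and no time-dependent terms would give a static framing whose look depends only on the parameters above (the subclass name is made up):

    class StillFrame(DepthFlowScene):  # hypothetical subclass name
        def update(self):
            # Constant values, no time-dependent terms -> a static framing
            self.parallax_height = 0.2   # strength of the displacement
            self.parallax_zoom   = 0.8   # how much of the scene is visible
            self.parallax_dolly  = 0.0   # no dolly movement
            self.parallax_x      = 0.0   # no horizontal shift
            self.parallax_y      = 0.0   # no vertical shift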
I recommend using -q ultra --ssaa 1.5 for better quality, beware it's heavy on the GPU.
Btw, when I run broken depthflow parallax --image keywords_6.jpg --depth keywords_6_pred.jpg main -r -o /mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/DepthFlow/output.mp4 -w 1280 -h 768 -q ultra --ssaa 1.5 with a 1280x768 image as input, I see only 76MB in 'watch nvidia-smi', so it's not particularly GPU heavy.
It's unlikely that poetry is messing with the files; I commented those lines in DepthFlow.py and get no movement at all in the builtin realtime window.
If you're on the custom class, you should override update(self) (I wrote the wrong method in an earlier comment).
My best guess is that the latter is your case, and the default update implementation was probably still there due to inheritance, but it can also be the wrong file path you're editing if you cloned twice (it happens sometimes).
Your __main__.py should look something like:
from DepthFlow import *

class CustomDepthFlow(DepthFlowScene):
    def update(self):
        # Dolly zoom example
        self.parallax_dolly = 10 * math.atan(self.time)
        ...

def main():
    depthflow = CustomDepthFlow()

    # The automation thing, if run with BATCH=1 depthflow
    if os.environ.get("BATCH", "0") == "1":
        for image in Path("/folder/of/images").glob("*"):
            ...
        return

    # Realtime window or CLI
    with BrokenProfiler("DEPTHFLOW"):
        DEPTHFLOW.welcome()
        depthflow.cli(sys.argv[1:])

if __name__ == "__main__":
    main()
With a 1280x768 image as input I see only 76MB in 'watch nvidia-smi', so it's not particularly GPU heavy.
Yes, it's not heavy on VRAM, but it really hits the pixel shader circuitry ("rasterization"): when I'm rendering lots of 1440p or 4k images in parallel with some super sampling and really low displacement steps (high quality), the texture accesses per second easily hit the trillions. Somehow the current shader is still enough to run at 4k60 on the "ultra" quality in real time on my RTX 3060, and even 8k30 (not that the latter is practical lol).
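As a rough back-of-the-envelope check of that order of magnitude (the number of displacement steps per sample is an assumed value, everything else uses the numbers above):

    # Rough estimate of texture reads per second for a single 4k60 render
    width, height = 3840, 2160
    fps           = 60
    ssaa          = 1.5     # super sampling factor per axis
    steps         = 200     # assumed displacement steps per sample (not from the repo)

    reads_per_second = width * height * (ssaa ** 2) * steps * fps
    print(f"{reads_per_second:.2e}")  # ~2.2e11; several renders in parallel push toward the trillions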
It's not documented yet and a bit technical (I have yet to make some form of infographic), but since you'll be dealing with the dolly zoom and the parameters: there are dolly, isometric and zoom parameters.
Zoom multiplies the screen projection plane (center of the screen is 0, top of the screen is 1, bottom is -1) by this amount ("how far you can see of the scene"), so 0.5 shows a quarter of it, zoomed on the center.
Dolly is a distance, in units, of how far "back" from the screen the camera's ray projection starts (range 0 to infinity), as a more "natural", indirect way of specifying the isometric factor.
Isometric is a number that controls the size of the ray-origin projection plane relative to the "screen" plane: its sides are equal to fov*isometric, while the screen one's are just fov; a value of 1 is equivalent to dolly=VeryBig.
(I'm considering having just Dolly and making isometric set dolly=tan(pi*isometric/2), but that's a future thing, just writing out ideas.)
Edit: Now that I think of it, dolly might mess up raymarching scenes; I remembered why both coexist.
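As a plain-math illustration of that last idea (the tan() mapping is only being considered above, not current behaviour):

    import math

    # Possible isometric -> dolly mapping mentioned above
    def dolly_from_isometric(isometric: float) -> float:
        return math.tan(math.pi * isometric / 2)

    print(dolly_from_isometric(0.0))   # 0.0  -> rays start right at the screen plane
    print(dolly_from_isometric(0.5))   # 1.0  -> camera pulled one unit back
    print(dolly_from_isometric(0.99))  # ~63.7 -> approaches dolly=VeryBig, i.e. isometric=1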
...but it can also be the wrong file path you're editing if you cloned twice (it happens sometimes)
Thank you, it works like a charm! I was editing the same file from the https://github.com/BrokenSource/DepthFlow repo I cloned, and didn't realize that the wrapper downloads BrokenSource/Projects/DepthFlow with the reference python files... well, falling into the routine without thinking. Thank you once again, it's very nice and quick!