alembics / disco-diffusion

Does the 3D camera steer toward the darkest shade in the depth map? #54

Closed MiLO83 closed 2 years ago

MiLO83 commented 2 years ago

It seems to make sense: a kind of collision detection made easy. If it doesn't already work this way, or offer an option to, it should be easy to implement?

voodoohop commented 2 years ago

I came up with a little bit of code to test this idea. I implemented it in a version of VQGAN+CLIP animations but it should be easy to port to DD.

https://colab.research.google.com/github/pollinations/hive/blob/main/notebooks/2%20Text-To-Video/1%20CLIP-Guided%20VQGAN%203D%20Turbo%20Zoom.ipynb

You can see the effect here: https://www.youtube.com/watch?v=B6rSSEwkLMg. I'm sure it can be improved.

It calculates a center of mass by averaging all pixel positions, weighted by depth, so the result is pulled toward the region that is furthest away. (You can also change it to do the opposite and follow the closest region.)

I modified https://github.com/voodoohop/disco-diffusion/blob/main/disco_xform_utils.py to return the normalized depth map to make this work.
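
The inversion `1.0 - depth_map` used for the "follow_closest" preset only yields non-negative weights if the depth values lie in [0, 1], which is why the normalized map matters. A minimal sketch of one way to get there, assuming plain min-max scaling; `normalize_depth` is a hypothetical helper and the fork's actual normalization may differ:

```python
import numpy as np

def normalize_depth(depth, eps=1e-8):
    """Min-max scale a depth array to [0, 1].

    Hypothetical helper for illustration, not the fork's actual code.
    """
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi - lo < eps:
        # Flat map: no depth variation, so give every pixel the same weight.
        return np.full_like(depth, 0.5)
    return (depth - lo) / (hi - lo)
```

Given a depth map in that range, the camera rotation is derived from it like this: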

```python
if depth_map is not None:
  center_mass = get_center_mass(depth_map, camera_preset_name)
  rotation_3d_x = (-1.0 * center_mass[0]) * 0.1
  rotation_3d_y = center_mass[1] * 0.1
  print("center_mass", center_mass)
```

```python
# Tool to follow the furthest (or closest) place in the depth map using rotation.
# Could be done much more efficiently with numpy.
def get_center_mass(depth_map, preset_name):
  center_mass = [0, 0]
  total_mass = 0
  if preset_name == "follow_closest":
    # Invert so that near pixels carry the most weight.
    depth_map = 1.0 - depth_map
  for y in range(depth_map.shape[0]):
    for x in range(depth_map.shape[1]):
      center_mass[0] += y * depth_map[y, x]
      center_mass[1] += x * depth_map[y, x]
      total_mass += depth_map[y, x]

  if total_mass == 0:
    # No weight anywhere (e.g. an all-zero map): return zero so the camera does not rotate.
    return [0.0, 0.0]

  # Rescale the weighted mean position from pixel coordinates to roughly [-1, 1].
  center_mass = [center_mass[0] / total_mass / depth_map.shape[0] * 2 - 1,
                 center_mass[1] / total_mass / depth_map.shape[1] * 2 - 1]
  return center_mass
```

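
As the comment in the code notes, the double loop can be replaced with NumPy. A vectorized sketch of the same calculation, assuming `depth_map` is a 2D array with non-negative values where larger means further away; `get_center_mass_np` is a hypothetical name, not something from the notebook:

```python
import numpy as np

def get_center_mass_np(depth_map, preset_name=None):
    """Depth-weighted center of mass as [y, x], each scaled to roughly [-1, 1]."""
    weights = np.asarray(depth_map, dtype=np.float64)
    if preset_name == "follow_closest":
        weights = 1.0 - weights  # invert so near pixels weigh more
    total = weights.sum()
    if total == 0:
        return [0.0, 0.0]  # no weight anywhere: zero offset, so no rotation
    height, width = weights.shape
    # Row sums weight each y index; column sums weight each x index.
    mean_y = (weights.sum(axis=1) * np.arange(height)).sum() / total
    mean_x = (weights.sum(axis=0) * np.arange(width)).sum() / total
    return [float(mean_y / height * 2 - 1), float(mean_x / width * 2 - 1)]

# Synthetic 4x4 map with the "far" region along the right edge.
depth = np.zeros((4, 4))
depth[:, 3] = 1.0
print(get_center_mass_np(depth))  # [-0.25, 0.5]
```

Because the mean position is divided by the size rather than size minus one, a uniform axis lands at -1/size instead of exactly 0 (the -0.25 above), which matches the loop version. At typical frame sizes the offset is negligible.
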
entmike commented 2 years ago

You're a legend for sharing this! It is exactly what I was wondering whether it was possible to implement. Kind of a 3D autopilot. Thanks!