SUDO-AI-3D / zero123plus

Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
Apache License 2.0

How to complete 3D reconstruction #11

Open wander2017 opened 8 months ago

wander2017 commented 8 months ago

How to complete 3D reconstruction? The image results cannot be reconstructed using COLMAP. Can you mention some of the fake gestures?

eliphatfs commented 8 months ago

What do you mean by fake gestures? Classical 3D reconstruction methods usually do not work well with such sparse views; another model is usually needed to reconstruct a mesh. For example, you can use narrow-baseline 0123 + SparseNeuS from One-2-3-45 (https://github.com/One-2-3-45/One-2-3-45) for 3D reconstruction.

wander2017 commented 8 months ago

Thank you for your reply. I will try it.

wander2017 commented 8 months ago

One2345_training_pose.json

wander2017 commented 8 months ago

I want to get a pose file like One2345_training_pose.json to complete 3D reconstruction.

Ratinod commented 8 months ago

Test showing what results can be achieved using only 7 images in Instant-NGP:

https://github.com/SUDO-AI-3D/zero123plus/assets/12137233/005bb9db-1d83-4c5f-ae14-495b7b83e27f

https://github.com/SUDO-AI-3D/zero123plus/assets/12137233/40fa3472-c002-406d-9950-e5f20730f459

https://github.com/SUDO-AI-3D/zero123plus/assets/12137233/126b7139-5a6d-4e95-ab12-4f203022eb9d

https://github.com/SUDO-AI-3D/zero123plus/assets/12137233/1cbbc395-f590-41e1-8ae7-34bdcc3dfb78

Obviously, more images are needed to achieve better and more accurate results. But considering that this is only 7 images, the result is impressive.

Here are the contents of the "transforms.json" file that I got, in case anyone wants to try:

{
  "camera_angle_x": 0.03812197129202137,
  "camera_angle_y": 0.0140774200158162,
  "fl_x": 320.719333182715,
  "fl_y": 320.1815001140895,
  "k1": -0.01956388203737949,
  "k2": 0.110182871165539e-05,
  "p1": -0.05669583599777802,
  "p2": -0.0175356537000764e-05,
  "cx": 160.9954396730627,
  "cy": 160.2710416469586,
  "w": 320.0,
  "h": 320.0,
  "aabb_scale": 16,
  "scale": 2.7,
  "frames": [
    {
      "file_path": "./images/image0.png",
      "sharpness": 27.732116033649934,
      "transform_matrix": [
        [0.9999500513076782, 0.004810694605112076, 0.008763475343585014, 0.010515931993722916],
        [0.008762343786656857, 0.0002564578899182379, -0.9999614953994751, -1.1999539136886597],
        [-0.004812757018953562, 0.9999883770942688, 0.00021429210028145462, 0.00025705056032165885],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    {
      "file_path": "./images/image1.png",
      "sharpness": 27.794309310423316,
      "transform_matrix": [
        [0.861971378326416, -0.24816764891147614, 0.442061185836792, 0.5305169224739075],
        [0.5069567561149597, 0.42257794737815857, -0.751280665397644, -0.9016107320785522],
        [-0.0003617570619098842, 0.8716884255409241, 0.49006035923957825, 0.5878257155418396],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    {
      "file_path": "./images/image2.png",
      "sharpness": 27.52245648391413,
      "transform_matrix": [
        [-0.008050278760492802, 0.34757664799690247, 0.9376169443130493, 1.1250077486038208],
        [0.9999675750732422, 0.0030544146429747343, 0.007453335449099541, 0.008942931890487671],
        [-0.00027326561394147575, 0.9376466274261475, -0.3475899398326874, -0.41733309626579285],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    {
      "file_path": "./images/image3.png",
      "sharpness": 27.58521978950844,
      "transform_matrix": [
        [-0.8660249710083008, -0.2501213252544403, 0.4329434335231781, 0.5195764899253845],
        [0.4999733865261078, -0.42414551973342896, 0.7550675868988037, 0.9061554074287415],
        [-0.005227487534284592, 0.870367705821991, 0.49237459897994995, 0.5906028747558594],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    {
      "file_path": "./images/image4.png",
      "sharpness": 27.95186379777407,
      "transform_matrix": [
        [-0.8700066804885864, -0.1741916537284851, -0.46124354004859924, -0.5534254312515259],
        [-0.4930136799812317, 0.2977108061313629, 0.8174996376037598, 0.9808858036994934],
        [-0.0050844247452914715, 0.9386296272277832, -0.34488925337791443, -0.4140926003456116],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    {
      "file_path": "./images/image5.png",
      "sharpness": 27.08037798388144,
      "transform_matrix": [
        [-0.008049866184592247, 0.4948376417160034, -0.8689481019973755, -1.0428248643875122],
        [-0.9999675750732422, -0.003890044754371047, 0.007048365660011768, 0.008458686992526054],
        [0.00010754966933745891, 0.8689767122268677, 0.4948529005050659, 0.5935773849487305],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    {
      "file_path": "./images/image6.png",
      "sharpness": 27.732116033649934,
      "transform_matrix": [
        [0.866024911403656, -0.17087915539741516, -0.4698948264122009, -0.5638080835342407],
        [-0.49997571110725403, -0.3053785562515259, -0.8104123473167419, -0.9723791480064392],
        [-0.005013220012187958, 0.9367733001708984, -0.3499008119106293, -0.4201057553291321],
        [0.0, 0.0, 0.0, 1.0]
      ]
    }
  ]
}
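For anyone who would rather script this than hand-edit the file, here is a minimal sketch of generating such a `transforms.json`. Assumptions (not from the thread): the OpenGL/Instant-NGP camera convention (camera looks down -Z), cameras placed on a sphere looking at the origin, and placeholder view angles and radius — substitute the poses you actually rendered:

```python
import json
import numpy as np

def pose_spherical(elevation_deg, azimuth_deg, radius):
    """Camera-to-world matrix, OpenGL convention (camera looks down -Z, up is +Y)."""
    el, az = np.radians(elevation_deg), np.radians(azimuth_deg)
    # Camera position on a sphere around the origin.
    pos = radius * np.array([np.cos(el) * np.sin(az),
                             -np.cos(el) * np.cos(az),
                             np.sin(el)])
    z = pos / np.linalg.norm(pos)            # backward axis (away from origin)
    x = np.cross(np.array([0.0, 0.0, 1.0]), z)
    x /= np.linalg.norm(x)                   # right axis
    y = np.cross(z, x)                       # up axis
    m = np.eye(4)
    m[:3, 0], m[:3, 1], m[:3, 2], m[:3, 3] = x, y, z, pos
    return m

# Placeholder views: 6 cameras at 30 degrees elevation, 60-degree azimuth steps.
views = [(30, a) for a in range(0, 360, 60)]
frames = [{"file_path": f"./images/image{i}.png",
           "transform_matrix": pose_spherical(e, a, 1.2).tolist()}
          for i, (e, a) in enumerate(views)]

out = {"camera_angle_x": 0.857, "w": 320, "h": 320,
       "aabb_scale": 16, "frames": frames}
with open("transforms.json", "w") as f:
    json.dump(out, f, indent=2)
```

The angles, radius, and intrinsics above are illustrative only; the rotation part of each matrix should come out orthonormal regardless of the values you plug in.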

Be sure to replace the background in the pictures with transparency.
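If you want to do that background step programmatically, a simple near-white threshold over an RGBA array is often enough for clean renders (a heuristic sketch, not from the thread; the threshold is a guess, and harder images may need a real matting tool):

```python
import numpy as np

def white_to_transparent(rgba, thresh=250):
    """Set alpha to 0 wherever all RGB channels are near-white.

    rgba: uint8 array of shape (H, W, 4).
    """
    out = rgba.copy()
    background = (out[..., :3] >= thresh).all(axis=-1)
    out[background, 3] = 0
    return out

# Usage with Pillow (load as RGBA, process, save as PNG):
# img = np.array(Image.open("image0.png").convert("RGBA"))
# Image.fromarray(white_to_transparent(img)).save("image0.png")
```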

botaoye commented 8 months ago

> Test showing what results can be achieved using only 7 images in Instant-NGP: [...]
> Here are the contents of the "transforms.json" file that I got, in case anyone wants to try: [...]

Hi, how could you generate this "transforms.json" file, especially the camera intrinsic parameters?

Ratinod commented 8 months ago

> Hi, how could you generate this "transforms.json" file, especially the camera intrinsic parameters?

In this particular case, the file was made partly by hand. I took a "transforms.json" file generated by colmap2nerf.py and started changing it. First I generated the transform matrices using Blender, following the instructions at https://github.com/SUDO-AI-3D/zero123plus#camera-poses: I rotated the plane as described there and then read out the matrix with bpy.context.selected_objects[0].matrix_world.

Then I hand-tuned the remaining parameters ("fl_x", "fl_y", "cx", "cy", "w", "h", "scale"). They are most likely not perfect, but they are good enough for it to work.
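For context when hand-tuning: colmap2nerf.py derives the field-of-view fields from the focal lengths with the standard pinhole relation, so you can keep them mutually consistent like this (note that the hand-picked camera_angle_x in the file above was tuned independently and will not match this formula; as far as I know Instant-NGP reads fl_x/fl_y directly when they are present):

```python
import math

def angles_from_focal(fl_x, fl_y, w, h):
    """Horizontal and vertical FoV in radians from focal lengths in pixels."""
    return 2 * math.atan(w / (2 * fl_x)), 2 * math.atan(h / (2 * fl_y))

# Using the fl_x/fl_y/w/h values from the transforms.json above:
ax, ay = angles_from_focal(320.719333182715, 320.1815001140895, 320.0, 320.0)
print(ax, ay)
```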

bananaman1983 commented 6 months ago

Try BlenderNeRF; it saves a lot of trouble: https://github.com/maximeraafat/BlenderNeRF