YvanYin / Metric3D

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
https://jugghm.github.io/Metric3Dv2/
BSD 2-Clause "Simplified" License

Calculate the distance from monocular RGB camera to object #107

Open STRIVESS opened 1 month ago

STRIVESS commented 1 month ago

Hello! Thank you for your excellent open-source projects. I am working on a grass-cutting robot that must recognize and avoid obstacles using vision. I'm using a monocular RGB camera and plan to calibrate it with OpenCV and a chessboard pattern to obtain the camera's intrinsic matrix. I have previously worked with the Depth-Anything algorithm, which gives me depth values for each pixel of the captured frames, and I'm now considering trying Metric3D. Once I have the intrinsic matrix, could you please guide me on how to calculate the actual distance from the camera to an object? Thanks for your help!


I want to use the code below to calculate the camera's intrinsic matrix:

```python
import cv2
import numpy as np
import glob

# Inner-corner count of the chessboard (columns, rows) and the physical
# square size. Use the real square size (e.g. in meters) if you want the
# extrinsics in metric units; the intrinsic matrix itself is unaffected.
chessboard_size = (9, 6)
square_size = 1.0

# 3D coordinates of the chessboard corners in the board's own frame (z = 0).
objp = np.zeros((chessboard_size[0] * chessboard_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:chessboard_size[0], 0:chessboard_size[1]].T.reshape(-1, 2)
objp *= square_size

objpoints = []  # 3D points in the board frame
imgpoints = []  # corresponding 2D points in the image
image_size = None

images = glob.glob('calibration_images/*.jpg')

for image_file in images:
    img = cv2.imread(image_file)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_size = gray.shape[::-1]  # (width, height)

    ret, corners = cv2.findChessboardCorners(gray, chessboard_size, None)

    if ret:
        # Refine the detected corners to sub-pixel accuracy before storing.
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))
        objpoints.append(objp)
        imgpoints.append(corners)

        cv2.drawChessboardCorners(img, chessboard_size, corners, ret)
        cv2.imshow('Chessboard Corners', img)
        cv2.waitKey(500)

cv2.destroyAllWindows()

if not objpoints:
    raise RuntimeError("No chessboard corners were found in any image.")

ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, image_size, None, None)

print("Camera matrix:\n", mtx)
print("Distortion coefficients:\n", dist)
np.savez('calibration_data.npz', camera_matrix=mtx, dist_coeffs=dist,
         rvecs=rvecs, tvecs=tvecs)
```

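As a side note, once the script above has run, the saved `.npz` can be read back to pull the focal lengths and principal point out of the camera matrix. A minimal sketch (the file name and array key match the `np.savez` call above; everything else is an assumption about how you want to use it):

```python
import numpy as np

def load_intrinsics(path):
    """Read fx, fy, cx, cy from a calibration .npz saved as above.

    OpenCV's camera matrix layout is
        [[fx,  0, cx],
         [ 0, fy, cy],
         [ 0,  0,  1]].
    """
    K = np.load(path)['camera_matrix']
    return K[0, 0], K[1, 1], K[0, 2], K[1, 2]
```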
Owen-Liuyuxuan commented 1 month ago

I believe you could try the code at https://github.com/YvanYin/Metric3D/blob/main/hubconf.py#L174 .

```python
canonical_to_real_scale = intrinsic[0] / 1000.0  # 1000.0 is the focal length of the canonical camera
pred_depth = pred_depth * canonical_to_real_scale  # now the depth is metric
pred_depth = torch.clamp(pred_depth, 0, 300)
```
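To get an actual camera-to-object distance from that metric depth map, you can back-project a pixel through the pinhole model using your calibrated intrinsics. This is not from the repo, just the standard geometry; `point_from_depth` and `euclidean_distance` are names I made up, and `z` is assumed to be the predicted depth along the optical axis:

```python
import numpy as np

def point_from_depth(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth z (measured along the
    optical axis) into a 3D point in the camera frame, pinhole model."""
    x = (u - cx) / fx * z
    y = (v - cy) / fy * z
    return np.array([x, y, z])

def euclidean_distance(u, v, z, fx, fy, cx, cy):
    """Straight-line distance from the camera center to the object pixel."""
    return float(np.linalg.norm(point_from_depth(u, v, z, fx, fy, cx, cy)))
```

At the principal point the distance equals the depth itself; for off-axis pixels the straight-line distance is slightly larger than the depth value.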
STRIVESS commented 1 month ago

> I believe you could try the code at https://github.com/YvanYin/Metric3D/blob/main/hubconf.py#L174 .
>
> ```python
> canonical_to_real_scale = intrinsic[0] / 1000.0  # 1000.0 is the focal length of the canonical camera
> pred_depth = pred_depth * canonical_to_real_scale  # now the depth is metric
> pred_depth = torch.clamp(pred_depth, 0, 300)
> ```

Thanks for your help! I'll try it.