modelscope / richdreamer

Live Demo: https://modelscope.cn/studios/Damo_XR_Lab/3D_AIGC
https://aigc3d.github.io/richdreamer/
Apache License 2.0

camera pose format #9

Closed apchenstu closed 5 months ago

apchenstu commented 5 months ago

Hi folks, thank you very much for your hard work in making this dataset available! I'm excited to try it out. The format of the camera poses is not clear to me. Could you please provide a description of them (for example, of xxx.json)? Are they in the OpenCV or the Blender coordinate system? And how do I convert them to extrinsics given the scale and offset? It would be best if the authors could provide a data loader for them.

lingtengqiu commented 5 months ago

The 3D coordinate definitions are quite involved, so it is hard to state concisely which camera system is used. Fortunately, what we ultimately need is to map the normals into the NormalBae system, as the following figure illustrates:

[Figure: definition of the image-space axes U, V and the camera-view axes x, y, z used for the NormalBae normals]

Here U and V denote the image-space width and height axes (OpenCV convention), and xyz denotes the camera-view coordinate system.

lingtengqiu commented 5 months ago

Objaverse renderings mainly use one mainstream convention: a Blender-based system.

For the Blender-based system, the rendering code can be found in Zero123 or MVSObjaverse.

Given a Blender-based system, we can use the following code to transfer normals from Blender world space to the NormalBae camera space.

import json

import cv2
import numpy as np


def blender2midas(img):
    '''Blender: rub
    midas: lub
    '''
    # flip all three axes in place
    img[..., 0] = -img[..., 0]
    img[..., 1] = -img[..., 1]
    img[..., -1] = -img[..., -1]
    return img
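
# Note: blender2midas modifies its input array in place as well as returning it,
# so pass a copy if the Blender-space normals are still needed afterwards, e.g.
#   midas_normal = blender2midas(world_normal.copy())
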

def read_camera_matrix_single(json_file):
    with open(json_file, 'r', encoding='utf8') as reader:
        json_content = json.load(reader)

    # alternative with y and z negated (kept for reference, not used):
    '''
    camera_matrix = np.eye(4)
    camera_matrix[:3, 0] = np.array(json_content['x'])
    camera_matrix[:3, 1] = -np.array(json_content['y'])
    camera_matrix[:3, 2] = -np.array(json_content['z'])
    camera_matrix[:3, 3] = np.array(json_content['origin'])
    '''

    # assumed to be the correct convention: columns are the camera axes,
    # the last column is the camera origin (a camera-to-world matrix)
    camera_matrix = np.eye(4)
    camera_matrix[:3, 0] = np.array(json_content['x'])
    camera_matrix[:3, 1] = np.array(json_content['y'])
    camera_matrix[:3, 2] = np.array(json_content['z'])
    camera_matrix[:3, 3] = np.array(json_content['origin'])

    return camera_matrix
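
# For reference, a hypothetical example of the per-view JSON fields consumed by
# read_camera_matrix_single. The key names come from the code above; the values
# here are made up. 'x', 'y', 'z' are presumably the camera axis directions in
# world space, and 'origin' the camera position.
example_json_content = {
    "x": [1.0, 0.0, 0.0],
    "y": [0.0, 1.0, 0.0],
    "z": [0.0, 0.0, 1.0],
    "origin": [0.0, -2.0, 0.5],
}
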
normal = cv2.imread(normal_path)
# BGR -> RGB so that the channels correspond to x, y, z
normal = normal[..., ::-1]
world_normal = (normal.astype(np.float32) / 255. * 2.) - 1

cond_c2w = read_camera_matrix_single(camera_json)
# rotate world-space normals into camera space, then remap axes to the midas convention
view_cn = blender2midas(world_normal @ cond_c2w[:3, :3])
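
A note on the last line: cond_c2w[:3, :3] is the camera-to-world rotation R, and for row vectors n @ R equals R^T applied to n, so this multiply rotates the world-space normals into camera space (R^T is the world-to-camera rotation when R is orthonormal). A minimal sanity check of that identity, with a made-up rotation:

# check that n @ R == (R.T @ n.T).T for an orthonormal rotation R
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.],
              [np.sin(theta),  np.cos(theta), 0.],
              [0.,             0.,            1.]])
n = np.array([[0.2, -0.5, 0.8]])  # a row-vector "normal"
assert np.allclose(n @ R, (R.T @ n.T).T)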


lingtengqiu commented 5 months ago

However, our rendering system is defined in a Unity-based coordinate system.

This raises a question: how do we plug into Blender's coordinate pipeline directly, without introducing yet another coordinate system?

We keep the world-to-camera transform in the Blender convention and first convert the Unity-based world normals into the Blender-based system.

This gives the final mapping code.

import json

import cv2
import numpy as np


def read_camera_matrix_single(json_file):
    with open(json_file, 'r', encoding='utf8') as reader:
        json_content = json.load(reader)

    # alternative with y and z negated (kept for reference, not used):
    '''
    camera_matrix = np.eye(4)
    camera_matrix[:3, 0] = np.array(json_content['x'])
    camera_matrix[:3, 1] = -np.array(json_content['y'])
    camera_matrix[:3, 2] = -np.array(json_content['z'])
    camera_matrix[:3, 3] = np.array(json_content['origin'])
    '''

    # assumed to be the correct convention: columns are the camera axes,
    # the last column is the camera origin (a camera-to-world matrix)
    camera_matrix = np.eye(4)
    camera_matrix[:3, 0] = np.array(json_content['x'])
    camera_matrix[:3, 1] = np.array(json_content['y'])
    camera_matrix[:3, 2] = np.array(json_content['z'])
    camera_matrix[:3, 3] = np.array(json_content['origin'])

    return camera_matrix

def unity2blender(normal):
    # remap Unity axes to Blender axes: (x, y, z) -> (-z, -x, y)
    normal_clone = normal.copy()
    normal_clone[..., 0] = -normal[..., -1]
    normal_clone[..., 1] = -normal[..., 0]
    normal_clone[..., 2] = normal[..., 1]

    return normal_clone
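
# Sanity check: Unity is Y-up and Blender is Z-up, so Unity's up axis (0, 1, 0)
# should land on Blender's up axis (0, 0, 1) under the remap above:
#   assert np.allclose(unity2blender(np.array([0., 1., 0.])), [0., 0., 1.])
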

def blender2midas(img):
    '''Blender: rub
    midas: lub
    '''
    # flip all three axes in place
    img[..., 0] = -img[..., 0]
    img[..., 1] = -img[..., 1]
    img[..., -1] = -img[..., -1]
    return img

normal = cv2.imread(normal_path)
# BGR -> RGB so that the channels correspond to x, y, z
normal = normal[..., ::-1]
world_normal = (normal.astype(np.float32) / 255. * 2.) - 1
world_normal = unity2blender(world_normal)

cond_c2w = read_camera_matrix_single(camera_json)
# rotate Blender world-space normals into camera space, then remap to the midas convention
view_cn = blender2midas(world_normal @ cond_c2w[:3, :3])

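To eyeball the result, a minimal sketch (with a hypothetical output path) that re-encodes the view-space normals back into an 8-bit image by inverting the decode step above:

# map normals from [-1, 1] back to [0, 255]; cv2 expects BGR, so reverse the channels
view_img = ((view_cn + 1.) / 2. * 255.).clip(0, 255).astype(np.uint8)
cv2.imwrite("view_normal_vis.png", view_img[..., ::-1])  # hypothetical output path
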
lingtengqiu commented 5 months ago

We provide example code to visualize the coordinate-system transfer from world space to the NormalBae system; see: https://github.com/modelscope/richdreamer/tree/main/dataset/gobjaverse

apchenstu commented 5 months ago

Many thanks for the quick response, now it works for me ;)