Closed: tijiang13 closed this issue 1 month ago
Hello @tijiang13, please refer to Table 6 of the paper for the normal evaluation metrics. We compute the mean angular error and the % of pixels within t degrees.
Cameras were randomly sampled within a fixed distance range, and the same set of images was used to evaluate all methods. A perspective camera was used to generate the images from the THuman2.0 dataset.
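For concreteness, the two metrics could be computed per foreground pixel along these lines (a minimal sketch; the thresholds, masking, and averaging choices here are my assumptions, not necessarily the paper's exact protocol):

```python
import numpy as np

def normal_metrics(pred, gt, mask, thresholds=(11.25, 22.5, 30.0)):
    """Mean angular error (degrees) and % of pixels within each threshold.

    pred, gt: HxWx3 normal maps in the camera coordinate system.
    mask: HxW boolean foreground mask.
    """
    # Normalize both normal maps to unit length
    pred = pred / (np.linalg.norm(pred, axis=-1, keepdims=True) + 1e-8)
    gt = gt / (np.linalg.norm(gt, axis=-1, keepdims=True) + 1e-8)
    # Per-pixel angular error from the cosine similarity
    cos_sim = np.clip((pred * gt).sum(axis=-1), -1.0, 1.0)
    err = np.degrees(np.arccos(cos_sim))[mask]
    mean_err = err.mean()
    within = {t: (err < t).mean() * 100.0 for t in thresholds}
    return mean_err, within
```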
Thanks, but I am wondering what the exact hyperparameters are, e.g. the distributions of the camera-object distance and the focal length -- I may also need to evaluate my own method under a similar setting.
Best, Tianjian
Hi @rawalkhirodkar,
Just to add some additional context -- here is my current setting:
from dataclasses import dataclass

@dataclass
class RandomCameraConfig:
    """Configuration for random camera pose generation.

    Notes:
        1. azimuth and elevation are in degrees; camera_distance is in meters
        2. the object is centered at the origin
        3. the camera is looking at the origin
    """
    # image resolution
    height: int = 512
    width: int = 512
    # camera parameters (in degrees)
    azimuth_range: tuple[float, float] = (-180, 180)
    elevation_range: tuple[float, float] = (-10, 10)
    camera_distance_range: tuple[float, float] = (2.0, 3.0)
    # field of view (in degrees)
    fov_range: tuple[float, float] = (30, 60)
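For reference, a sampled FoV from the config above translates to a pixel focal length via the pinhole relation f = (W / 2) / tan(FoV / 2). A minimal illustrative helper (my own sketch, not from the repo):

```python
import math
import random

def sample_intrinsics(width=512, height=512, fov_range=(30.0, 60.0)):
    """Sample a horizontal FoV (degrees) and build a 3x3 pixel intrinsics matrix."""
    fov_deg = random.uniform(*fov_range)
    # Pinhole relation: f = (W / 2) / tan(FoV / 2)
    f = 0.5 * width / math.tan(math.radians(fov_deg) / 2.0)
    cx, cy = width / 2.0, height / 2.0
    return [[f, 0.0, cx],
            [0.0, f, cy],
            [0.0, 0.0, 1.0]]
```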
This is the code we use to place a virtual camera.
import math
import random

import bpy
from mathutils import Euler, Vector

# Set render resolution (1440x1920, a 3:4 portrait aspect ratio)
bpy.context.scene.render.resolution_x = 1440
bpy.context.scene.render.resolution_y = 1920
def setup_random_camera(mesh_obj, mesh_dimensions, camera_mode):
    # Create a new camera and make it the active scene camera
    cam_data = bpy.data.cameras.new('Camera')
    cam_ob = bpy.data.objects.new('Camera', cam_data)
    bpy.context.scene.collection.objects.link(cam_ob)
    bpy.context.scene.camera = cam_ob
    print('-----------mode:{}--------'.format(camera_mode))

    # Target points on the human body, as fractions of the mesh height
    body_parts = {
        'full_body': mesh_obj.location + Vector((0.0, 0.0, mesh_dimensions.z * 0.5)),
        'face': mesh_obj.location + Vector((0.0, 0.0, mesh_dimensions.z * 0.85)),
        'upper_half': mesh_obj.location + Vector((0.0, 0.0, mesh_dimensions.z * 0.75)),
    }
    # Determine the target position based on the selected mode
    target_position = body_parts[camera_mode]

    # Sample the focal length (in mm) appropriately for each mode
    focal_lengths = {
        'full_body': random.uniform(28, 50),
        'face': random.uniform(85, 135),
        'upper_half': random.uniform(50, 85),
    }
    cam_data.lens = focal_lengths[camera_mode]

    # Distance of the camera from the target (in meters), per mode
    distances = {
        'full_body': random.uniform(1.2, 2.0),
        'face': random.uniform(1, 1.2),
        'upper_half': random.uniform(1, 1.6),
    }
    distance = distances[camera_mode] + random.uniform(-0.1, 0.1)  # Add noise to the distance
    angle = random.uniform(0, 2 * math.pi)
    height_angle = random.uniform(-math.pi / 6, math.pi / 6)  # Variability in elevation

    # Place the camera on a sphere segment around the target
    camera_location = target_position + Vector((distance * math.cos(angle) * math.cos(height_angle),
                                                distance * math.sin(angle) * math.cos(height_angle),
                                                distance * math.sin(height_angle)))

    # Adjust camera height based on mode
    height_offsets = {
        'full_body': mesh_dimensions.z * 0.1,
        'face': mesh_dimensions.z * 0.65,
        'upper_half': mesh_dimensions.z * 0.45,
    }
    camera_location.z += height_offsets[camera_mode] + random.uniform(-0.05, 0.05) * mesh_dimensions.z  # Add noise
    cam_ob.location = camera_location

    # Point the camera at the target position
    direction = target_position - camera_location
    rot_quat = direction.to_track_quat('-Z', 'Y')
    cam_ob.rotation_euler = rot_quat.to_euler()

    # Optional rotation noise (currently disabled: all angles are zero)
    rotation_noise_angles = (0, 0, 0)
    rotation_noise_euler = Euler(rotation_noise_angles, 'XYZ')
    cam_ob.rotation_euler.rotate(rotation_noise_euler)  # rotate() works in place; no assignment needed
    return cam_ob.location, cam_ob.rotation_euler
##---------------------------------------------------------------------
def setup_camera(mesh_obj, camera_mode):
    mesh_center, mesh_dimensions = get_mesh_center_and_dimensions(mesh_obj)
    loc, rot = setup_random_camera(mesh_obj, mesh_dimensions, camera_mode)
    return loc, rot
Thanks a lot! -- Best, Tianjian
Hello,
Thanks for the great work! Could you share a bit more detail on how the surface normals are evaluated on THuman 2.0? I read your replies in other issues, and it sounds like the cosine similarity between normals in the camera coordinate system is measured.
But how were the cameras sampled? I suspect the camera-object distance plays a huge role here (cf. a weak-perspective camera), especially since Sapiens has no prior knowledge of the camera intrinsics during inference.
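To make this concrete, here is a toy pinhole projection (all numbers are my own illustration): with the image scale f/d held fixed, a closer camera produces noticeably stronger foreshortening for points at different depths, which is exactly what a weak-perspective model ignores.

```python
def project_x(x, z_offset, f, d):
    # Project a point at lateral offset x (m) and depth d + z_offset (m)
    # through a pinhole camera with focal length f (px) at distance d (m).
    return f * x / (d + z_offset)

# Both cameras share the same image scale f / d = 300 px/m at the target depth,
# so the subject appears the same size; only the perspective distortion differs.
near_px = project_x(0.1, 0.1, f=450.0, d=1.5)    # close camera  -> 28.125 px
far_px = project_x(0.1, 0.1, f=4500.0, d=15.0)   # distant camera -> ~29.8 px
# A weak-perspective camera would put this point at exactly 30 px;
# the close camera deviates from that far more than the distant one.
```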
Best, Tianjian