Your dataset is really good! I'm trying to use ASE as training data for the novel view synthesis task, but I found a problem: inside the valid image circle, the edge pixels (I am not talking about the black border) are much darker than the center pixels (I have already undistorted the images), so the corresponding pixels near the edges are not view-consistent between neighboring views. Do you know how to rectify this inconsistent brightness?
This problem makes it difficult to produce good results for the novel view synthesis task on this dataset. In the rightmost column, existing Gaussian Splatting methods easily produce strange dark borders on this dataset.
Hello!
ASE is designed to produce accurate simulations of Aria output. The RGB camera has a very small fisheye lens on it, which means we also simulate the vignette of the cameras. As you might be aware, most fisheye lenses produce a pronounced variation in brightness; more information can be found here: (non-affiliated link!)
The good news is that the variation in brightness is static. It could be reduced by creating a gradient (by inverting an image similar to this) and multiplying it with the ASE image.
I hope that helps!
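For concreteness, a minimal sketch of that idea, assuming you already have a vignette reference image (the file names below are placeholders, not files from this thread). Instead of a literal photographic negative, it builds a reciprocal gain map in floating point and multiplies it with the ASE frame:

import cv2
import numpy as np

# Placeholder file names; substitute your own vignette reference and ASE frame.
vignette = cv2.imread("vignette_reference.png").astype(np.float32) / 255.0
ase_rgb = cv2.imread("ase_frame.jpg").astype(np.float32) / 255.0

# Invert the vignette into a gain map: dark corners get gain > 1, the bright centre stays near 1.
eps = 1e-3  # avoid dividing by zero in the black border
gain = 1.0 / np.clip(vignette, eps, 1.0)

# Multiply the ASE image by the gain map to flatten the brightness fall-off.
flattened = np.clip(ase_rgb * gain, 0.0, 1.0)
cv2.imwrite("ase_frame_flattened.jpg", np.uint8(flattened * 255))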
Thanks for your reply! However, I still do not know how to compute this gradient in ASE data accurately.
Hi! Is there a specific method to compute this relative illumination value for each pixel? I'm not familiar with fisheye cameras.
Let me see if I can generate a gradient, standby!
Hi! Do you have any clue about relative illumination computation?
We calculate the distortion and then apply a vignette, so the relative illumination is a function of combining a "normal" but distorted image with a vignette image.
This should re-flatten the lens-based roll-off.
The top right of the image is "up", so depending on how it's applied you might need to rotate it to line it up.
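As a hedged note on the orientation point: if your frames get rotated during preprocessing (as in the script below, which uses np.rot90(..., k=3)), the anti-vignette image may need the same rotation before it is applied. Something along these lines (the file name is a placeholder):

import cv2

anti_vignette = cv2.imread("anti_vignette.png")  # placeholder path for the provided image

# The top right of the anti-vignette is "up"; rotate it so it lines up with frames that
# have already been rotated 90 degrees clockwise. Check the result visually, since the
# required direction depends on how the correction is applied.
anti_vignette_rotated = cv2.rotate(anti_vignette, cv2.ROTATE_90_CLOCKWISE)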
Thanks for your help. I previously wrote the following code to preprocess the ASE fisheye data, including undistortion and rotation. Could you tell me which lines I should revise to correct the relative illumination?
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
import plotly.graph_objects as go
from pathlib import Path
import os
from PIL import Image
from scipy.spatial.transform import Rotation as R
from projectaria_tools.projects import ase
from projectaria_tools.core import data_provider, calibration
from projectaria_tools.core.image import InterpolationMethod
from readers import read_points_file, read_trajectory_file, read_language_file
import cv2
from tqdm import tqdm
import os, sys, json
from multiprocessing import Pool
def distance_to_depth(K, dist, uv=None):
    if uv is None and len(dist.shape) >= 2:
        # create a pixel mesh grid matching the distance map
        uv = np.stack(np.meshgrid(np.arange(dist.shape[1]), np.arange(dist.shape[0])), -1)
        uv = uv.reshape(-1, 2)
        dist = dist.reshape(-1)
    if not isinstance(dist, np.ndarray):
        import torch
        uv = torch.from_numpy(uv).to(dist)
    if isinstance(dist, np.ndarray):
        # z * np.sqrt(x_temp**2 + y_temp**2 + z_temp**2) = dist, where (x_temp, y_temp, z_temp) is the unit-depth ray
        uvh = np.concatenate([uv, np.ones((len(uv), 1))], -1)
        uvh = uvh.T  # 3, N
        temp_point = np.linalg.inv(K) @ uvh  # 3, N
        temp_point = temp_point.T  # N, 3
        z = dist / np.linalg.norm(temp_point, axis=1)
    else:
        uvh = torch.cat([uv, torch.ones(len(uv), 1).to(uv)], -1)  # N, 3
        temp_point = (torch.inverse(torch.from_numpy(K).to(uv)) @ uvh.T).T  # N, 3
        z = dist / torch.linalg.norm(temp_point, dim=1)
    return z
def transform_3d_points(transform, points):
    N = len(points)
    points_h = np.concatenate([points, np.ones((N, 1))], axis=1)
    transformed_points_h = (transform @ points_h.T).T
    transformed_points = transformed_points_h[:, :-1]
    return transformed_points
def aria_export_to_scannet(scene_id):
    src_folder = Path("/group/40033/public_datasets/3d_datasets/aria/ase_data/" + str(scene_id))
    trgt_folder = Path("/group/40033/public_datasets/3d_datasets/aria/ase_preprocessed_data/" + str(scene_id))
    trgt_folder.mkdir(parents=True, exist_ok=True)
    SCENE_ID = src_folder.stem
    print("SCENE_ID:", SCENE_ID)
    scene_max_depth = 0
    scene_min_depth = np.inf
    Path(trgt_folder, "intrinsic").mkdir(exist_ok=True)
    Path(trgt_folder, "pose").mkdir(exist_ok=True)
    Path(trgt_folder, "depth").mkdir(exist_ok=True)
    Path(trgt_folder, "color").mkdir(exist_ok=True)
    rgb_dir = src_folder / "rgb"
    depth_dir = src_folder / "depth"
    # Load camera calibration
    device = ase.get_ase_rgb_calibration()
    # Load the trajectory using read_trajectory_file()
    trajectory_path = src_folder / "trajectory.csv"
    trajectory = read_trajectory_file(trajectory_path)
    num_frames = len(list(rgb_dir.glob("*.jpg")))
    Path('./debug').mkdir(exist_ok=True)
    for frame_idx in range(num_frames):
        frame_id = str(frame_idx).zfill(7)
        rgb_path = rgb_dir / f"vignette{frame_id}.jpg"
        depth_path = depth_dir / f"depth{frame_id}.png"
        depth = Image.open(depth_path)  # uint16
        rgb = cv2.imread(str(rgb_path), cv2.IMREAD_UNCHANGED)
        depth = np.array(depth)
        scene_min_depth = min(depth.min(), scene_min_depth)
        inf_value = np.iinfo(np.array(depth).dtype).max
        depth[depth == inf_value] = 0  # consider it as invalid, inplace with 0
        T_world_from_device = trajectory["Ts_world_from_device"][frame_idx]  # camera-to-world
        assert device.get_image_size()[0] == 704
        # https://facebookresearch.github.io/projectaria_tools/docs/data_utilities/advanced_code_snippets/image_utilities
        pinhole = calibration.get_linear_camera_calibration(
            # device.get_image_size()[0],
            # device.get_image_size()[1],
            # device.get_focal_lengths()[0],
            512,
            512,
            150,
            "camera-rgb",
            device.get_transform_device_camera()  # important to get correct transformation matrix in pinhole_cw90
        )
        # distort image
        rectified_rgb = calibration.distort_by_calibration(np.array(rgb), pinhole, device, InterpolationMethod.BILINEAR)
        # raw_image = np.array(depth)  # Will not work
        depth = np.array(depth).astype(np.float32)  # WILL WORK
        rectified_depth = calibration.distort_by_calibration(depth, pinhole, device)
        rotated_image = np.rot90(rectified_rgb, k=3)
        rotated_depth = np.rot90(rectified_depth, k=3)
        increase_light = True
        if increase_light:
            rotated_image = cv2.cvtColor(rotated_image, cv2.COLOR_BGR2HSV)
            h, s, v = cv2.split(rotated_image)
            v1 = np.clip(cv2.add(1 * v, 30), 0, 255)
            rotated_image = np.uint8(cv2.merge((h, s, v1)))
            rotated_image = cv2.cvtColor(rotated_image, cv2.COLOR_HSV2BGR)
        cv2.imwrite(str(Path(trgt_folder, "color", f"{frame_id}.jpg")), rotated_image)
        # TODO: check this
        plt.imsave(Path(f"./debug/debug_undistort_{frame_id}.png"), np.uint16(rotated_depth), cmap="plasma")
        # Get rotated image calibration
        pinhole_cw90 = calibration.rotate_camera_calib_cw90deg(pinhole)
        principal = pinhole_cw90.get_principal_point()
        cx, cy = principal[0], principal[1]
        focal_lengths = pinhole_cw90.get_focal_lengths()
        fx, fy = focal_lengths
        K = np.array([  # camera-to-pixel
            [fx, 0, cx],
            [0, fy, cy],
            [0, 0, 1.0]])
        c2w = T_world_from_device
        c2w_rotation = pinhole_cw90.get_transform_device_camera().to_matrix()
        c2w_final = c2w @ c2w_rotation  # right-matmul!
        cam2world = c2w_final
        # distance-to-depth
        rotated_depth = distance_to_depth(K, rotated_depth).reshape((rotated_depth.shape[0], rotated_depth.shape[1]))
        rotated_depth = np.uint16(rotated_depth)
        cv2.imwrite(str(Path(trgt_folder, "depth", f"{frame_id}.png")), rotated_depth)  # cmap="gray", vmin=0, vmax=255
        scene_max_depth = max(scene_max_depth, float(depth.max()))
        Path(trgt_folder, "min_depth.txt").write_text(f"{scene_min_depth * 1.0 / 1000}")
        Path(trgt_folder, "max_depth.txt").write_text(f"{scene_max_depth * 1.0 / 1000}")
        Path(trgt_folder, "intrinsic", "intrinsic_color.txt").write_text(f"""{K[0][0]} {K[0][1]} {K[0][2]} 0.00\n{K[1][0]} {K[1][1]} {K[1][2]} 0.00\n{K[2][0]} {K[2][1]} {K[2][2]} 0.00\n0.00 0.00 0.00 1.00""")
        Path(trgt_folder, "pose", f"{frame_id}.txt").write_text(f"""{cam2world[0, 0]} {cam2world[0, 1]} {cam2world[0, 2]} {cam2world[0, 3]}\n{cam2world[1, 0]} {cam2world[1, 1]} {cam2world[1, 2]} {cam2world[1, 3]}\n{cam2world[2, 0]} {cam2world[2, 1]} {cam2world[2, 2]} {cam2world[2, 3]}\n0.00 0.00 0.00 1.00""")


if __name__ == "__main__":
    aria_export_to_scannet(scene_id=0)
Multiply it together with the RGB image just as it's loaded:
rgb = cv2.imread(str(rgb_path), cv2.IMREAD_UNCHANGED)
anti_vignette = cv2.imread('path_to_anti_vignette.jpg')
rgb = cv2.multiply(rgb,anti_vignette,scale=1.0)
That should flatten it out. (Again, I'm not sure about the rotation, so you might need to rotate the anti-vignette image left by 90 degrees for it to line up properly.)
You might end up with a white border instead of a black border, but that shouldn't be too hard to remove if needed (you can either crop, or change the anti-vignette image I provided).
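One possible way to handle that white border without cropping, sketched under the assumption that the raw frame is pure black outside the fisheye circle (the paths are placeholders):

import cv2

rgb = cv2.imread("rgb/vignette0000000.jpg", cv2.IMREAD_UNCHANGED)  # raw ASE frame
corrected = cv2.imread("corrected_frame.jpg")                      # frame after the anti-vignette step

# Pixels that are pure black in the raw frame lie outside the fisheye circle;
# re-blacken them in the corrected image so the border does not turn white.
border_mask = rgb.max(axis=2) == 0
corrected[border_mask] = 0
cv2.imwrite("corrected_frame_masked.jpg", corrected)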
Many thanks!
Hi! It seems that this anti-vignette is normalized (min value 0, max value 255, dtype np.uint8). When I use it to correct the RGB images, the colors overflow. Could you tell me how to recover its true values?
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
# import plotly.graph_objects as go
from pathlib import Path
import os
from PIL import Image
import cv2
import os, sys, json
scene_id = 0
vignette_path = Path("/group/40033/public_datasets/3d_datasets/aria/data/anti_vignette.png")
anti_vignette = cv2.imread(str(vignette_path)) # , cv2.IMREAD_UNCHANGED
src_folder = Path("/group/40033/public_datasets/3d_datasets/aria/ase_data/"+str(scene_id))
rgb_dir = src_folder / "rgb"
frame_idx = 0
frame_id = str(frame_idx).zfill(7)
rgb_path = rgb_dir / f"vignette{frame_id}.jpg"
rgb = cv2.imread(str(rgb_path), cv2.IMREAD_UNCHANGED)
rgb = cv2.multiply(rgb, anti_vignette,scale=1.0)
cv2.imwrite("./debug.jpg", rgb)
Hi @captain-sysadmin! It seems that the given anti-vignette is normalized (min value 0, max value 255, dtype np.uint8). If I multiply the RGB image by this anti-vignette, the RGB image overflows (the values exceed the [0, 255] range). Do you know how to resolve this?
Hi! Sorry to bother you again. Do you have any clue about this problem? It means a lot to me.
Hello!
As you can see, because we are multiplying white (or very near white) by another colour other than black, we quickly overflow and clip.
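For example, with 8-bit data a near-white pixel value of 240 multiplied by an anti-vignette value of 200 would be 48,000, which cv2.multiply saturates back to 255, so large regions clip to pure white.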
You can try using cv2.addWeighted instead of multiply. So:
rgb = cv2.multiply(rgb, anti_vignette,scale=1.0)
becomes
alpha = 1.0
beta = 1.0
gamma = 1.0
rgb = cv2.addWeighted(rgb, alpha, anti_vignette, beta, gamma)
Changing alpha and beta lets you alter the mix between the two images, and gamma should allow you to control clipping.
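For reference, cv2.addWeighted computes dst = src1*alpha + src2*beta + gamma (with saturation), so with alpha = beta = 1.0 the two images are simply summed, and a negative gamma can pull the result back down if it clips.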
Thanks for your reply! I can now get a normal-looking image.
But I still have a question: it seems the code above re-weights all three channels (RGB) with the anti-vignette image, so the image looks too bright.
I tried re-weighting the image in HSV space, revising only the V value. The brightness becomes normal, but I get a checkerboard artifact near the edge. Do you know how to alleviate this problem?
rgb = rgb.astype(np.float32) / 255 # go to 32-bit float on 0..1
anti_vignette = anti_vignette.astype(np.float32) / 255
new_rgb = cv2.cvtColor(rgb,cv2.COLOR_BGR2HSV)
h,s,v = cv2.split(new_rgb)
new_v = cv2.addWeighted(v, alpha, anti_vignette[:, :, 0], beta, gamma)
new_rgb = cv2.merge((h,s,new_v))
new_rgb = cv2.cvtColor(new_rgb, cv2.COLOR_HSV2BGR)
new_rgb = np.uint8(np.clip(new_rgb*255, 0, 255))
rgb = new_rgb
Off the top of my head, you might be able to lower the V of the vignette before adding it to the RGB image?
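A rough sketch of what that could look like, reusing the float/HSV approach from the snippet above (the scale factor and paths are placeholders to tune by eye):

import cv2
import numpy as np

anti_vignette = cv2.imread("anti_vignette.png").astype(np.float32) / 255.0
rgb = cv2.imread("rgb/vignette0000000.jpg").astype(np.float32) / 255.0

v_scale = 0.3                              # lower the vignette's V before blending
anti_v = anti_vignette[:, :, 0] * v_scale  # channels are equal, so any one is the V

hsv = cv2.cvtColor(rgb, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)
v_new = np.clip(v + anti_v, 0.0, 1.0)      # add the attenuated vignette to V only
out = cv2.cvtColor(cv2.merge((h, s, v_new)), cv2.COLOR_HSV2BGR)
cv2.imwrite("corrected.jpg", np.uint8(np.clip(out * 255, 0, 255)))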
Each channel of anti_vignette is equal (R=G=B), so the V channel (max(R, G, B)) of the vignette is simply vignette[:, :, 0].
anti_vignette = np.uint8(anti_vignette.astype(np.float32) * 1.0 / 3.0)
rgb = cv2.addWeighted(rgb, alpha, anti_vignette, beta, gamma)
The result is still a little white.
Please have a look at this https://github.com/facebookresearch/projectaria_tools/pull/125
Thanks!