jdibenes / hl2ss

HoloLens 2 Sensor Streaming. Real-time streaming of HoloLens 2 sensor data over WiFi. Research Mode and External USB-C A/V supported.

Inconsistency in PV camera to world transformation matrix calculation #134

Closed: zhangzhousuper closed this issue 2 months ago

zhangzhousuper commented 3 months ago

Hi,

I'm currently working on a project that involves streaming the PV (PhotoVideo) camera and utilizing the PV-to-world transformation matrix. I've encountered some inconsistencies and would like to seek clarification and guidance.

To obtain the pv2world matrix I compute pv2world = pv2rig @ rig2world, which I understand to be pv.extrinsics @ pv.pose. The resulting pv2world matrix differs from what I observe with the HoloLens2ForCV project.

I also tried calculating the pv2world matrix using the AHAT camera: pv2world = pv2rig @ rig2world, i.e. pv.extrinsics @ ahat.pose. This approach produces a pv2world matrix that aligns with my expectations.
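For concreteness, here is a minimal numpy sketch of the two compositions described above (the variable names are placeholders for the arrays delivered by hl2ss; the layout assumes the row-vector convention visible in the 4x4 matrices printed later in this thread, i.e. translation in the last row):

import numpy as np

# Placeholder 4x4 matrices standing in for the arrays delivered by hl2ss
# (row-vector convention: points transform as p_hom @ M, translation in the last row).
pv_extrinsics = np.eye(4, dtype=np.float32)  # PV camera -> rignode
pv_pose       = np.eye(4, dtype=np.float32)  # per-frame PV pose (PV -> world)
ahat_pose     = np.eye(4, dtype=np.float32)  # per-frame AHAT pose (rignode -> world)

# Composition via the rignode: PV -> rignode -> world
pv2world_via_ahat = pv_extrinsics @ ahat_pose

# Direct use of the per-frame PV pose
pv2world_direct = pv_pose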

Current limitation: this workaround requires running both the PV and AHAT camera streams simultaneously, which is not ideal for performance and resource management.

Questions:

1. Is there a recommended way to obtain the correct PV-to-world transformation matrix without relying on the AHAT camera?
2. If using the AHAT camera is necessary, are there any best practices for efficiently managing multiple camera streams?
3. Is there any documentation or explanation for why the PV camera's pose might not be suitable for this calculation in some scenarios?

jdibenes commented 3 months ago

Hi, pv.pose is the PV to world transform.
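As a minimal sketch of using it directly (the matrix values are rounded from the pv.pose printed further below, standing in for data_pv.pose from an actual PV packet; homogeneous points multiply on the left because of the row-vector layout):

import numpy as np

# Placeholder for data_pv.pose from a PV packet (values rounded from the example below)
pv2world = np.array([[-0.3963, -0.0151, -0.9180, 0.0],
                     [-0.1531,  0.9869,  0.0499, 0.0],
                     [ 0.9052,  0.1604, -0.3935, 0.0],
                     [ 0.1438, -0.6249, -0.4643, 1.0]], dtype=np.float32)

# World-to-PV is just the inverse
world2pv = np.linalg.inv(pv2world)

# Map a world-space point into PV camera space (homogeneous row vector on the left)
point_world = np.array([0.0, 0.0, 1.0, 1.0], dtype=np.float32)
point_pv = point_world @ world2pv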

zhangzhousuper commented 3 months ago


Hi, I initially used pv.pose as pv2world, expecting it to match the pv2world matrix computed in the HoloLens2ForCV repository. The output from using pv.pose directly differs significantly from the result I obtain with the HoloLens2ForCV method. And as I said, when I use pv.extrinsics @ ahat.pose, the resulting matrix is what I expect, which is really strange. I have no idea which part is wrong. Could you please provide some guidance on how to debug this issue?

jdibenes commented 3 months ago

I compared pv.pose and pv.extrinsics @ ahat.pose and obtained pretty much the same result, so pv.pose should work. As an example:

pv.pose

[[-0.3963493  -0.0150646  -0.9179779   0.        ]
 [-0.15314528  0.98694265  0.04992544  0.        ]
 [ 0.90524     0.16037175 -0.39348197  0.        ]
 [ 0.14384635 -0.6248661  -0.46434924  1.        ]]

pv.extrinsics @ ahat.pose

[[-0.39634746 -0.01506643 -0.91797686  0.        ]
 [-0.15315571  0.9869398   0.0499287   0.        ]
 [ 0.9052355   0.16038242 -0.39347872  0.        ]
 [ 0.14383185 -0.6248684  -0.46432602  1.        ]]

Difference (pv.pose @ inv(pv.extrinsics @ ahat.pose))

[[ 1.00000167e+00  2.04518437e-06 -9.53674316e-07  0.00000000e+00]
 [-1.19209290e-06  1.00000107e+00  1.11907721e-05  0.00000000e+00]
 [ 1.40070915e-06 -1.13826245e-05  1.00000358e+00  0.00000000e+00]
 [ 1.55270100e-05 -1.07288361e-06  2.26199627e-05  1.00000000e+00]]
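For a programmatic version of this check (a minimal sketch; diff stands for the product printed above and the tolerance is an arbitrary choice):

import numpy as np

# diff = pv_to_world @ np.linalg.inv(pv_to_rignode_to_world), as printed above
diff = np.eye(4)  # placeholder; use the actual product from the script below
assert np.allclose(diff, np.eye(4), atol=1e-4), 'pv.pose and pv.extrinsics @ ahat.pose disagree'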

I used the following script to test. Try running it; you should get very similar poses. If not, there may be something wrong with your setup. Also, note that pv.extrinsics @ ahat.pose is only valid if the HoloLens is static.


import numpy as np
import multiprocessing as mp
import cv2
import hl2ss
import hl2ss_lnm
import hl2ss_mp
import hl2ss_3dcv

# Settings --------------------------------------------------------------------

# HoloLens address
host = '192.168.1.7'

# Calibration path (must exist but can be empty)
calibration_path = '../calibration'

# Front RGB camera parameters
pv_width = 640
pv_height = 360
pv_fps = 30

# Buffer length in seconds
buffer_size = 10

#------------------------------------------------------------------------------

if __name__ == '__main__':
    # Start PV Subsystem ------------------------------------------------------
    hl2ss_lnm.start_subsystem_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO)

    # Get RM Depth AHAT calibration -------------------------------------------
    # Calibration data will be downloaded if it's not in the calibration folder
    calibration_ht = hl2ss_3dcv.get_calibration_rm(host, hl2ss.StreamPort.RM_DEPTH_AHAT, calibration_path)
    calibration_pv = hl2ss_3dcv.get_calibration_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO, calibration_path, 1000, pv_width, pv_height, pv_fps)

    # Start PV and RM Depth AHAT streams --------------------------------------
    producer = hl2ss_mp.producer()
    producer.configure(hl2ss.StreamPort.PERSONAL_VIDEO, hl2ss_lnm.rx_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO, width=pv_width, height=pv_height, framerate=pv_fps))
    producer.configure(hl2ss.StreamPort.RM_DEPTH_AHAT, hl2ss_lnm.rx_rm_depth_ahat(host, hl2ss.StreamPort.RM_DEPTH_AHAT))
    producer.initialize(hl2ss.StreamPort.PERSONAL_VIDEO, pv_fps * buffer_size)
    producer.initialize(hl2ss.StreamPort.RM_DEPTH_AHAT, hl2ss.Parameters_RM_DEPTH_AHAT.FPS * buffer_size)
    producer.start(hl2ss.StreamPort.PERSONAL_VIDEO)
    producer.start(hl2ss.StreamPort.RM_DEPTH_AHAT)

    consumer = hl2ss_mp.consumer()
    manager = mp.Manager()
    sink_pv = consumer.create_sink(producer, hl2ss.StreamPort.PERSONAL_VIDEO, manager, None)
    sink_ht = consumer.create_sink(producer, hl2ss.StreamPort.RM_DEPTH_AHAT, manager, None)

    sink_pv.get_attach_response()
    sink_ht.get_attach_response()

    cv2.namedWindow('control')

    while (True):
        if ((cv2.waitKey(500) & 0xFF) == 27): # esc to stop
            break

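        # Pair frames: take the latest AHAT frame and the PV frame nearest to it in time,
        # so both poses correspond to (approximately) the same device pose.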
        _, data_ht = sink_ht.get_most_recent_frame()
        if ((data_ht is None) or (not hl2ss.is_valid_pose(data_ht.pose))):
            continue

        _, data_pv = sink_pv.get_nearest(data_ht.timestamp)
        if ((data_pv is None) or (not hl2ss.is_valid_pose(data_pv.pose))):
            continue

        print('-------POSES-----------')
        # PV to WORLD
        pv_to_world = hl2ss_3dcv.reference_to_world(data_pv.pose)
        print(pv_to_world)
        # PV to RIGNODE then RIGNODE to WORLD
        pv_to_rignode_to_world = hl2ss_3dcv.camera_to_rignode(calibration_pv.extrinsics) @ hl2ss_3dcv.reference_to_world(data_ht.pose)
        print(pv_to_rignode_to_world)
        # Difference
        print(pv_to_world @ np.linalg.inv(pv_to_rignode_to_world))

    # Stop PV and RM Depth AHAT streams ---------------------------------------
    sink_pv.detach()
    sink_ht.detach()
    producer.stop(hl2ss.StreamPort.PERSONAL_VIDEO)
    producer.stop(hl2ss.StreamPort.RM_DEPTH_AHAT)

    # Stop PV subsystem -------------------------------------------------------
    hl2ss_lnm.stop_subsystem_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO)
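If the AHAT stream is not needed, a PV-only setup can use data.pose directly as the PV-to-world transform. A minimal sketch, assuming the usual hl2ss single-stream client pattern (open / get_next_packet / close); adjust to your local samples if the API differs:

import hl2ss
import hl2ss_lnm

host = '192.168.1.7'  # HoloLens address (placeholder)

# Start the PV subsystem and open a single PV stream
hl2ss_lnm.start_subsystem_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO)
client = hl2ss_lnm.rx_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO, width=640, height=360, framerate=30)
client.open()

for _ in range(10):
    data = client.get_next_packet()
    if hl2ss.is_valid_pose(data.pose):
        print(data.pose)  # 4x4 PV-to-world transform for this frame

client.close()
hl2ss_lnm.stop_subsystem_pv(host, hl2ss.StreamPort.PERSONAL_VIDEO)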