AR-Eye-Tracking-Toolkit / ARETT

ARETT: Augmented Reality Eye Tracking Toolkit for Head Mounted Displays
MIT License

Question on the Webcam gaze point #8

Closed serhan-gul closed 1 year ago

serhan-gul commented 2 years ago

Hi, I have two questions regarding the WebcamCamera:

  1. WebCamRenderTexture assigned to WebcamCamera has a size of 960x540. However, the videos recorded from the HL2 camera can be either 1920x1080 or 768x432, depending on the setup in the Device Portal (see below). Should WebCamRenderTexture be given the resolution set in the Device Portal?
[Screenshot: Device Portal video settings]
  2. In the documentation, you recommend checking the configuration per use case and adapting it if the camera specifications differ. I take this to mean adapting the projectionMatrix as well as localPosition and localRotation in WebcamCamera.cs. How can these values be determined for a specific HoloLens 2 device? Is there a calibration process for it?
sekapp commented 2 years ago

Hi,

1) Yes, it should match the recording resolution. In my case we were using a locked camera mode for WebRTC streaming of the image (publication), which resulted in the 960x540 resolution. However, the resolution should primarily change the gaze coordinates reported in the csv. If it is set wrong, you should be able to recalculate the screen coordinates to match your resolution.

2) Yes, this note in the documentation refers to these variables. The variables set in the toolkit were partially "reverse-engineered" from information provided by the HoloLens (see the locatable camera API) as well as experimentation/trial and error to match the webcam data to a Unity in-game camera. Sadly there is no easy or official way to check or calibrate the values. The easiest approach might be to record an MRC capture of the scene with the gaze visible and check whether (after synchronization) the visible gaze point matches the coordinates provided in the csv.

During development of the toolkit the calibration process was complicated further, as Unity had many bugs and issues with the locatable camera API. Hopefully the situation with Unity 2020 has improved since then; however, I personally do not know the current state of the API implementation. In general I would expect the physical camera to be the same in all HL2 devices. However, looking at the issues regarding the position of the fixation grid in Unity 2020, some position changes might be needed, as the webcam camera is also attached to the main camera position.
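The recalculation mentioned in 1) is just a linear rescaling of the reported pixel coordinates. A minimal sketch (the function name and the assumption that the csv coordinates are in pixels of the configured render texture are mine, not part of the toolkit):

```python
# Rescale a gaze point reported relative to the WebCamRenderTexture
# resolution (default 960x540) to the actual recording resolution.
# Hypothetical helper, not part of ARETT itself.

def rescale_gaze(x, y, src=(960, 540), dst=(1920, 1080)):
    """Map a gaze point from the source resolution to the target resolution."""
    sx = dst[0] / src[0]
    sy = dst[1] / src[1]
    return x * sx, y * sy

# A point at the centre of the 960x540 texture maps to the centre
# of a 1920x1080 frame:
print(rescale_gaze(480, 270))  # -> (960.0, 540.0)
```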

serhan-gul commented 2 years ago

Thanks for the info and the links! Another question: Is there anything specific that should be configured to get a visible gaze point in the MRC capture? I started a recording session using the web interface, then started recording a video on HL2 but no luck. Tried both with Unity 2019.4 and 2020.3 but didn't get the gaze points in the MRC captures.

In the csv files, I see the gaze hits and mapping to the webcam points though, so it seems to be a problem with the visualization.

sekapp commented 2 years ago

I have seen some reports that even when all layers are configured as described in the wiki, gaze visualization in the MRC did not work. Currently my best guess is a change in the way MRC works in a Windows update since I published the toolkit. Looking at my notes, I used Windows version 19041.1144.arm64fre.vb_release_svc_sydney_rel_prod.210405-1628 for my tests and studies. Maybe you can check your version and see if it differs. However, I am pretty certain you have a newer version, as multiple updates should have been released since then.

The issue is that MRC ideally should display exactly what the wearer of the device is seeing. However, during my testing I was able to use a specific layer and camera configuration combined with the “render from PV camera” option of MRC to display the gaze only in a recording while it stayed invisible to the wearer. As this could be seen as a bug (when the effect is not intended), it is possible that Microsoft “fixed” it and with that broke my workaround/approach in ARETT. Sadly, if this is the case, it means that either the gaze point always has to be visible to the wearer – which is only acceptable for debugging purposes – or the gaze position has to be overlaid separately onto an MRC recording using the data from the csv.
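The fallback of overlaying the gaze onto an MRC recording from the csv could be sketched roughly as below with OpenCV. The column names and the one-row-per-frame alignment are assumptions (the csv and video would have to be synchronised and resampled to the video frame rate beforehand), so treat this as a starting point rather than a tested pipeline:

```python
# Sketch: draw the webcam gaze point from an ARETT csv onto each frame
# of an MRC recording. Column names are hypothetical placeholders.
import csv


def gaze_pixel(row, x_col="gazePointWebcam_x", y_col="gazePointWebcam_y"):
    """Return a csv row's gaze point as integer pixel coordinates, or None."""
    try:
        return int(float(row[x_col])), int(float(row[y_col]))
    except (KeyError, ValueError):
        return None  # missing column or no valid gaze sample


def overlay_gaze(video_in, csv_in, video_out, **cols):
    import cv2  # imported lazily; only needed for the video pass

    with open(csv_in, newline="") as f:
        rows = list(csv.DictReader(f))

    cap = cv2.VideoCapture(video_in)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(video_out, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    # Assumes one csv row per video frame after synchronisation.
    for row in rows:
        ok, frame = cap.read()
        if not ok:
            break
        pt = gaze_pixel(row, **cols)
        if pt is not None:
            cv2.circle(frame, pt, 12, (0, 0, 255), 2)  # red ring at gaze point
        out.write(frame)

    cap.release()
    out.release()
```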