Hi @Kazuaki9817
could you please post your script or refer to one of the examples? Are you using a WiFi or Ethernet connection? How is the live view with VLC, is it still delayed?
We are experiencing the same issue with both WiFi and Ethernet connections. It can be reproduced with the live_scene_and_gaze.py example. Whenever you fixate on a point while moving your head to one side, the gaze point in the livestream moves in the opposite direction for about 0.5-1 second instead of staying on that point. This does not happen during a livestream in Tobii's Glasses Controller software.
It seems that gaze data and video data are not synchronized correctly: the gaze data appears to arrive slightly earlier than the video data, which causes the gaze point in the livestream to drift during head movements.
Hi @FranzAlbers ,
I haven't experienced this issue yet. I only get an occasional freeze when using the WiFi connection. Which operating system are you using?
Hey @ddetommaso,
we originally used this software and experienced the issue in our driving simulator on Windows, but I can also reproduce it on my workstation running Ubuntu (with both Python 2.7 and 3.6). By the way, thanks for publishing this useful package!
I believe the issue actually lies in live_scene_and_gaze.py from the examples, where the offset value (in seconds) is compared against frame_duration (which is given in milliseconds), resulting in incorrectly displayed gaze data:
frame_duration = 1000.0/float(video_freq) # frame duration in ms
...
offset = data_gp['ts']/1000000.0 - data_pts['ts']/1000000.0
if offset > 0.0 and offset <= frame_duration:
    cv2.circle(frame, (int(data_gp['gp'][0]*width), int(data_gp['gp'][1]*height)), 30, (0,0,255), 2)
Since an offset of a fraction of a second is almost always smaller than a frame duration expressed in milliseconds, the gaze data is currently (almost) always drawn on the current frame, but not necessarily on the correct one.
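For reference, a minimal sketch of a unit-consistent comparison (converting the offset to milliseconds as well, assuming the ts fields are in microseconds as in the example); this only illustrates the mismatch and is not a tested patch:

# Sketch: compare offset and frame_duration in the same unit (milliseconds).
# Assumes data_gp['ts'] and data_pts['ts'] are in microseconds, as in the example.
frame_duration = 1000.0 / float(video_freq)           # frame duration in ms
offset = (data_gp['ts'] - data_pts['ts']) / 1000.0    # offset in ms (us -> ms)
if 0.0 < offset <= frame_duration:
    cv2.circle(frame, (int(data_gp['gp'][0] * width), int(data_gp['gp'][1] * height)), 30, (0, 0, 255), 2)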
Hi @FranzAlbers,
thank you for the feedback. What you're saying is correct. At the moment it is not clear how to sync video and gaze in the live view using OpenCV, so I cannot close the issue right now.
Please let me know if you guys find a solution for that!
I think you might need to create a queue to buffer video and gaze data and then use the offset to sync them, as the example code provided by Tobii does. Just a thought; I'm still trying to figure out how to implement it.
Hi @minhanli ,
I did check the example provided in the API, but I am still unable to make it work with OpenCV, because they're using GStreamer. I am not able to extract the timestamps of the MPEG stream using OpenCV, so even if we have buffers of video and gaze data, I think they're pretty useless without both timestamps.
Hi @ddetommaso,
I read the manual, and it says something about synchronization like below:
Did you try this? It seems like we only need to know the pts of the video, instead of the ts.
Hi @minhanli
I was referring to that documentation in the comment just above. I do not have the ts of the video stream; with OpenCV it seems I can only access the video stream frame by frame.
@ddetommaso You are right. We seem to have no way to access the video's timestamp information through OpenCV, since OpenCV discards it early on. I'm thinking of using another package to decode the video while keeping the timestamps, such as PyAV. Not GStreamer, because its installation is a hassle to deal with.
By using PyAV instead of OpenCV for grabbing the video, the delay is almost imperceptible. One example with PyAV:
import cv2
import av
import numpy as np
from tobiiglassesctrl import TobiiGlassesController

tobiiglasses = TobiiGlassesController(video_scene=True)
ipv4_address = tobiiglasses.get_address()
tobiiglasses.start_streaming()

rtsp_url = "rtsp://%s:8554/live/scene" % ipv4_address
container = av.open(rtsp_url, options={'rtsp_transport': 'tcp'})
stream = container.streams.video[0]

for frame in container.decode(stream):
    data_gp = tobiiglasses.get_data()['gp']
    data_pts = tobiiglasses.get_data()['pts']
    frame_cv = frame.to_ndarray(format='bgr24')
    if data_gp['ts'] > 0 and data_pts['ts'] > 0:
        # offset = data_gp['ts'] / 1000.0 - data_pts['ts'] / 1000.0  # in milliseconds
        # print('Frame_pts = %f' % float(frame.pts))
        # print('Frame_time = %f' % float(frame.time))
        # print('Data_pts = %f' % float(data_pts['pts']))
        # print('Offset = %f' % float(offset))
        # Overlay gaze point
        height, width = frame_cv.shape[:2]
        cv2.circle(frame_cv, (int(data_gp['gp'][0] * width), int(data_gp['gp'][1] * height)), 20, (0, 0, 255), 6)
    # Display stream
    cv2.imshow("Livestream", frame_cv)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()
tobiiglasses.stop_streaming()
tobiiglasses.close()
However, this is of course still not correctly synchronized. A better way would probably be to add something like the BufferSync class from the Tobii Pro Glasses API examples (video_with_gaze.py) to the TobiiGlassesPyController: buffer the gaze data and return the synchronized gaze data based on the video pts we get from PyAV.
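A rough sketch of that direction (a hypothetical helper, not part of TobiiGlassesPyController; it assumes the 'ts' fields are in microseconds and that the 'pts' field and PyAV's frame.pts, rescaled with the stream time base, are both on the MPEG 90 kHz clock):

from collections import deque

class GazeBuffer:
    # Hypothetical BufferSync-style helper: keep recent gaze samples mapped onto
    # the 90 kHz video clock and return the one closest to a decoded frame's pts.
    def __init__(self, maxlen=200):
        self.samples = deque(maxlen=maxlen)   # (pts_90khz, gp) pairs

    def push(self, data_pts, data_gp):
        # Map the gaze sample onto the video clock via the latest 'pts' packet:
        # 'pts' is in 90 kHz ticks, both 'ts' fields are in microseconds.
        gaze_pts = data_pts['pts'] + (data_gp['ts'] - data_pts['ts']) * 90000.0 / 1e6
        self.samples.append((gaze_pts, data_gp['gp']))

    def closest(self, frame_pts_90khz):
        if not self.samples:
            return None
        return min(self.samples, key=lambda s: abs(s[0] - frame_pts_90khz))[1]

In the decoding loop above, push() would be called whenever new data packets arrive, and something like gp = gaze_buffer.closest(float(frame.pts * stream.time_base) * 90000) right before drawing the overlay.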
Hi @FranzAlbers,
thank you for your work! It's definitely worth modifying the controller with the BufferSync as you suggested. I'll work on it in the next few days.
Stay tuned!
Hi, did you find a solution to this? I'm currently working on the same thing. I have tried with GStreamer but couldn't even get the pipeline to work.
Hi @HansenLars
unfortunately I don't have a solution right now. I think the issue might be solvable using the OpenCV grabber: it is actually possible to access the timestamp of the current frame (CAP_PROP_POS_MSEC in cv::VideoCaptureProperties), see https://docs.opencv.org/4.5.0/d4/d15/group__videoio__flags__base.html
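For reference, reading that property per frame would look roughly like this (a minimal sketch using the same rtsp_url as in the PyAV example above; whether CAP_PROP_POS_MSEC is actually populated for this RTSP stream still needs to be verified):

import cv2

cap = cv2.VideoCapture(rtsp_url)   # rtsp_url as in the PyAV example above
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    pos_msec = cap.get(cv2.CAP_PROP_POS_MSEC)   # position of the current frame in ms
    print("Frame position: %.1f ms" % pos_msec)
cap.release()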
Hi @ddetommaso
I am also working on syncing live gaze data with the video. As far as I know, we should ensure that the pts of the video equals the pts of the gaze data for synchronization, but at the moment I don't know how to calculate the pts of the video. Also, can you elaborate on how to sync using the timestamp of the current frame? Could you please guide me on this?
Also, do you have a solution for syncing the gaze and video data?
Hi @lokkeshvertk
as far as I understand, a solution could be to estimate the pts of the video from one of the properties of the OpenCV grabber (maybe CAP_PROP_POS_MSEC). You can start from this example and see if you can synchronize video and gaze properly with this addition.
Hi @ddetommaso,
I have been working with that example as a base. I also used CAP_PROP_POS_MSEC to calculate the timestamps of the video and tried syncing with it, but I couldn't. It would be really helpful if you could elaborate on the method to sync using the CAP_PROP_POS_MSEC property.
Hi @lokkeshvertk
the proper way to synchronize the video and data streams would be to use the pts (presentation timestamp) sent together with the MPEG video stream, as explained in the API. Unfortunately we cannot access the pts from the OpenCV grabber, so we have to find another way.
I might be wrong, but we can assume that the pts and CAP_PROP_POS_MSEC hold the same information. So, I think we need the following three pieces (sketched in code below):
- A function that stores eye-tracking data in a buffer, as suggested by @minhanli (e.g. a Python dictionary DATA_BUFFER storing tobiiglasses.get_data()['gp'] indexed by its timestamp).
- A function that stores pts packets in a buffer (e.g. a Python dictionary PTS_BUFFER storing tobiiglasses.get_data()['pts'] indexed by its pts).
- A function that returns the closest eye-tracking data for a given CAP_PROP_POS_MSEC. Specifically, this function would take the CAP_PROP_POS_MSEC value as input and return the closest gaze data. The CAP_PROP_POS_MSEC is transformed into an estimated presentation timestamp EPTS (EPTS = CAP_PROP_POS_MSEC * 90), then we get from PTS_BUFFER the pts closest to the EPTS and its corresponding ts (ETS). Finally, we get from DATA_BUFFER the closest gaze data based on the ETS.
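A sketch of those three pieces (the buffer and function names follow the description above; the factor of 90 relating CAP_PROP_POS_MSEC to the 90 kHz pts clock is an untested assumption):

DATA_BUFFER = {}   # ts  -> gaze point, filled from tobiiglasses.get_data()['gp']
PTS_BUFFER = {}    # pts -> ts,         filled from tobiiglasses.get_data()['pts']

def store_gaze(data_gp):
    if data_gp['ts'] > 0:
        DATA_BUFFER[data_gp['ts']] = data_gp['gp']

def store_pts(data_pts):
    if data_pts['ts'] > 0:
        PTS_BUFFER[data_pts['pts']] = data_pts['ts']

def closest_gaze(pos_msec):
    # Return the gaze sample closest to the frame at CAP_PROP_POS_MSEC == pos_msec.
    if not PTS_BUFFER or not DATA_BUFFER:
        return None
    epts = pos_msec * 90                                  # estimated pts (90 kHz clock)
    nearest_pts = min(PTS_BUFFER, key=lambda p: abs(p - epts))
    ets = PTS_BUFFER[nearest_pts]                         # estimated data-stream ts
    nearest_ts = min(DATA_BUFFER, key=lambda t: abs(t - ets))
    return DATA_BUFFER[nearest_ts]

In the grab loop, store_gaze() and store_pts() would be called for each new data packet, and closest_gaze(cap.get(cv2.CAP_PROP_POS_MSEC)) right before drawing the overlay; in practice the buffers would also need to be pruned to keep the lookups cheap.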
Hi @ddetommaso,
Thanks a lot for elaborating on the method. I shall try and let you know! Thanks again!
Hello, there was an error when I used RTSP to connect to the IPv6 address. I would be glad to receive your reply.
Hello, thanks for sharing the code for controlling the Tobii glasses! I am trying to map gaze data onto the video online, but I am getting a delayed image with real-time gaze data. I would be happy if you could tell me how to correctly map gaze data onto the video online. Thanks in advance!