pupil-labs / pupil

Open source eye tracking
https://pupil-labs.com
GNU Lesser General Public License v3.0

Binocular 3D gaze depth (Z-value) #1393

Closed ABob53 closed 1 year ago

ABob53 commented 5 years ago

Hi,

I'm using the binocular headset (with the fisheye world camera), and I'm interested in the 3D gaze point (the intersection of the 3D gaze vectors). I understood from the code that it is computed at https://github.com/pupil-labs/pupil/blob/master/pupil_src/shared_modules/calibration_routines/gaze_mappers.py#L571
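For readers unfamiliar with the geometry: two gaze rays in 3D rarely intersect exactly, so an "intersection" is usually taken as the point closest to both lines. The following is a minimal sketch of that idea, not the code at the link above; the function name and the choice of returning the midpoint of the shortest connecting segment are assumptions made for illustration.

```python
import numpy as np

def nearest_point_between_rays(c0, d0, c1, d1):
    """Midpoint of the shortest segment between two 3D lines.

    c0, c1: line origins (e.g. eye centers); d0, d1: direction vectors.
    Returns (midpoint, gap), where gap is the distance between the two
    lines at their closest approach, or (None, None) if the lines are
    (nearly) parallel. Hypothetical helper, not part of Pupil.
    """
    c0, d0, c1, d1 = (np.asarray(v, dtype=float) for v in (c0, d0, c1, d1))
    d0 = d0 / np.linalg.norm(d0)
    d1 = d1 / np.linalg.norm(d1)
    w = c0 - c1
    b = d0 @ d1
    denom = 1.0 - b * b          # zero when the directions are parallel
    if denom < 1e-12:
        return None, None
    d = d0 @ w
    e = d1 @ w
    t0 = (b * e - d) / denom     # parameter along the first line
    t1 = (e - b * d) / denom     # parameter along the second line
    p0 = c0 + t0 * d0
    p1 = c1 + t1 * d1
    return (p0 + p1) / 2.0, float(np.linalg.norm(p0 - p1))
```

For distant targets the two directions are almost parallel, so `denom` becomes very small and small errors in the directions move the result a long way along Z.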

After performing screen marker calibration, the projected point (2D gaze in the world-camera image plane) looks good, but the Z-value of the 3D gaze point is way off compared to the real distance.

I tried re-calibrating and testing at different distances in the range [0.5, 3] m, but I still get implausible Z-values.

Can you please elaborate on the expected accuracy of the 3D intersection point?

Thanks in advance

marc-tonsen commented 5 years ago

Hi @ABob53 !

We have not yet explicitly evaluated this ourselves, but there is a recent paper from ETRA '18 that did such an evaluation.

Generally predicting depth is difficult, because it has to be inferred from binocular vergence (is your headset binocular? With a monocular one, depth estimation is not possible.). I.e. the wider the angle of binocular vergence, the closer the point of regard is to the viewer. The problem with this is that for distances larger than 1 meter, the angular difference in vergence is very small (below 1 degree very quickly) and thus depth estimation becomes very sensitive to noise.
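To put rough numbers on this, here is a small sketch that assumes symmetric fixation straight ahead and an interpupillary distance of 66 mm (an assumed value; yours will differ). It prints the vergence angle at several distances and the distance that would be inferred if that angle were underestimated by half a degree.

```python
import math

def vergence_angle_deg(distance_m, ipd_m=0.066):
    """Vergence angle for a target straight ahead at distance_m."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))

def distance_from_vergence_m(angle_deg, ipd_m=0.066):
    """Inverse of vergence_angle_deg."""
    return ipd_m / (2.0 * math.tan(math.radians(angle_deg) / 2.0))

for d in (0.5, 1.0, 2.0, 3.0):
    theta = vergence_angle_deg(d)
    # Distance that would be inferred if the angle were 0.5 degrees too small.
    biased = distance_from_vergence_m(theta - 0.5)
    print(f"{d:.1f} m: vergence {theta:.2f} deg, with -0.5 deg error -> {biased:.2f} m")
```

Under these assumptions the vergence angle is about 3.8 degrees at 1 m and about 1.3 degrees at 3 m, so a fixed angular error costs far more depth accuracy at 3 m than at 0.5 m.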

Best, Marc

ABob53 commented 5 years ago

Hi Marc,

Thanks for your reply. Yes, I'm using the binocular headset (with the Fisheye camera).

This paper compares two methods for depth estimation. Neither method is identical to the Pupil Labs method (the geometric one is very similar, but differs because they have RGBD data and they estimate the eyeball centers as part of the "per frame" optimization).

Anyway, looking at the 3D gaze point calculated in the Pupil Labs code, I got a very large error (0.8 m error at 1 m distance). I was wondering how it is possible that the 3D point is totally wrong while the projected 2D gaze looks good. Perhaps the optimization problem (BA) you solve in the calibration step leads to this phenomenon?

Thanks

marc-tonsen commented 5 years ago

Hi @ABob53 !

You are right, the geometric method is not exactly what is happening in Pupil. I was not aware of that, from the ETRA presentation I understood they were comparing to our exact setup, my mistake!

I can't tell you the exact source of this phenomenon, as we have never evaluated this in detail, but I also think it is most likely introduced in the bundle adjustment. Our 3D model makes a few simplifying assumptions, e.g. the refractive properties of the cornea are not considered. Errors that are due to such inaccurate assumptions are compensated in the bundle adjustment, which is only optimizing for 2D gaze prediction. Thus we end up with a model that does accurate 2D gaze estimation, but is somewhat "wrong" in other regards. We are actively working on improving the realism of the 3D model, but there is currently no ETA for any specific improvements.
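One geometric reason a 2D-only objective cannot constrain depth: under an ideal pinhole model, every point on the same viewing ray projects to the same pixel. The sketch below illustrates this with made-up intrinsics (`focal_px`, `center_px` are assumptions) and ignores lens distortion, so it is not a model of the fisheye world camera.

```python
import numpy as np

def project_pinhole(point_3d, focal_px=800.0, center_px=(640.0, 360.0)):
    """Project a 3D point in camera coordinates with an ideal pinhole model."""
    x, y, z = np.asarray(point_3d, dtype=float)
    return np.array([focal_px * x / z + center_px[0],
                     focal_px * y / z + center_px[1]])

true_point = np.array([100.0, -50.0, 1000.0])   # mm, camera coordinates
wrong_depth = true_point * 1.8                  # same ray, 80 % further away

print(project_pinhole(true_point))
print(project_pinhole(wrong_depth))             # same pixel coordinates
```

Any error along the viewing ray is therefore invisible to a reprojection error measured in the image plane.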

ABob53 commented 5 years ago

I understand! Thanks for your detailed explanation.

Thinking about an alternative solution: does Pupil use the depth data (from RealSense) to get a better 3D point estimate or not? What is the main use of the RealSense depth data in the Pupil algorithm?

papr commented 5 years ago

The Pupil pipeline does not rely on the RealSense depth data since it is not available in many cases.

peteratBHVI commented 5 years ago

Hi ABob53, hi marc, hi Papr,

I measure focus point distances via the 3D model to quantify the "work load" of the eye. Near work is considered a higher effort for the eyes due to the change in focal length of the eye (accommodation) and the convergence of the eyes. Usually that also has an effect on the pupil, by the way.

I have had a similar experience: the Z-values of the 3D model are inconsistent with the distances I looked at. The interesting thing in my tests is that the measurements are sometimes accurate, but most of the time they are not. I want to share one particular sample that got it right after a calibration. If we understand that one, we may get better results. In the export folder 000 there are some diagrams attached to the measurement:

- confidence: distribution of confidence (histogram)
- binocular focus distance: velocity-based fixation detector without outlier detection
- pupil size and eye rotation center distance (ERCD) to pupil distance (PD), right and left
- video of both eyes with confidence data and image number

In the 20 s after the calibration the data is almost right. I looked at 600 mm, 1800 mm and 3000 mm. The larger distances are further off, but the closer ones are relatively OK. Interestingly, after the calibration the ERCD is about 60 mm and the PD values are closer to it, hence the focus point distances are more accurate.
Note: my PD for distance is 66 mm, and it is believed that the ERCD should be around that value as well. If you need further explanation of any diagrams or methods, or if you find something interesting in the data, please let me know.

PS. Focus distance diagram: The long outliers to the top of the diagram are misleading. These are data before and after blink events. Not sure if that belongs to eye movements or detection discrepancies.

@ABob53 After cleaning your data, if you correct the measured maximum PD to the value of the maximum ERCD and adjust all other PD values accordingly, you should get fairly good results for depth measurements.

Looking forward to your ideas. Cheers Peter

https://www.dropbox.com/sh/cae7tewsjmjemwm/AABOouijde0xwJPAZkTWLSura?dl=0

marc-tonsen commented 5 years ago

Hi @peteratBHVI !

Thanks a lot for sharing your recording! I'll check it out and make sure to forward it to the guys working on the problem!

peteratBHVI commented 5 years ago

@ABob53 Hi ABob53, with a few tricks you can still get reasonably good focus distance data. It requires filtering out invalid and inconsistent data and adjusting the pupil distance to be closer to the eye rotation center distance. From that the distance can be calculated. ;)

[Image: 10 focus distance correction applied]

It matched my experiment fairly well. Cheers Peter

ABob53 commented 5 years ago

Hi @peteratBHVI ,

Sorry for the late reply, I was on vacation.

Thanks for sharing your data and conclusions, really helpful. It looks like most of the data (blue) is not valid, just noisy. It's also odd that after the tricks and filtering it gives a good estimate. I was expecting either that all samples would be invalid (because of the 2D fitting/optimization), or that most of them would be good with some outliers. What do you think @peteratBHVI @marc-tonsen ?

Anyway, thanks again for your comment, and if you could share some info about your filtering "tricks", that would be great.

euryalus commented 5 years ago

Hi @ABob53,

from a private exchange with @peteratBHVI, I understood he is not simply filtering the data. Instead, he is applying an explicit correction function to the depth estimates produced by the Pupil pipeline. That leads to the improved depth prediction accuracy. Maybe @peteratBHVI can comment on that.

Best,

Kai

peteratBHVI commented 5 years ago

Hi ABob53, that's right, I apply a correction. First of all, my experience is that the relative depth information is present in most of the data sets. Given the small angles, an accuracy of 10-20% is reasonable. Closer distances are more accurate, further ones a bit less.

I applied a velocity-based fixation filter monocularly and matched the results binocularly.
Filtering: data just before and after blink events was found to be misleading, and I had to discard it. The reason could be a retraction of the eyeball into the orbit during blinking. That was new to me, but Riggs and Evinger described experimentally measured retractions with an amplitude of 1-1.5 mm (Riggs, Blink related eye movements, 1987, etc.).
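As a rough illustration of this kind of filtering (not Peter's actual code), the sketch below marks samples of one eye as usable when the angular velocity of the gaze direction is low and no low-confidence sample lies nearby. Treating low confidence as a proxy for a blink, and the threshold values `max_velocity`, `min_confidence` and `blink_margin_s`, are all assumptions.

```python
import numpy as np

def angular_velocity_deg_s(gaze_normals, timestamps):
    """Angular velocity between consecutive gaze direction vectors."""
    n = np.asarray(gaze_normals, dtype=float)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)
    t = np.asarray(timestamps, dtype=float)
    cos_angle = np.clip(np.sum(n[1:] * n[:-1], axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle)) / np.diff(t)

def fixation_mask(gaze_normals, timestamps, confidence,
                  max_velocity=30.0, min_confidence=0.8, blink_margin_s=0.2):
    """Boolean mask of samples that are slow and not close to a blink."""
    t = np.asarray(timestamps, dtype=float)
    conf = np.asarray(confidence, dtype=float)
    vel = angular_velocity_deg_s(gaze_normals, t)
    # The first sample has no predecessor, so it is never marked as usable.
    slow = np.concatenate([[False], vel < max_velocity])
    near_blink = np.zeros(len(t), dtype=bool)
    for t_blink in t[conf < min_confidence]:
        near_blink |= np.abs(t - t_blink) <= blink_margin_s
    return slow & ~near_blink
```

For binocular data, one option is to compute the mask per eye on timestamp-matched samples and keep only the samples where both masks are true.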

After that I looked at the trigonometric function of the distance between the sphere centers and the pupil centers, and manipulated them until I 'liked' the data for the depth measurement. ;) If you have a known fixation distance from time to time, that could serve as a marker for you.

I still struggle with longer recordings and with finding the correction automatically. But it's a start. For further development of depth measurement algorithms, I guess the key is to identify the retraction of the eye. Does that exist for monocular eye tracking devices? That would be interesting to apply. Vice versa, a retraction could indicate a blink, twitch or squint.

Cheers Peter

ABob53 commented 5 years ago

Hi @peteratBHVI

Nice work. Thanks for sharing your conclusions. Yes, it's expected that changing the eyeball radius will affect the prediction (especially the z-coordinate, because it is related to scaling).

I think it's a very interesting problem to investigate. I will try to find the time to do so, and will let you know if I make any progress.

Best

peteratBHVI commented 5 years ago

Hi @ABob53

The longitudinal movement of the eye (z-direction) associated with blinks is described by https://www.ncbi.nlm.nih.gov/pubmed/8591916 (Riggs) and https://www.ncbi.nlm.nih.gov/pubmed/6481436 (Evinger).

Just to clarify: changing the eyeball radius is maybe one way to do a correction. I left the distance between the right and left eye rotation centers and the eyeball radius as they are. I changed the pupil distance, i.e. the distance between the right and left pupil centers, until it was suitable at a reference distance. From there the test distances had relatively good accuracy.

Please keep me posted!!! Cheers Peter

jhs0053 commented 5 years ago

@peteratBHVI Could you let me know where the variable related to pupil distance is?

peteratBHVI commented 5 years ago

@jhs0053 Sorry for the delay, I was on vacation. You can calculate the pupil distance from the 3D eye model and the Gaze_positions. All dimensions are in the world camera coordinate system. The position of a pupil center is calculated as the eye center vector (X, Y, Z) plus 12 (a Pupil Labs constant) times the gaze normal vector (X, Y, Z). The difference between eye0 and eye1 is then the pupil distance. There are a few assumptions within the 3D model that make this possible, and they influence the accuracy of the eye center and pupil center data.
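A minimal sketch of that calculation, assuming you have already loaded the per-sample eye centers and gaze normals of both eyes as arrays in world camera coordinates (millimetres). The function name is hypothetical, and the default radius of 12 is simply the constant quoted above.

```python
import numpy as np

def pupil_distance(eye_center0, gaze_normal0, eye_center1, gaze_normal1,
                   radius_mm=12.0):
    """Distance between the two pupil centers for each sample.

    Each input is an (N, 3) array (or a single 3-vector).
    Pupil center = eye center + radius * unit gaze normal.
    """
    c0, n0, c1, n1 = (np.atleast_2d(np.asarray(v, dtype=float))
                      for v in (eye_center0, gaze_normal0, eye_center1, gaze_normal1))
    n0 = n0 / np.linalg.norm(n0, axis=1, keepdims=True)
    n1 = n1 / np.linalg.norm(n1, axis=1, keepdims=True)
    p0 = c0 + radius_mm * n0
    p1 = c1 + radius_mm * n1
    return np.linalg.norm(p0 - p1, axis=1)
```

With parallel gaze directions the result equals the distance between the eye centers; as the eyes converge on a near target it becomes smaller.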

alirezad commented 5 years ago

@peteratBHVI Thanks for sharing your results!

Could you comment on the robustness of your distance estimation? E.g. the following paper mentions that evoking pupil constriction/dilation by changing illumination throws off vergence estimation. https://www.sciencedirect.com/science/article/pii/S0042698919300070 I also wonder if the accuracy decays over time, or if it gets thrown off if the headset moves a bit or if it is taken off and put back on.

I appreciate any pointers. I need to estimate distances up to 2m and I am trying to decide if the pupil labs headset would cut it for me before I purchase them.

zhimin-wang commented 4 years ago

@peteratBHVI Hello, you obtained good results for depth measurements by modifying the eye rotation center distance (ERCD) and pupil distance (PD). I understand that the 3D eyeball model is built from multiple observations of projected pupil contours, and that the position of the pupil center is calculated relative to the eyeball center. But how can we modify the ERCD and PD? Could you share more examples or an implementation to make it clearer? Looking forward to your reply. Thank you.

peteratBHVI commented 4 years ago

hi @962557346,

I just post-process the exported pupil positions. At a given time in my recordings I know the object distance, and from there I adjust the PD and, with it, all other data. This is quite volatile and changes during long-term recordings. The calculations start from the ERC, adding the normalized gaze direction vector times 10.39, a standard value from PL (from memory). From there, with the given object distance, you can apply a correction to the pupil distance. Please let me know if you have any further questions. Cheers Peter
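One possible reading of this procedure, written as a sketch rather than as Peter's actual implementation: assume symmetric fixation straight ahead, so that each eye rotates inward by the same angle, derive the PD expected at the known object distance, and apply the difference to the measured PD as a constant offset. The symmetric geometry, the constant offset and all function names are assumptions; the default radius of 12 mm is taken from the earlier comment.

```python
import math

def expected_pd(distance_mm, ercd_mm, radius_mm=12.0):
    """PD expected for a target at distance_mm from the eye-center baseline."""
    alpha = math.atan((ercd_mm / 2.0) / distance_mm)  # inward rotation per eye
    return ercd_mm - 2.0 * radius_mm * math.sin(alpha)

def distance_from_pd(pd_mm, ercd_mm, radius_mm=12.0):
    """Fixation distance implied by a PD value (inverse of expected_pd)."""
    s = (ercd_mm - pd_mm) / (2.0 * radius_mm)  # sine of the inward rotation
    if s <= 0.0:
        return math.inf                        # parallel or diverging gaze
    alpha = math.asin(min(s, 1.0))
    return (ercd_mm / 2.0) / math.tan(alpha)

def pd_offset_from_reference(measured_pd_mm, known_distance_mm, ercd_mm,
                             radius_mm=12.0):
    """Offset to add to measured PD so it matches the known distance."""
    return expected_pd(known_distance_mm, ercd_mm, radius_mm) - measured_pd_mm
```

Usage would be to compute the offset once at a moment with a known fixation distance, add it to every measured PD, and pass the result to `distance_from_pd`. Since the correction is described as volatile, the offset would need to be re-estimated regularly.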

peteratBHVI commented 4 years ago

> @peteratBHVI Thanks for sharing your results!
>
> Could you comment on the robustness of your distance estimation? E.g. the following paper mentions that evoking pupil constriction/dilation by changing illumination throws off vergence estimation. https://www.sciencedirect.com/science/article/pii/S0042698919300070 I also wonder if the accuracy decays over time, or if it gets thrown off if the headset moves a bit or if it is taken off and put back on.
>
> I appreciate any pointers. I need to estimate distances up to 2m and I am trying to decide if the pupil labs headset would cut it for me before I purchase them.

My apologies, I have only just seen that comment. Vergence measurements are very fragile, and the process I used needs to be verified frequently, I would say at least every couple of minutes. I do not believe that any eye tracker can measure it reliably.

The effect that Hooge et al. describe can be attributed to the shift of the pupil center relative to the cornea center when the dilation changes (fig. 3). Maybe an interesting read: DOI: 10.13140/RG.2.2.11249.02403. Evinger C, Shaw M, Peck C, Manning K, Baker R. Blinking and associated eye movements in humans, guinea pigs, and rabbits. J Neurophysiol. 1984

Cheers Peter

We moved away from that process and used a time of flight camera to measure distances at gaze direction.
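For anyone trying the same approach, a minimal sketch of reading a distance at the gaze position from a depth image. It assumes the depth map is already registered to the world camera image (which depends on your hardware and calibration) and that the gaze position is normalized with its origin at the bottom left; the function name and window size are hypothetical.

```python
import numpy as np

def depth_at_gaze(depth_map, norm_pos, window=5):
    """Median valid depth in a small window around the gaze position.

    depth_map: (H, W) array registered to the world image (assumption).
    norm_pos: (x, y) in [0, 1], origin at the bottom left of the image.
    Returns None if the gaze is outside the image or no valid depth exists.
    """
    depth = np.asarray(depth_map, dtype=float)
    h, w = depth.shape
    x_norm, y_norm = norm_pos
    col = int(round(x_norm * (w - 1)))
    row = int(round((1.0 - y_norm) * (h - 1)))  # flip y: image rows grow downward
    if not (0 <= col < w and 0 <= row < h):
        return None
    half = window // 2
    patch = depth[max(row - half, 0):row + half + 1,
                  max(col - half, 0):col + half + 1]
    valid = patch[np.isfinite(patch) & (patch > 0)]  # drop missing readings
    return float(np.median(valid)) if valid.size else None
```

Taking the median over a window rather than a single pixel makes the lookup less sensitive to holes in the depth image and to small gaze errors.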