Hey there @hvini, I reviewed the notebook and the codebase and am jotting down some notes here. While Linear Regression (LR) with polynomial features is a good start, I think it might miss some key aspects of eye-tracking data:
Regarding the features used for predictions (`left_iris_x` and `right_iris_x` for `point_x`, and similar for `point_y`), I think we could include more information like blink rate, head position, or even what the person might be reading (using another model, perhaps) to capture a more complete picture of eye movement and attention.
One more point I caught is about the data preparation. It looks like both the training and test sets were pulled from the same file (`_fixed_train_data.csv`), which raises a flag for potential data leakage. To get a proper evaluation, let's split the data more suitably.
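To make the concern concrete, here is a minimal sketch of the kind of split I have in mind, assuming scikit-learn and the column names I saw in the notebook (the actual schema and file path may differ):

```python
# Hypothetical sketch: evaluate on held-out samples instead of the training file.
# Column names and the CSV path are assumptions based on the notebook.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("_fixed_train_data.csv")

X = df[["left_iris_x", "right_iris_x"]]
y = df["point_x"]

# Hold out a portion of the samples so the model is never scored on data it was fitted on.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("MAE on held-out data:", mean_absolute_error(y_test, model.predict(X_test)))
```

A random split is only a first step; since consecutive frames are highly correlated, splitting by calibration session or by point would be an even fairer evaluation.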
I appreciate the solid foundation you've built, and would love to discuss more on this topic. Thanks!
Hello @LostSputnik,
The idea of the polynomial features was to make the LR "learn a curve", but the biggest problem is that the data does not behave linearly, so this approach did not help.
About the outliers, we apply a filter during data collection to avoid these values.
I believe the problem with the same data being used for train and test is just a typo we left in the code. A good solution could be to create a test dataset (from the frontend) for each number of points available; then we would add logic to the model evaluation to choose the corresponding test data based on the number of points selected on the frontend.
The main idea of the LR was to offset the iris position to the corresponding point on the calibration page. Since the TensorFlow model returns only the iris position, the position will always be at the center, so we used the LR to try to move the point from the center to the calibrated point, at top right, left, center, bottom...
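Just to illustrate the idea, here is a minimal sketch of that mapping (assuming scikit-learn and the CSV columns used in the notebook; the actual code differs in the details):

```python
# Sketch of the offset idea: learn a mapping from the (almost centered) iris
# coordinates to the known screen coordinates of each calibration point.
# Column names are assumptions based on the training CSV.
import pandas as pd
from sklearn.linear_model import LinearRegression

calib = pd.read_csv("_fixed_train_data.csv")

model_x = LinearRegression().fit(calib[["left_iris_x", "right_iris_x"]], calib["point_x"])
model_y = LinearRegression().fit(calib[["left_iris_y", "right_iris_y"]], calib["point_y"])

# The TensorFlow model only gives the iris position, which stays near the center;
# the regression shifts that reading toward the corresponding calibration point.
new_iris = pd.DataFrame({"left_iris_x": [0.49], "right_iris_x": [0.51]})
print(model_x.predict(new_iris))
```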
Do you think there are more improvements that could be made to achieve the objective?
Ah that explains quite a bit! Thanks for elaborating on the approach. I ran the calibration on my end and was able to get this:
So from what I understand, the calibration for each specific point (top left, right, etc.) is considered separately. Can you point me towards some documentation on how these point clusters are processed or handled subsequently? If we can have a discussion going through the workflow/process, that would be immensely helpful, as I'm still trying to understand how the real-time integration should work.
And regarding the limitations of the LR model, I can try exploring a number of alternatives to see how they perform. For the time being, can you please provide a sample dataset (i.e. a CSV file) that I can experiment with until I have a better grasp of the codebase?
Thanks again @hvini
Hey @hvini,
I am trying to set up the project locally, but I'm running into an issue. Could you please help me out in case I'm missing something?
https://github.com/ruxailab/web-eye-tracker/assets/109458485/74a00669-ca5b-45f8-9b8c-cf9159b7193f
OK, so I went through the frontend, linking back to the `calib_validation` API, and studied how `gaze_tracker.predict` works. I also read the generated CSV files and the API responses, and I think I have some grasp of what's happening now.
First, let's get to this:
> The main idea of the LR was to offset the iris position to the corresponding point on the calibration page
I now get what you're achieving by using the model here, but my question is: the offset positions that we are getting are basically the incorrect predictions of the model, are they not? From my theoretical understanding, these points probably don't correlate much with the actual gaze positions, as the model was trained to mimic constant-valued targets. I am not sure if I am explaining myself properly here, so please let me know if I'm wrong or ambiguous.
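To illustrate the constant-target concern with purely made-up numbers (a toy sketch, not the project's data or code):

```python
# Toy example: if the calibration target is constant within a cluster and the
# iris coordinate barely moves, the fitted line gets a ~zero slope, so the
# model outputs (roughly) the same value no matter where the user looks.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
iris_x = 0.50 + rng.normal(scale=0.001, size=(100, 1))  # almost constant feature
point_x = np.full(100, 800.0)                           # constant calibration target

model = LinearRegression().fit(iris_x, point_x)
print(model.coef_, model.intercept_)      # slope ~0, intercept ~800
print(model.predict([[0.40], [0.60]]))    # both ~800, even for very different inputs
```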
I thought about improvements to this, and this is the prototype idea that came to mind, based on the following two observations:
Combining these two, we can use a model to generate the gaze points in each cluster based on the movement patterns and the gaze directions, which might be a more accurate representation of the gaze points.
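A very rough sketch of the direction I'm imagining — the feature names, the frame-delta idea, and the choice of regressor are all my own assumptions, not something already in the repo:

```python
# Assumption-heavy prototype: combine raw iris coordinates with frame-to-frame
# movement (deltas) so the model has more signal to separate calibration clusters.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("_fixed_train_data.csv")

# Movement features: how the iris shifted since the previous frame.
df["delta_left_x"] = df["left_iris_x"].diff().fillna(0.0)
df["delta_right_x"] = df["right_iris_x"].diff().fillna(0.0)

features = ["left_iris_x", "right_iris_x", "delta_left_x", "delta_right_x"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["point_x"], test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```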
We can also use this to get the heatmaps, as well as real-time gaze detection. Let me know what you think and if its aligned with the project objective. Thanks @hvini and @KarinePistili
Hello @sushant4612,
I tried to start the project from scratch and also had the same problem. It seems that one of the requirements doesn't work well on the latest version of Python.
To solve this, I updated the requirements to keep only the needed ones. Could you try to run it again with the new requirements?
Thanks for your response, @hvini. It's working now.
@LostSputnik
Yes. The points we get on the results page are the LR-predicted points. Indeed, they don't correlate much because the collected points have almost the same position (unless you move your head abruptly), so with the collected point positions alone we cannot get good results.
Your idea seems good; maybe joining the iris and gaze positions could help the model differentiate each cluster, and then we could have the heatmaps and real-time detection.
Thanks for the confirmation, @hvini! This gives us a solid foundation to explore new models and approaches. I'll dive into eye-tracking research and see how others have solved similar problems.
Now, instead of waiting to show the eye-tracking results after the calibration is finished, we want to see them plotted live during the calibration process, right? Here's a tentative breakdown of what I have in mind:
I'm eager to collaborate on these improvements and see how they affect our results! Let me know what you think.
Yes, that's it.
It seems like a good approach. You can go ahead with this 👍
Thanks a bunch. I'll draft a proposal for GSoC based on this.
Review the linear regression implementation in the `app/services/calib_validation/test.ipynb` file and create a document saying why it is (or is not) a good choice for an eye tracker implementation.