ruxailab / web-eye-tracker

MIT License

Review linear regression implementation #11

Closed hvini closed 3 months ago

hvini commented 3 months ago

Review the linear regression implementation in the app/services/calib_validation/test.ipynb file and write a document explaining why it is (or is not) a good choice for an eye tracker implementation.

LostSputnik commented 3 months ago

Hey there @hvini, I reviewed the notebook and the codebase and am jotting down some notes here. While Linear Regression (LR) with polynomial features is a good start, I think it might miss some key aspects of eye tracking data:

  1. Temporal dynamics: Linear regression doesn't account for the temporal nature of eye tracking data, so it misses how past positions can influence future ones. This is where models designed for time series data, such as RNNs or LSTMs, might be worth exploring.
  2. Errors and outliers: LR assumes that errors are normally distributed with constant variance across all data points. In eye tracking, however, you often run into outliers (such as sudden blinks or quick movements) that violate this assumption, which could lead the model to make biased or inaccurate predictions.
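
To make the outlier point concrete, here is a minimal sketch (not taken from the notebook) of flagging blink-like samples with a median absolute deviation (MAD) test before fitting. The function name, the 3.5 threshold, and the sample values are illustrative assumptions.

```python
from statistics import median

def mad_outlier_mask(values, threshold=3.5):
    """Return a list of booleans, True where a sample looks like an outlier.

    Uses the modified z-score based on the median absolute deviation (MAD).
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half of the samples equal the median, so the score
        # is undefined; flag nothing rather than divide by zero.
        return [False] * len(values)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [0.6745 * d / mad > threshold for d in abs_dev]

# Hypothetical iris x-coordinates; 900.0 stands in for a blink artefact.
iris_x = [312.0, 313.5, 311.8, 312.4, 900.0, 312.9, 313.1]
mask = mad_outlier_mask(iris_x)
clean = [v for v, is_outlier in zip(iris_x, mask) if not is_outlier]
```

Median-based scores are used here because a single extreme sample barely moves the median, whereas it would inflate a mean and standard deviation.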

Regarding the features used for predictions (left_iris_x and right_iris_x for point_x, and similar for point_y), I think we could include more information like blink rate, head position, or even what the person might be reading (using another model, perhaps) to capture a more complete picture of eye movement and attention.

One more point I caught is about the data preparation. It looks like both the training and test sets were pulled from the same file (_fixed_train_data.csv), which raises a flag for potential data leakage. To get a proper evaluation, the model should be tested on data it was not trained on.
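
As a rough sketch of what a disjoint split could look like (plain Python, independent of how the notebook loads its CSV), the helper below shuffles rows with a fixed seed and holds out a fraction for testing. The helper name, the 20% fraction, and the generated rows are assumptions for illustration; only the column names come from the notebook.

```python
import random

def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle rows and split them into disjoint train and test lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Made-up rows using the column names from the notebook.
rows = [{"left_iris_x": 300 + i, "point_x": 100 * (i % 3)} for i in range(10)]
train, test = split_rows(rows)
```

A random row split is only the simplest option; it guarantees that no row is evaluated after being trained on, but rows recorded in the same session can still be very similar to each other.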

I appreciate the solid foundation you've built, and would love to discuss more on this topic. Thanks!

Best regards, Tahsen Islam Sajon

hvini commented 3 months ago

Hello @LostSputnik,

The idea of polynomial features was to let the LR "learn a curve", but the biggest problem is that the data do not behave linearly, so this approach did not help.

About the outliers, we apply a filter during data collection to avoid these values.

I believe the same data being used for training and testing is just a typo we left in the code. A good solution could be to create a test dataset (from the frontend) for each number of points available, and then add logic to the model evaluation that chooses the corresponding test data based on the number of points selected on the frontend.

The main idea of the LR was to offset the iris position to the corresponding point on the calibration page. Since the TensorFlow model returns only the iris position, the position will always be at the center, so we used the LR to try to move the point from the center to the calibrated point: top right, left, center, bottom, and so on.
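
To illustrate the offset idea, here is a simplified one-feature sketch (the notebook itself uses both iris coordinates with polynomial features, so this is not the actual implementation). It fits a line that maps iris x in camera pixels to the calibration target x in screen pixels. All numbers and function names are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical samples: iris x (camera pixels) and target x (screen pixels).
iris_x = [310.0, 312.0, 314.0, 316.0, 318.0]
point_x = [100.0, 500.0, 900.0, 1300.0, 1700.0]
slope, intercept = fit_line(iris_x, point_x)

def predict(x):
    """Map an iris x-coordinate to a screen x-coordinate."""
    return slope * x + intercept
```

In this made-up data a 2-pixel iris shift corresponds to 400 screen pixels, which shows why small noise in the iris position gets amplified on screen.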

Do you think there are more improvements that could be made to achieve the objective?

LostSputnik commented 3 months ago

Ah that explains quite a bit! Thanks for elaborating on the approach. I ran the calibration on my end and was able to get this:

[Screenshot from 2024-03-12 15-20-55]

So from what I understand, the calibration for each specific point (top left, right, etc.) is considered separately. Can you point me towards some documentation on how these point clusters are processed or handled subsequently? A discussion walking through the workflow would be immensely helpful, as I'm still trying to understand how the real-time integration should work.

And regarding the limitations of the LR model, I can try exploring a number of alternatives to see how they perform. For the time being, can you please provide a sample dataset (i.e. a CSV file) that I can experiment with until I have a better grasp of the codebase?

Thanks again @hvini

sushant4612 commented 3 months ago

Hey @hvini,

I am trying to set up the project locally, but I'm facing an issue. Could you please help me out with it if I am missing something?

LostSputnik commented 3 months ago

OK, so I went through the frontend, traced it back to the calib_validation API, and studied how gaze_tracker.predict works. I also read the generated CSV files and the API responses, and I think I have some grasp of what's happening now.

First, let's get to this:

> The main idea of LR was to offset the iris position to the correspondent point on calibration page

I now get what you're trying to achieve by using the model here, but my question is: the offset positions that we are getting are basically the incorrect predictions of the model, are they not? From my theoretical understanding, these points probably don't correlate much with the actual gaze positions, as the model was trained to mimic constant-valued targets. I am not sure if I am explaining myself properly here, so please let me know if I'm wrong or ambiguous.

I thought about the improvements to this and this is the prototype idea that came to mind, based on the following two observations:

  1. It's true that the TensorFlow model output always revolves around the center position, since that's where the subject's head is, but there are iris movements in there.
  2. After some googling I found that it is possible to find the gaze direction alongside the position of the pupils.

Combining these two, we can use a model to generate the gaze points in each cluster based on the movement patterns and the gaze directions. This might be a more accurate representation of the gaze points in each cluster.

We can also use this to get the heatmaps, as well as real-time gaze detection. Let me know what you think and whether it's aligned with the project objective. Thanks @hvini and @KarinePistili

hvini commented 3 months ago

hello @sushant4612,

I tried to start the project from scratch and had the same problem. It seems that one of the requirements doesn't work well with the latest version of Python.

To solve this, I updated the requirements to keep only the ones that are needed. Could you try running it again with the new requirements?

sushant4612 commented 3 months ago

Thanks for your response, @hvini. It's working now.

hvini commented 3 months ago


Yes. The points we get on the result page are the LR-predicted points. Indeed, they don't correlate much, because the collected points have almost the same position (unless you move your head abruptly), so with the collected point positions alone we cannot get good results.
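
A tiny synthetic illustration of why near-constant inputs give poor results (made-up numbers, not project data): for a one-feature least-squares fit, the spread of the predictions equals |r| times the spread of the targets, where r is the Pearson correlation between input and target. When r is close to zero, every prediction lands near the mean of the targets. The two iris sequences below are deliberately extreme assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up calibration targets (screen x) for three clusters, two samples each.
point_x = [100.0, 100.0, 900.0, 900.0, 1700.0, 1700.0]

# Case 1: head still, iris jitter unrelated to the target.
iris_still = [311.9, 312.1, 311.9, 312.1, 311.9, 312.1]
# Case 2: iris position shifts consistently with the target.
iris_tracking = [310.0, 310.2, 314.0, 314.2, 318.0, 318.2]

r_still = pearson_r(iris_still, point_x)        # close to 0
r_tracking = pearson_r(iris_tracking, point_x)  # close to 1
```

In the first case a linear fit can do no better than predict roughly the same screen position for every cluster; in the second the clusters become separable.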

Your idea seems good. Maybe joining iris and gaze position could help the model differentiate each cluster, and then we could have the heatmaps and real-time detection.

LostSputnik commented 3 months ago

Thanks for the confirmation, @hvini! This gives us a solid foundation to explore new models and approaches. I'll dive into eye-tracking research and see how others have solved similar problems.

Now, instead of waiting to show the eye-tracking results after the calibration is finished, we want to see them plotted live during the calibration process, right? Here's a tentative breakdown of what I have in mind:

  1. Refine Feature Engineering: Combine the TensorFlow model's iris position output with derived gaze direction features.
  2. Experiment with Different Models: Explore models well-suited for complex, non-linear patterns and potentially temporal features.
  3. Data Validation & Retrain: Before retraining, it would be helpful to visualize and analyze the distribution of our new features to identify potential outliers or unexpected patterns.
  4. Real-time Visualization: Implement a way to display eye-tracking data (even a simple gaze point) during calibration, and other possible metrics.
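
As a very rough sketch of step 1, one possible gaze-direction proxy is the iris position expressed as a fraction of the eye width, which stays the same when the whole head translates in the image. This is my own stand-in, not a method confirmed for this project: it assumes eye-corner landmarks are available alongside the iris position, and the function name and coordinates are hypothetical.

```python
def eye_features(iris, inner_corner, outer_corner):
    """Return [iris_x, iris_y, horizontal_ratio] for one eye.

    horizontal_ratio is the projection of the iris onto the line from the
    inner to the outer eye corner, as a fraction of the eye width
    (0 at the inner corner, 1 at the outer corner).
    """
    ix, iy = iris
    ax, ay = inner_corner
    bx, by = outer_corner
    dx, dy = bx - ax, by - ay
    width_sq = dx * dx + dy * dy
    if width_sq == 0:
        raise ValueError("eye corners coincide")
    ratio = ((ix - ax) * dx + (iy - ay) * dy) / width_sq
    return [ix, iy, ratio]

# Hypothetical landmark coordinates in camera pixels.
features = eye_features((105.0, 50.0), (100.0, 50.0), (120.0, 50.0))

# Shifting every landmark by the same amount leaves the ratio unchanged.
shifted = eye_features((135.0, 70.0), (130.0, 70.0), (150.0, 70.0))
```

Keeping the raw iris coordinates next to the ratio lets a model use both the absolute position and the head-independent component.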

I'm eager to collaborate on these improvements and see how they improve our results! Let me know what you think.

hvini commented 3 months ago

Yes, that's it.

It seems like a good approach. You can go ahead with this 👍

LostSputnik commented 3 months ago

Thanks a bunch. I'll draft a proposal for GSoC based on this.