SVD did not converge in Linear Least Squares for 60 Hz data with many NaN values

Hello, I hope this message finds you well. I recently attempted to apply your methodology to our 60 Hz dataset. Our dataset, which was recorded from children, unfortunately contains a large number of NaN values.

During implementation, I encountered the following error: "LinAlgError: SVD did not converge in Linear Least Squares." Though I noticed that there are provisions for processing NaN values, I still wonder if these NaN values might be the root cause of the convergence issue.

To address this, I made sure that the number of valid data points exceeds a threshold of 5, but the error persisted. I have also attached the data file for your reference.
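
One way to quantify how fragmented the signal is before classification is to measure the share of NaN samples and the lengths of the uninterrupted valid runs. The sketch below is illustrative only: the helper names and the sample values are made up, and it does not reproduce remodnav's own NaN handling.

```python
import math


def nan_fraction(values):
    """Return the share of NaN entries in a sequence of floats (0.0 if empty)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(math.isnan(v) for v in values) / len(values)


def valid_run_lengths(values):
    """Return the lengths of consecutive non-NaN runs, in order of appearance."""
    runs, current = [], 0
    for v in values:
        if math.isnan(v):
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


# Hypothetical gaze samples; NaN marks signal loss.
nan = float("nan")
sample_x = [nan, nan, 1.0, 2.0, nan, 3.0, 4.0, 5.0]
print(nan_fraction(sample_x), valid_run_lengths(sample_x))
```

With the attached file, the same helpers could be applied to the x and y columns after loading them with pandas. At 60 Hz, a savgol_length of 0.05 s spans only about 3 samples, so very short valid runs may be worth inspecting first.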

Any insights or suggestions you could provide to overcome this challenge would be greatly appreciated. Thank you for your attention to this matter.


Here are the parameters we used for remodnav. The screen is 10.5 inches (16:10, with a resolution of 2560x1600), and the viewing distance is approximately 50 cm.

import pandas as pd
import remodnav

filtered_move = pd.read_csv('test.csv')
eye_clf = remodnav.clf.EyegazeClassifier(remodnav.clf.deg_per_pixel(0.2262,0.5,2560), 60, min_saccade_duration=0.1667)
eye_clf.preproc(filtered_move, savgol_length=0.05)
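
The 0.2262 passed to deg_per_pixel appears to be the screen width in meters. As a sanity check, the sketch below derives that width from the 10.5-inch diagonal and the 2560x1600 resolution, then estimates degrees of visual angle per pixel with a standard half-angle formula. The function names are hypothetical, and the formula is an assumption about what such a conversion typically does, not a copy of remodnav's implementation.

```python
import math


def screen_width_m(diagonal_inch, res_x, res_y):
    """Physical screen width in meters from the diagonal and the pixel resolution."""
    diagonal_m = diagonal_inch * 0.0254  # inches to meters
    return diagonal_m * res_x / math.hypot(res_x, res_y)


def approx_deg_per_pixel(width_m, distance_m, res_x):
    """Approximate visual angle per pixel (assumed half-angle formula)."""
    half_angle = math.degrees(math.atan2(0.5 * width_m, distance_m))
    return half_angle / (0.5 * res_x)


width = screen_width_m(10.5, 2560, 1600)
px2deg = approx_deg_per_pixel(width, 0.5, 2560)
print(f"width = {width:.4f} m, px2deg = {px2deg:.5f} deg/px")
```

The computed width comes out at roughly 0.2262 m, which matches the value used in the call above for a 16:10 panel.
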

Hey, thanks for the detailed issue and the data! I tried reproducing the error you are seeing, but things seem to work for me. Here is what I did:

In [4]: filtered_move = pd.read_csv('test(1).csv')

In [5]: filtered_move
     Unnamed: 0           x           y      t
0             0         NaN         NaN  20912
1             1         NaN         NaN  20929
2             2         NaN         NaN  20945
3             3         NaN         NaN  20962
4             4         NaN         NaN  20979
..          ...         ...         ...    ...
567         567  631.547639  873.589568  30362
568         568  637.170350  853.189800  30379
569         569  640.885495  839.832655  30395
570         570  644.620697  827.325742  30412
571         571  648.052845  817.226048  30429

[572 rows x 4 columns]

In [6]: filtered_move[['x', 'y']]
              x           y
0           NaN         NaN
1           NaN         NaN
2           NaN         NaN
3           NaN         NaN
4           NaN         NaN
..          ...         ...
567  631.547639  873.589568
568  637.170350  853.189800
569  640.885495  839.832655
570  644.620697  827.325742
571  648.052845  817.226048

[572 rows x 2 columns]

In [7]: filtered_move[['x', 'y']].to_csv('tabsep_test.tsv', sep='\t', index=False, header=False)

Could you share the software versions of numpy, scipy, and statsmodels you have installed?
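
If it helps, a small sketch like the following can collect those versions in one go. It relies only on the standard library's importlib.metadata; the helper name installed_versions is made up for this example.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["numpy", "scipy", "statsmodels"]).items():
    print(f"{name}=={version}" if version else f"{name}: not installed")
```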

Thank you very much for your detailed reply. However, it still didn't work even though I followed all the steps. Here are the versions of my packages: numpy==1.23.5, scipy==1.9.1, statsmodels==0.14.0. I'll create a new environment with your software versions and try again.

Fortunately, I get the same output as you after upgrading these three packages. However, I also get the same warning as you. Does it matter?

Glad that it works! The warning per se isn't necessarily a bad sign; it's likely an internal attempt to divide by zero or NaN (which should be handled in the code). Now that you have results, though, you should check closely whether they look plausible given your data and paradigm. You could, e.g., plot the results with the show_gaze function (see for an example). Keep in mind that this algorithm was not validated for data with low sampling rates, and we as authors have no experience with such sampling rates ourselves. :) Good luck!
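
As a rough complement to plotting, a plausibility check could look at raw sample-to-sample velocities. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the px2deg value of 0.00996 is an approximation for the setup described in this thread, the coordinates are rounded sample values, and the 1000 deg/s ceiling is an arbitrary illustrative cutoff rather than a validated criterion.

```python
import math


def sample_velocities(x, y, px2deg, sampling_rate):
    """Sample-to-sample gaze velocity in deg/s; NaN where a neighbour is missing."""
    return [
        math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) * px2deg * sampling_rate
        for i in range(1, len(x))
    ]


def count_implausible(velocities, ceiling=1000.0):
    """Count non-NaN velocities above an assumed physiological ceiling (deg/s)."""
    return sum(1 for v in velocities if not math.isnan(v) and v > ceiling)


# Illustrative pixel coordinates; NaN marks signal loss.
nan = float("nan")
x = [nan, 631.5, 637.2, 640.9]
y = [nan, 873.6, 853.2, 839.8]
vels = sample_velocities(x, y, px2deg=0.00996, sampling_rate=60)
print(vels, count_implausible(vels))
```

At 60 Hz, consecutive samples are about 16.7 ms apart, so a short saccade may be covered by only one or two samples; this is worth keeping in mind when judging such velocities.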

