psychoinformatics-de / remodnav

Robust Eye Movement Detection for Natural Viewing

SVD did not converge in Linear Least Squares for 60Hz data with a lot of nan #52

Closed dishangti closed 6 months ago

dishangti commented 6 months ago

Hello, I hope this message finds you well. I recently attempted to apply your methodology to our 60Hz dataset. Our dataset, which incorporates data from children, unfortunately contains a significant number of NaN values.

During implementation, I encountered the following error: "LinAlgError: SVD did not converge in Linear Least Squares." Though I noticed that there are provisions for processing NaN values, I still wonder if these NaN values might be the root cause of the convergence issue.

To address this, I ensured that the number of valid data points exceeds a threshold of 5, but that did not resolve the error. Additionally, I have attached the data file for your reference.

Any insights or suggestions you could provide to overcome this challenge would be greatly appreciated. Thank you for your attention to this matter.
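A minimal sketch of why NaNs can surface this error (synthetic data, not the attached file; it follows the traceback through scipy's default `mode='interp'`, where edge samples are fit with `np.polyfit`):

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic trace with a NaN gap in the middle, away from the edges.
x = np.sin(np.linspace(0, 4 * np.pi, 120))
x[50:60] = np.nan

# NaNs propagate through the filter's convolution: the output is NaN
# around the gap but finite elsewhere, and no error is raised.
y = savgol_filter(x, window_length=5, polyorder=2)
```

If the NaN run instead overlaps the first or last `window_length` samples (as in data that begins with NaNs), those NaNs reach `np.polyfit` in the edge-fitting step and can raise exactly this "SVD did not converge" error, so trimming or interpolating leading/trailing NaNs is worth trying.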

test.csv


{
    "name": "LinAlgError",
    "message": "SVD did not converge in Linear Least Squares",
    "stack": "---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
d:\\dst\\Repos\\ASD-EyeTrack-RGB\\stats\\Untitled.ipynb Cell 8 line 2
     <a href='vscode-notebook-cell:/d%3A/dst/Repos/ASD-EyeTrack-RGB/stats/Untitled.ipynb#X10sZmlsZQ%3D%3D?line=24'>25</a> filtered_move = pd.DataFrame(eye_move, columns=['x', 'y', 't'])
     <a href='vscode-notebook-cell:/d%3A/dst/Repos/ASD-EyeTrack-RGB/stats/Untitled.ipynb#X10sZmlsZQ%3D%3D?line=25'>26</a> filtered_move.to_csv('test.csv')
---> <a href='vscode-notebook-cell:/d%3A/dst/Repos/ASD-EyeTrack-RGB/stats/Untitled.ipynb#X10sZmlsZQ%3D%3D?line=26'>27</a> eye_clf.preproc(filtered_move, savgol_length=0.05)

File d:\\ProgramData\\Anaconda3\\envs\\torch\\lib\\site-packages\\remodnav\\clf.py:863, in EyegazeClassifier.preproc(self, data, min_blink_duration, dilate_nan, median_filter_length, savgol_length, savgol_polyord, max_vel)
    859     lgr.info(
    860         'Smooth coordinates with Savitzy-Golay filter (len=%i, ord=%i)',
    861         savgol_length, savgol_polyord)
    862     for i in ('x', 'y'):
--> 863         data[i] = savgol_filter(data[i], savgol_length, savgol_polyord)
    865 # velocity calculation, exclude velocities over `max_vel`
    866 # no entry for first datapoint!
    867 velocities = self._get_velocities(data)

File ~\\AppData\\Roaming\\Python\\Python38\\site-packages\\scipy\\signal\\_savitzky_golay.py:351, in savgol_filter(x, window_length, polyorder, deriv, delta, axis, mode, cval)
    347     # Do not pad. Instead, for the elements within `window_length // 2`
    348     # of the ends of the sequence, use the polynomial that is fitted to
    349     # the last `window_length` elements.
    350     y = convolve1d(x, coeffs, axis=axis, mode=\"constant\")
--> 351     _fit_edges_polyfit(x, window_length, polyorder, deriv, delta, axis, y)
    352 else:
    353     # Any mode other than 'interp' is passed on to ndimage.convolve1d.
    354     y = convolve1d(x, coeffs, axis=axis, mode=mode, cval=cval)

File ~\\AppData\\Roaming\\Python\\Python38\\site-packages\\scipy\\signal\\_savitzky_golay.py:223, in _fit_edges_polyfit(x, window_length, polyorder, deriv, delta, axis, y)
    216 \"\"\"
    217 Use polynomial interpolation of x at the low and high ends of the axis
    218 to fill in the halflen values in y.
    219 
    220 This function just calls _fit_edge twice, once for each end of the axis.
    221 \"\"\"
    222 halflen = window_length // 2
--> 223 _fit_edge(x, 0, window_length, 0, halflen, axis,
    224           polyorder, deriv, delta, y)
    225 n = x.shape[axis]
    226 _fit_edge(x, n - window_length, n, n - halflen, n, axis,
    227           polyorder, deriv, delta, y)

File ~\\AppData\\Roaming\\Python\\Python38\\site-packages\\scipy\\signal\\_savitzky_golay.py:193, in _fit_edge(x, window_start, window_stop, interp_start, interp_stop, axis, polyorder, deriv, delta, y)
    189 xx_edge = xx_edge.reshape(xx_edge.shape[0], -1)
    191 # Fit the edges.  poly_coeffs has shape (polyorder + 1, -1),
    192 # where '-1' is the same as in xx_edge.
--> 193 poly_coeffs = np.polyfit(np.arange(0, window_stop - window_start),
    194                          xx_edge, polyorder)
    196 if deriv > 0:
    197     poly_coeffs = _polyder(poly_coeffs, deriv)

File <__array_function__ internals>:180, in polyfit(*args, **kwargs)

File d:\\ProgramData\\Anaconda3\\envs\\torch\\lib\\site-packages\\numpy\\lib\\polynomial.py:668, in polyfit(x, y, deg, rcond, full, w, cov)
    666 scale = NX.sqrt((lhs*lhs).sum(axis=0))
    667 lhs /= scale
--> 668 c, resids, rank, s = lstsq(lhs, rhs, rcond)
    669 c = (c.T/scale).T  # broadcast scale coefficients
    671 # warn on rank reduction, which indicates an ill conditioned matrix

File <__array_function__ internals>:180, in lstsq(*args, **kwargs)

File d:\\ProgramData\\Anaconda3\\envs\\torch\\lib\\site-packages\\numpy\\linalg\\linalg.py:2300, in lstsq(a, b, rcond)
   2297 if n_rhs == 0:
   2298     # lapack can't handle n_rhs = 0 - so allocate the array one larger in that axis
   2299     b = zeros(b.shape[:-2] + (m, n_rhs + 1), dtype=b.dtype)
-> 2300 x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
   2301 if m == 0:
   2302     x[...] = 0

File d:\\ProgramData\\Anaconda3\\envs\\torch\\lib\\site-packages\\numpy\\linalg\\linalg.py:101, in _raise_linalgerror_lstsq(err, flag)
    100 def _raise_linalgerror_lstsq(err, flag):
--> 101     raise LinAlgError(\"SVD did not converge in Linear Least Squares\")

LinAlgError: SVD did not converge in Linear Least Squares"
}
dishangti commented 6 months ago

Here are the parameters we used for remodnav: a 10.5-inch screen (16:10, with a resolution of 2560x1600) at a viewing distance of approximately 50 cm.


```python
filtered_move = pd.read_csv('test.csv')
eye_clf = remodnav.clf.EyegazeClassifier(
    remodnav.clf.deg_per_pixel(0.2262, 0.5, 2560), 60,
    min_saccade_duration=0.1667)
eye_clf.preproc(filtered_move, savgol_length=0.05)
```
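For context, the first argument is a degrees-per-pixel conversion factor; its geometry can be sketched independently of remodnav (the formula below is re-derived from screen geometry as an assumption, not copied from clf.py, so check the project source for the exact definition):

```python
from math import atan2, degrees

def deg_per_pixel(screen_size_m, viewing_distance_m, screen_resolution_px):
    """Approximate degrees of visual angle covered by one pixel
    (half-screen visual angle divided by half the pixel count)."""
    half_angle = degrees(atan2(0.5 * screen_size_m, viewing_distance_m))
    return half_angle / (0.5 * screen_resolution_px)

# 0.2262 m wide screen, 0.5 m viewing distance, 2560 px horizontal resolution
px2deg = deg_per_pixel(0.2262, 0.5, 2560)  # roughly 0.01 deg per pixel
```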
adswa commented 6 months ago

Hey, thanks for the detailed issue and the data! I tried reproducing the error you are seeing, but things seem to work for me. Here is what I did:

In [4]: filtered_move = pd.read_csv('test(1).csv')

In [5]: filtered_move
Out[5]: 
     Unnamed: 0           x           y      t
0             0         NaN         NaN  20912
1             1         NaN         NaN  20929
2             2         NaN         NaN  20945
3             3         NaN         NaN  20962
4             4         NaN         NaN  20979
..          ...         ...         ...    ...
567         567  631.547639  873.589568  30362
568         568  637.170350  853.189800  30379
569         569  640.885495  839.832655  30395
570         570  644.620697  827.325742  30412
571         571  648.052845  817.226048  30429

[572 rows x 4 columns]

In [6]: filtered_move[['x', 'y']]
Out[6]: 
              x           y
0           NaN         NaN
1           NaN         NaN
2           NaN         NaN
3           NaN         NaN
4           NaN         NaN
..          ...         ...
567  631.547639  873.589568
568  637.170350  853.189800
569  640.885495  839.832655
570  644.620697  827.325742
571  648.052845  817.226048

[572 rows x 2 columns]

In [7]: filtered_move[['x', 'y']].to_csv('tabsep_test.tsv', sep='\t', index=False, header=False)
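Before running preproc on data like this, it can also help to quantify how much of the trace is missing; a generic pandas sketch (the tiny frame below is illustrative, mimicking the x/y/t structure of test.csv, not the real data):

```python
import numpy as np
import pandas as pd

# Illustrative frame mimicking the structure of test.csv (not the real data).
df = pd.DataFrame({
    'x': [np.nan, np.nan, 631.5, 637.2],
    'y': [np.nan, np.nan, 873.6, 853.2],
    't': [20912, 20929, 30362, 30379],
})

# Per-column fraction of NaN samples; large values suggest the signal
# may be too sparse for filtering to behave well.
nan_fraction = df[['x', 'y']].isna().mean()
```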

Could you share the software versions of numpy, scipy, and statsmodels you have installed?

Software versions I have:

```
❱ pip freeze
annexremote==1.6.0
asttokens==2.4.1
bleach==6.0.0
boto==2.49.0
certifi==2022.12.7
cffi==1.15.1
chardet==5.1.0
charset-normalizer==3.1.0
contourpy==1.0.7
coverage==7.2.4
cryptography==40.0.2
cycler==0.11.0
datalad==0.18.3
decorator==5.1.1
distro==1.8.0
docutils==0.19
executing==2.0.1
fasteners==0.18
fonttools==4.39.3
humanize==4.6.0
idna==3.4
importlib-metadata==6.6.0
iniconfig==2.0.0
ipython==8.22.1
iso8601==1.1.0
jaraco.classes==3.2.3
jedi==0.19.1
jeepney==0.8.0
keyring==23.13.1
keyrings.alt==4.2.0
kiwisolver==1.4.4
looseversion==1.1.2
markdown-it-py==2.2.0
matplotlib==3.7.1
matplotlib-inline==0.1.6
mdurl==0.1.2
more-itertools==9.1.0
msgpack==1.0.5
numpy==1.24.3
packaging==23.1
pandas==2.0.1
pandoc==2.3
parso==0.8.3
patool==1.12
patsy==0.5.3
pexpect==4.9.0
Pillow==9.5.0
pkginfo==1.9.6
platformdirs==3.5.0
pluggy==1.0.0
plumbum==1.8.1
ply==3.11
prompt-toolkit==3.0.43
ptyprocess==0.7.0
pure-eval==0.2.2
pycparser==2.21
Pygments==2.15.1
pyparsing==3.0.9
pytest==7.3.1
pytest-cov==2.5.1
python-dateutil==2.8.2
python-gitlab==3.14.0
pytz==2023.3
readme-renderer==37.3
remodnav==1.1.2
requests==2.29.0
requests-toolbelt==0.10.1
rfc3986==2.0.0
rich==13.3.5
scipy==1.10.1
SecretStorage==3.3.3
six==1.16.0
stack-data==0.6.3
statsmodels==0.13.5
tqdm==4.65.0
traitlets==5.14.1
twine==4.0.2
tzdata==2023.3
urllib3==1.26.15
wcwidth==0.2.13
webencodings==0.5.1
zipp==3.15.0
```
dishangti commented 6 months ago

Thank you very much for your detailed reply. However, it still did not work even though I followed all the steps. Here are the versions of my packages: numpy==1.23.5, scipy==1.9.1, statsmodels==0.14.0. I'll create a new environment with your software versions and try again.

dishangti commented 6 months ago

Fortunately, after upgrading these three packages I now get the same output as you do. But I also see the same warning; does it matter?

adswa commented 6 months ago

Glad that it works! The warning per se isn't necessarily a bad sign; it's likely an internal attempt to divide by zero or by NaN (which should be handled in the code). But now that you have results, you should closely investigate whether they look plausible given your data and paradigm. You could, e.g., plot the results with the show_gaze function (see https://github.com/psychoinformatics-de/remodnav/issues/28#issuecomment-913737832 for an example). Keep in mind that this algorithm was not validated for data with low sampling rates, and we as authors have no experience with such sampling rates ourselves. :) Good luck!
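To illustrate the point about the warning: numpy flags division by zero or invalid operations with a RuntimeWarning but still returns inf/NaN values that calling code can handle afterwards (generic numpy behaviour, not remodnav internals):

```python
import warnings
import numpy as np

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # 1/0 warns "divide by zero" (-> inf); 0/0 warns "invalid value" (-> nan)
    result = np.array([1.0, 0.0]) / np.array([0.0, 0.0])

# The computation still completes; the special values just need handling
# downstream, which is why the warning alone is not fatal.
```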

I'm closing this issue as resolved, but please feel free to reopen it or open a new one if you disagree. :)