nbara / python-meegkit

🔧🧠 MEEGkit: MEG & EEG processing toolkit in Python
https://nbara.github.io/python-meegkit/
BSD 3-Clause "New" or "Revised" License
186 stars 51 forks source link

[ENH] RESS implem sklearn interface #62

Closed ludovicdmt closed 1 year ago

ludovicdmt commented 2 years ago

Hello,

I was recently asked to adapt your RESS implementation so it can be compatible with:

I basically used your implem and just simplified some parts of it: removed the fold/unfold/demean function to use vectorization from Numpy, use sklearn to compute empirical covariance matrices (can be easily adapted to use another covariance matrix estimation method with regul). I compared the two implementations on randomly generated data and they outputted the same results (max difference about $10${-16}$).

I also have updated accordingly the test_ress.py so it can run with the new data formatting and the transformation of RESS to a class.

nbara commented 2 years ago

Hello,

I was recently asked to adapt your RESS implementation so it can be compatible with:

  • MNE formatting (n_trials, n_chans, n_samples) (instead of (n_samples, n_chans, n_trials) in the current implem)

I'm a bit ambivalent about this TBH. The convention for most functions in meegkit is (n_samples, n_chans, n_trials).

It's true that MNE compatibility is nice, but this is a breaking change that would likely impact existing code for all users of the toolbox...

  • Sklearn interface: class inherited from BaseEstimator and TransformerMixin instead of function and fit and transform methods

I basically used your implem and just simplified some parts of it: removed the fold/unfold/demean function to use vectorization from Numpy, use sklearn to compute empirical covariance matrices (can be easily adapted to use another covariance matrix estimation method with regul). I compared the two implementations on randomly generated data and they outputted the same results (max difference about 10{-16}$).

👍👍

I also have updated accordingly the test_ress.py so it can run with the new data formatting and the transformation of RESS to a class.

👍

ludovicdmt commented 2 years ago

Regarding formatting (MNE or this other one) I don't have strong argument in favor. Here in ISAE (France, Toulouse) we use exclusively MNE for data loading/processing so this motivated the change but it is probably different elsewhere.

The only argument iwould be that is easier to iterate over epochs with the MNE formatting than with samples-first format, idk.

nbara commented 2 years ago

HI @ludovicdmt there are still conflicts in ress.py apparently.

Here in ISAE (France, Toulouse) we use exclusively MNE for data loading/processing so this motivated the change but it is probably different elsewhere.

MNE-style ordering (trials-first) makes sense for mostly offline analyses, but sample-first makes more sense when a) there are no trials (it's much easier to consider the last dimension as optional) b) for real-time applications when the data needs to be serialised.

ludovicdmt commented 2 years ago

Yep you are right in the buffer for my real time applications, I have data in 2D format. Though I artificially add a dimension (simulating single trial) to still be able to use some convenient MNE functions but that's probably not the most efficient way...

codecov[bot] commented 1 year ago

Codecov Report

Merging #62 (b5e7bb8) into master (501ef5c) will increase coverage by 0.05%. The diff coverage is 92.30%.

:exclamation: Current head b5e7bb8 differs from pull request most recent head bcebe66. Consider uploading reports for the commit bcebe66 to get more accurate results

@@            Coverage Diff             @@
##           master      #62      +/-   ##
==========================================
+ Coverage   78.82%   78.87%   +0.05%     
==========================================
  Files          22       22              
  Lines        2342     2367      +25     
==========================================
+ Hits         1846     1867      +21     
- Misses        496      500       +4     
Impacted Files Coverage Δ
meegkit/utils/denoise.py 76.00% <ø> (ø)
meegkit/ress.py 92.59% <92.15%> (-7.41%) :arrow_down:
meegkit/utils/sig.py 57.83% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

nbara commented 1 year ago

Hey @ludovicdmt finally found some time to look into this.

I reverted the MNE ordering, but kept the sklearn style interface

Merging this, thanks!