[WIP] Add braindecode models

gemeinl commented 3 years ago

This is a first shot at adding braindecode models to the benchmark. It features

reading of epoched .fif files in a lazy fashion
conversion of data to braindecode compatible format
creation of Shallow and Deep model
running a 10-fold CV

Code is quite dirty, I am looking into ways for improvement (e.g. skorch by itself can theoretically be used with sklearn functions like cross_val_score, however, our objects on top of it (the EEGRegressor) are incompatible at the moment).

gemeinl commented 3 years ago

@dengemann Would it be possible to make Robin (https://github.com/robintibor) a collaborator? I would like to discuss some potential braindecode improvements based on the code added here and it seems unnecessary to upload the code to another location to do so.

dengemann commented 3 years ago

@gemeinl just added Robin - as I have mentioned, I am very happy to include Robin in this project.

dengemann commented 3 years ago

@gemeinl thank you for the work! I had some delay here on my end but I'm about to move on; I will merge / integrate your PR once you're happy and once I've finished my benchark refactoring. We've very close to first set of results I think!

dengemann commented 3 years ago

Do you already have first impressions? I'm making some good progress in #11 - the idea would be then to add support for your benchmarks in the main script proposed in #11.

gemeinl commented 3 years ago

Do you already have first impressions? I'm making some good progress in #11 - the idea would be then to add support for your benchmarks in the main script proposed in #11.

I have seen the progress. It looks great! I do not have a properly set up environment / data converted to bids format to actually run the code on real data. I was developing on a handful of fake .fif files and ages. I will try to get conversion of TUH Abnormal done over the weekend.

dengemann commented 3 years ago

I have seen the progress. It looks great!

Cool thx!

I do not have a properly set up environment / data converted to bids format to actually run the code on real data. I was developing on a handful of fake .fif files and ages. I will try to get conversion of TUH Abnormal done over the weekend.

OK, I see. I think we will have the handcrafted benchmarks by tonight/tomorrow and I hope tomorrow I can help @apmellot to merge #5 and implement the benchmark in #11. I think I want to start writing the first draft of the paper next week.

gemeinl commented 3 years ago

Hey guys, so I think I managed to put together some code to use braindecode models + datasets with scikit-learns cross_val_score function. It features

a custom scorer class (https://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10/files#diff-b08d5e2df402cedeae1725702e89a5e4d86e57e00a90e2107d854d9733ca08f9R422-R434) that can compute several metrics (mae and r2) on both compute window and recording level and thus avoids retraining the nets as many times as there are metrics,
an adapted KFold (https://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10/files#diff-b08d5e2df402cedeae1725702e89a5e4d86e57e00a90e2107d854d9733ca08f9R491-R510) that gets a 'list' of compute windows but creates splits on recording level, such that compute windows of a single recording do not end up in both train and valid set,
as well as a modified skorch SliceDataset (https://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10/files#diff-b08d5e2df402cedeae1725702e89a5e4d86e57e00a90e2107d854d9733ca08f9R387-R394) that reshapes the regression targets to 2 dimensions which is required by scikit-learn.

Additionally, the code depends on this braindecode PR https://github.com/braindecode/braindecode/pull/340 which aims to fix violations of scikit-learn conventions.

dengemann commented 3 years ago

Awesome @gemeinl this sounds very good! In the meantime we've been cleaning up further hidden issues in bids. MNE Python PR and an issue in the bids-pipeline which we're about to solve.

In the meantime, we've also improved the benchmark API, which now computes several metrics at once and gets timing information based on sklearn.model_selection.cross_validate https://github.com/dengemann/meeg-brain-age-benchmark-paper/blob/6cf9cf96aa3856e63a43d363013a5c6c406b7d78/compute_benchmark_age_prediction.py#L197

But I assume using sklearn.model_selection.cross_validate won't be an issue for you.

gemeinl commented 3 years ago

Is the data in volts or microvolts?

dengemann commented 3 years ago

Is the data in volts or microvolts?

MNE/fiff is stored in volts. If your code assumes mV you need to resscale.

gemeinl commented 3 years ago

I ran convert_tuh_to_bids.py, the bids pipeline with steps=preprocessing, and the compute_autoreject.py without any errors :) However, if I want to run the age prediction benchmark with dataset tuab and the braindecode models, I get an error due to unexpected and incorrect shapes. Instead of 21 channels, I get inconsistent numbers that are biggger than that (30, 36, ...). Can you confirm @dengemann ? Do you know what is happening / where I might be doing something wrong? Might this be related to https://github.com/dengemann/meeg-brain-age-benchmark-paper/issues/7 (the -REF also did not get stripped)?

dengemann commented 3 years ago

What script do you use for the benchmark? We did not delete the old ones.

From: gemeinl @.> Sent: Saturday, October 23, 2021 7:15:51 PM To: dengemann/meeg-brain-age-benchmark-paper @.> Cc: Denis A. Engemann @.>; Mention @.> Subject: Re: [dengemann/meeg-brain-age-benchmark-paper] [WIP] Add braindecode models (#10)

I ran convert_tuh_to_bids.py, the bids pipeline with steps=preprocessing, and the compute_autoreject.py without any errors :) However, if I want to run the age prediction benchmark with dataset tuab and the braindecode models, I get an error due to unexpected and incorrect shapes. Instead of 21 channels, I get inconsistent numbers that are biggger than that (30, 36, ...). Can you confirm @dengemannhttps://github.com/dengemann ? Do you know what is happening / where I might be doing something wrong? Might this be related to #7https://github.com/dengemann/meeg-brain-age-benchmark-paper/issues/7 (the -REF also did not get stripped)?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10#issuecomment-950183604, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAOR7CR63TI7D5ZHKHT77LDUILUUPANCNFSM5F2UKAJA.

gemeinl commented 3 years ago

I tried to run compute_benchmark_age_prediction.py as extended in this PR.

dengemann commented 3 years ago

yes the right one. let me check if I get that error. perhaps the latest changes broke something. but we computed the Riemann benchmark successfully last week,

From: gemeinl @.> Sent: Saturday, October 23, 2021 7:32:26 PM To: dengemann/meeg-brain-age-benchmark-paper @.> Cc: Denis A. Engemann @.>; Mention @.> Subject: Re: [dengemann/meeg-brain-age-benchmark-paper] [WIP] Add braindecode models (#10)

I tried to run compute_benchmark_age_prediction.py as extended in this PR.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10#issuecomment-950186010, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAOR7CTKKGYONQAGZVUYA53UILWSVANCNFSM5F2UKAJA.

gemeinl commented 3 years ago

I currently added a hack in my code that does channel selection (https://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10/files#diff-b08d5e2df402cedeae1725702e89a5e4d86e57e00a90e2107d854d9733ca08f9R342-R355).

dengemann commented 3 years ago

weird should not be necessary as I took care of channel selection.

From: gemeinl @.> Sent: Saturday, October 23, 2021 7:47:05 PM To: dengemann/meeg-brain-age-benchmark-paper @.> Cc: Denis A. Engemann @.>; Mention @.> Subject: Re: [dengemann/meeg-brain-age-benchmark-paper] [WIP] Add braindecode models (#10)

I currently added a hack in my code that does channel selection (https://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10/files#diff-b08d5e2df402cedeae1725702e89a5e4d86e57e00a90e2107d854d9733ca08f9R342-R355).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/10#issuecomment-950187955, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAOR7CTBRWIF7DWDY6DICUTUILYJTANCNFSM5F2UKAJA.

dengemann commented 3 years ago

@gemeinl I'm recomputing tuab from scratch to see if I get your error – based on #12

dengemann commented 3 years ago

@gemeinl what is the status on your side? Can we already try to run the models at Inria? @agramfort would be happy to help with that.

dengemann commented 3 years ago

@gemeinl btw #12 is mergeable in principle, just waiting for tuab results to also look meaningful.

gemeinl commented 3 years ago

Yes, code should run. It would be great if you could try to run it on your side for a first set of results. I am not yet happy with target scaling, so it is currently turned off. I would have to see how to properly do this. I feel the code is quite error-prone in some sections, so I want to invest some more time into making sure that stuff is correct. Getting first results in the expected range would definitely help :)

dengemann commented 3 years ago

@gemeinl sorry for confusion. I've merged now all PRs. Just open a new PR for further edits on your work.

meeg-ml-benchmarks / brain-age-benchmark-paper

[WIP] Add braindecode models #10