Closed dengemann closed 2 years ago
Ok, so code is not running yet? Let me know when you are free @hubertjb !
It's running. Hubert found some issues with the TUAB data, and I had to push a missing script. See my last commit to main.
With that we can re-extract the TUAB data. We forgot to take care of resampling to a common frequency.
@dengemann @gemeinl Quick update: I re-converted TUAB to BIDS to fix the number of channels issue (I had to save the data in the BrainVision format, but now we have all 21 channels), re-preprocessed with resampling at 200 Hz and finally applied autoreject. I did a quick test with ShallowNet and the model does train - I got performance values for a quick test with 10 recordings only. :)
Now running with the whole dataset, it's taking about 100 s per epoch, most of which is actually not using the GPU. Not sure why, but I'll wait for the training to finish so we have some results before digging deeper.
Of note, I had to set `preload=True` when loading the epochs because I got an `OSError: Too many open files`. It looks like we might be able to increase the limit (https://stackoverflow.com/questions/16526783/python-subprocess-too-many-open-files), but for now, since the machine I'm using has a lot of RAM, I'm sticking with `preload=True`.
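Should we later want to drop `preload=True`, one way to raise the open-file limit from within Python is the stdlib `resource` module. A minimal sketch (the 4096 cap is an arbitrary choice, not a recommended value):

```python
import resource

# Inspect the current limits on open file descriptors and raise the soft
# limit; an unprivileged process may not exceed the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if hard == resource.RLIM_INFINITY:
    new_soft = 4096
else:
    new_soft = min(4096, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
```

This only affects the current process and its children, so it would need to run before the epochs are opened.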
First results are in (using 2-fold CV for now):
MAE(shallow, tuab) = 8.213297017716052
r2(shallow, tuab) = 0.5700137267570398
From looking at the figure posted above, this looks pretty similar to the filterbank-riemann model!
On the same two folds, for Deep4Net:
MAE(deep, tuab) = 9.48504621480895
r2(deep, tuab) = 0.4640109871399135
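For reference, the two reported metrics written out in plain Python (the benchmark itself presumably uses scikit-learn's `mean_absolute_error` and `r2_score`; this is just to make the definitions explicit):

```python
# Mean absolute error: average magnitude of the age-prediction errors.
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Coefficient of determination: 1 - residual SS / total SS.
def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```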
I think the issue was caused by setting `num_workers` too high, which likely created an I/O bottleneck. Capped at n_gpus * 5, it's now much faster: ~35 s/epoch with ShallowNet on a 2-GPU setup.
See #22 for the changes. I launched the benchmark for shallow and deep on TUAB, will report the results tomorrow!
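The capping heuristic above can be sketched as follows (the helper name is hypothetical; `n_gpus * 5` is the rule of thumb mentioned, additionally bounded by the CPU count so we never spawn more workers than cores):

```python
import os

# Cap DataLoader worker processes to avoid the I/O bottleneck seen with
# an uncapped num_workers; per_gpu=5 mirrors the heuristic above.
def capped_num_workers(n_gpus: int, per_gpu: int = 5) -> int:
    cpu_limit = os.cpu_count() or 1
    return min(n_gpus * per_gpu, cpu_limit)
```

The returned value would then be passed as `num_workers` to the PyTorch `DataLoader`.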
Thanks for the update @hubertjb. Great work! It is interesting that we run into too-many-open-files issues with "just" ~1400 recordings. Cool that you figured out a way to circumvent this!
It is a relief to know that the models seem to lie in the expected performance range. Maybe it would still be worth checking the learning curves; the number of training epochs was basically just a guess. The deep net probably requires some more time to fit. I'll figure out a way to do it.
Btw, any unfiltered impressions regarding adding learning-curve benchmarks to the main results?
It just finished training! Here are the mean results:
                MAE        r2     fit_time  score_time
benchmark
deep       8.191346  0.562772  1626.901164   39.094545
shallow    7.901493  0.600152  1954.250558   43.815481
@dengemann I just pushed the two CSVs to #22. (I tried to make the plots but it looks like there's a `utils.r` script that's missing.)
> It is a relief to know that the models seem to lie in the expected performance range. Maybe it would still be worth to check the learning curves. The number of training epochs was basically just a guess. The deep net probably requires some more time to fit. I'll figure out a way how to do it.
I agree, that's a good idea. I guess we would expect Deep4Net to perform better than ShallowNet for instance, which is not currently the case. Plotting the learning curves might help elucidate that.
@hubertjb pushed missing utils.r
@hubertjb does anything speak against running it on the other datasets?
Thanks @dengemann! However I think I'm having some trouble with the font :P
> @hubertjb does anything speak against running it on the other datasets?
No, unless we would want to do some hyperparameter tuning (e.g. by looking at the training curves) before launching more compute. Which dataset should I start with?
I'd say lemon is next, then Cam-CAN, then chbp
> However I think I'm having some trouble with the font :P
That's surprising, it should just be using a default font. Can you push your edits to the script on master [I guess you added 1 extra color]? I can run it on my side.
I just pushed the changes to main @dengemann
> I'd say lemon is next, then Cam-CAN, then chbp
I had to leave for 30 minutes so I started the benchmark on Cam-CAN before seeing your reply. It looks like the model stops learning after only 12 epochs, i.e. the training loss starts increasing (on the first fold at least). This makes me think we'll really need to do some dataset-specific hyperparameter tuning.
I'll launch it on LEMON just to see if that's the case too.
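A minimal patience-based check for the "stops learning after 12 epochs" behaviour could look like this (a plain-Python sketch of early stopping, not what the benchmark currently does; in practice one would track the validation loss):

```python
# Return True once the last `patience` losses contain no improvement
# over the best loss seen before that window.
def should_stop(losses, patience=5):
    if len(losses) <= patience:
        return False
    best_before = min(losses[:-patience])
    return min(losses[-patience:]) >= best_before
```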
I couldn't get results on LEMON, I got the following error while trying to load the data:
FileNotFoundError: File does not exist: /storage/store3/derivatives/LEMON_EEG_BIDS/sub-010285/eeg/sub-010285_task-RSEEG_proc-autoreject_epo
Using 2-fold CV on Cam-CAN, I get:
         MAE         r2      fit_time  score_time  dataset  benchmark
0  21.771525  -1.022423  1163.038959   87.999460   camcan       deep
1  11.378344   0.397487  1195.394894   87.036303   camcan       deep
I think this makes it clear we need to do some hyperparameter tuning. Also, I haven't looked into why yet, but ShallowNet returned NaNs here.
Same problem on CHBP as with LEMON:
FileNotFoundError: File does not exist: /storage/store3/derivatives/CHBMP_EEG_and_MRI/sub-CBM00179/eeg/sub-CBM00179_task-protmap_proc-autoreject_epo
@hubertjb can you do an `ls -lrth /storage/store3/derivatives/CHBMP_EEG_and_MRI/sub-CBM00179/eeg/` to see what's going on there?
Same for the other cases. I think things failed for 1-2 subjects; you need to handle that by catching exceptions. For the other subjects everything should be there.
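A hedged sketch of the exception handling suggested above (the helper and `loader` names are illustrative, not the benchmark's actual code): skip subjects whose derivative files are missing instead of aborting the whole run, and keep track of what was skipped.

```python
# Load every subject that has its derivative file; collect the paths of
# the 1-2 subjects where preprocessing failed instead of crashing.
def load_all(paths, loader):
    loaded, skipped = [], []
    for path in paths:
        try:
            loaded.append(loader(path))
        except FileNotFoundError:
            skipped.append(path)
    return loaded, skipped
```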
Latest plots after reconsidering priorities (r2 as main figure, MAE as supplement), with updated color codes, better sizes and a finer grid. I'll start a write-up based on this. Hopefully we can soon fill in the remaining blank boxes :) Please keep me closely in the loop so that I can help @hubertjb @gemeinl
@dengemann I'll try what you suggested above later today and will keep you updated.
About doing some hyperparameter tuning: do you have a preferred way of doing this? @dengemann @gemeinl I was thinking we could further divide the training set of each fold into training and validation sets (and drop the test set), report performance for a small grid search over maybe {learning rate, batch size, dropout rate}, and finally pick the best configuration for each dataset.
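The grid over the three hyperparameters mentioned above could be enumerated like this (the values are placeholders, not tuned settings for any of the datasets):

```python
from itertools import product

# Illustrative search space for {learning rate, batch size, dropout rate}.
grid = {
    "lr": [1e-4, 1e-3],
    "batch_size": [32, 64],
    "dropout": [0.25, 0.5],
}

# One dict per configuration, to be evaluated on the inner validation split.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
```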
After skipping failed files I got the models to run on both LEMON and CHBP @dengemann.
I started the work towards hyperparameter tuning in #24 if you want to take a look.
There are a couple of things I find noteworthy/unexpected:
Since all the preprocessing scripts seem to work robustly right now, I will also do some investigation on my end. Any thoughts @robintibor ?
latest results updated with LEMON benchmark (we're now done with the planned set of non-deep benchmarks):
I have looked at the different runtimes for deep and shallow. The slower runtime for shallow likely arises from a very large last convolutional layer, caused by setting `final_conv_length` to `'auto'` together with trialwise decoding on trials of 10 s at 200 Hz. I will implement cropped decoding, as suggested in https://github.com/dengemann/meeg-brain-age-benchmark-paper/issues/25. It should decrease the runtime for shallow and at the same time improve the performance of deep. I will add a flag so that cropped vs. trialwise decoding can be another choice in the hyperparameter optimization (https://github.com/dengemann/meeg-brain-age-benchmark-paper/pull/24).
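A back-of-the-envelope calculation shows why `final_conv_length='auto'` blows up with 10 s trials at 200 Hz. It assumes the standard Shallow settings (temporal kernel 25, mean pooling of length 75 with stride 15); treat the exact numbers as approximate:

```python
# Size of the feature map entering ShallowNet's final conv layer.
n_times = 10 * 200                        # 10 s trials at 200 Hz
after_conv = n_times - 25 + 1             # temporal convolution, kernel 25
after_pool = (after_conv - 75) // 15 + 1  # pooling of length 75, stride 15
print(after_pool)  # → 127 time steps the final conv layer must span
```

With `'auto'`, the final conv kernel has to cover all of those time steps at once, which is what makes the layer so large; cropped decoding works on much shorter windows instead.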
See #35 for the first complete set of results.
now that the benchmark script seems battle-tested, we still need to compute the results. In the figure below a few deep boxes are missing :)
I will take care of the missing handcrafted box.
The idea would be that @gemeinl and @hubertjb share a screen and fight / debug together with our Inria server.
I'm only one call / message away.