Open gsaurabhr opened 5 months ago
Sleep-EDF: SC files have 7 channels; ST files have 5 channels.
S.No. | Dataset | No. of subjects | No. of EEG channels | Duration of recording | Auxiliary channels | Expert annotations (Y/N) | Ease of availability | Comments | Link |
---|---|---|---|---|---|---|---|---|---|
1 | sleep-edf expanded database | 153 + 44 = 197 | 2 | 20 hours (2 nights); 9 hours (2 nights) | EOG, EMG, respiration, chin EMG, Body temperature | Yes | Downloadable | | https://www.physionet.org/content/sleep-edfx/1.0.0/sleep-cassette/#files-panel |
2 | ISRUC-Sleep Dataset | 100, 8, 10 | 6 | ~8 hours | EOG, chin EMG, ECG, Leg EMG, Snore, Airflow, Abdominal effort, Pulse oximetry, Body position | Yes | Available, not able to download | For 100 subjects, most have sleep apnea (one session). For 8 subjects, with sleep disorders (2 sessions on different dates). For 10 subjects, healthy group (one session) | https://sleeptight.isr.uc.pt/?page_id=48 |
3 | Dreem Sleep Stage Classification Challenge | | 7 | | Pulse oximetry, accelerometer | Yes | Not available | | https://www.kaggle.com/c/dreem-sleep-stages/data |
4 | Sleep Disorders Research Center (SDRC) | 60 | 14 | 8 hours | 6 EOG and 3 EMG channels | Yes | Downloadable | Power spectral features for each frequency band are provided | https://data.mendeley.com/datasets/3hx58k232n/4 |
5 | Massachusetts General Hospital’s (MGH) | 1983 | 6 | 7.7 hours | EOG, EMG, EKG, respiration signals, and oxygen saturation (SaO2) | Yes | Downloadable | Most subjects have sleep disorders, and most files are in .mat format | https://physionet.org/content/challenge-2018/1.0.0/ |
6 | Dreem-automated-sleep-staging | 9 | 5 | ~7 hours | 3 Accelerometers | Yes | Available in .npy, .json format | Data is for 9 nights (6 in training and 3 in test set) | https://www.kaggle.com/competitions/dreem-automated-sleep-staging/data |
7 | OSF Nap EEG | 20 | 62 | Each session (task + nap) was of 2 hours, nap was 30 min or 60 min | 2 EOG channels | Yes (sleep stages as well as spindles at 30s intervals) - single rater | Data available in .eeg, .vmrk and .vhdr format | Data obtained during naps taken by healthy adult participants after performance of a visual working memory task. Each participant took part in two recording sessions during which each completed a high- or low-load scene working memory task followed by a 30 or 60-minute nap on a bed inside a sound-attenuated recording chamber | https://osf.io/sqg4m , https://osf.io/chav7/ , https://osf.io/ebvsr |
https://osf.io/chav7/wiki/home/ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998176/
I think this will be good. This dataset contains 64-channel recordings. They have manual annotations by experts for sleep stage (and spindles, that we do not need). They also seem to cover a good number of sleep stages in their recording window.
One possibility is that Tanvi works on this dataset with 3D CNNs (or other approaches that explicitly take spatio-temporal patterns into account). Ayush can use sleep-EDF and/or other datasets for the transformer-based approach.
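Whichever model is used, the continuous recording first has to be cut into windows that line up with the stage annotations, which the table lists as scored at 30 s intervals for this dataset. Here is a minimal sketch of that step, assuming the signal is already loaded as a NumPy array of shape `(n_channels, n_samples)`. The function name `epoch_recording` and the 100 Hz sampling rate in the toy example are illustrative assumptions, not values taken from the dataset.

```python
import numpy as np


def epoch_recording(data, sfreq, epoch_sec=30.0):
    """Split a (n_channels, n_samples) array into fixed-length epochs.

    Returns an array of shape (n_epochs, n_channels, samples_per_epoch).
    Trailing samples that do not fill a whole epoch are dropped.
    """
    samples_per_epoch = int(round(sfreq * epoch_sec))
    n_channels, n_samples = data.shape
    n_epochs = n_samples // samples_per_epoch
    trimmed = data[:, : n_epochs * samples_per_epoch]
    # Reshape per channel, then move the epoch axis to the front.
    return trimmed.reshape(n_channels, n_epochs, samples_per_epoch).transpose(1, 0, 2)


# Toy example: 64 channels, 95 s of signal at an assumed 100 Hz.
toy = np.zeros((64, 9500))
epochs = epoch_recording(toy, sfreq=100.0)
print(epochs.shape)  # (3, 64, 3000): the last 5 s are discarded
```

Each epoch can then be paired with its stage label. For a 3D CNN, the channel axis would still need to be mapped onto a 2D scalp grid, which depends on the electrode layout.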
For this data, here are the next steps:
- Complete the visualization as mentioned in the visualization issue.
- Additional plot: for each sleep stage, how many subjects go through that stage.
- Try to visualize all hypnograms (sleep stage vs. time) at once (think about good ways to visualize that information). What I want is a sense of how many transitions we see across different pairs of sleep stages, so that we have some idea of the dataset's coverage.

I think this dataset is sufficient, and there are not many other HD EEG datasets available. But if you want, you can spend some time trying to find alternative datasets as well.
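The coverage questions above (how many subjects reach each stage, and how many transitions occur between each pair of stages) can be answered before any plotting. Here is a rough sketch, assuming each subject's hypnogram is already available as a list of per-epoch stage labels. The label set `STAGES`, the subject IDs, and the toy hypnograms are made up for illustration; the dataset's actual annotation codes may differ.

```python
from collections import Counter

# Hypothetical stage labels; adapt to the codes used in the annotation files.
STAGES = ["W", "N1", "N2", "N3", "REM"]


def subjects_per_stage(hypnograms):
    """For each stage, count the subjects with at least one epoch in it."""
    counts = Counter()
    for stages in hypnograms.values():
        counts.update(set(stages))
    return {stage: counts.get(stage, 0) for stage in STAGES}


def transition_counts(hypnograms):
    """Count stage changes between consecutive epochs, pooled over subjects."""
    counts = Counter()
    for stages in hypnograms.values():
        for prev, nxt in zip(stages, stages[1:]):
            if prev != nxt:
                counts[(prev, nxt)] += 1
    return counts


# Toy hypnograms (one label per epoch), not real data.
hypnograms = {
    "sub-01": ["W", "N1", "N2", "N2", "N3", "N2", "REM"],
    "sub-02": ["W", "W", "N1", "N2", "N1", "W"],
}

per_stage = subjects_per_stage(hypnograms)
transitions = transition_counts(hypnograms)
print(per_stage)
print(transitions.most_common())
```

The `transitions` counter maps directly onto a stage-by-stage matrix, so it could be drawn as a heatmap next to the stacked hypnograms.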
Below is a list of datasets. Add more as you find them if needed. Create a new comment for each dataset and any notes about the data can be added to those.
You can consider factors such as:
Maybe make a table with these parameters for prominent datasets below. Based on that we can decide which ones to go for.