EtienneCmb / visbrain

A multi-purpose GPU-accelerated open-source suite for brain data visualization
http://visbrain.org

Survey: best hypnogram format #72

Open raphaelvallat opened 4 years ago

raphaelvallat commented 4 years ago

Hi everyone,

This is not really an issue but rather a survey on what people think would be the best format to load and save sleep staging files (hypnograms). As a reminder, there are currently two main categories of formats supported by Visbrain, stage-duration and point-per-second:

[Image: screenshot of the stage-duration and point-per-second formats]

The point-per-second category can be further subdivided into 1) the .hyp format (screenshot above) or 2) a .txt file with no header that MUST be accompanied by a _hypnodescription.txt file indicating the correspondence between the integer values and the sleep stages, as well as the sampling frequency of the hypnogram.

I am not entirely satisfied with either of these options. Some issues that I have are:

  1. The stage-duration format is not very practical because it needs to be converted back to a point-per-second vector in order to apply masking operations (e.g. detecting spindles only in N2 sleep), which I think is hard to do for users who have little or no programming experience.

  2. For the point-per-second format, I think that 1) having a separate extension (.hyp) is not great because beginners may not realize that this is in fact simply a text file, so I much prefer a .csv or .txt extension; 2) however, I don't like the current text (.txt) format either because it requires a _hypnodescription.txt file, i.e. 2 files instead of 1, which is cumbersome and may lead to errors.
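
To illustrate the conversion mentioned in point 1, here is a minimal sketch using NumPy. It assumes the stage-duration data has already been read into a list of stage labels and a list of durations in seconds; the function name `stage_duration_to_vector` and the integer mapping are hypothetical and not part of Visbrain's API.

```python
import numpy as np

# Assumed stage-to-integer mapping (illustrative, not Visbrain's official one).
STAGE_TO_INT = {"Uns": -2, "Art": -1, "W": 0, "N1": 1, "N2": 2, "N3": 3, "REM": 4}


def stage_duration_to_vector(stages, durations_sec, sf_hyp=1.0):
    """Expand (stage, duration) pairs into one integer per hypnogram sample.

    sf_hyp is the hypnogram sampling frequency in Hz (1.0 = one point per second).
    """
    values = np.array([STAGE_TO_INT[s] for s in stages])
    # Number of samples covered by each bout.
    repeats = np.round(np.asarray(durations_sec, dtype=float) * sf_hyp).astype(int)
    return np.repeat(values, repeats)


# 60 s of wake, 30 s of N1, 90 s of N2, 30 s of REM.
hypno = stage_duration_to_vector(["W", "N1", "N2", "REM"], [60, 30, 90, 30])

# Boolean mask selecting only the N2 samples, e.g. to restrict spindle detection.
n2_mask = hypno == STAGE_TO_INT["N2"]
```

Once the vector exists, the mask is a one-liner, so most of the difficulty for non-programmers lies in the expansion step.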

I think that one of my preferred formats would be a single text file (.txt or .csv extension) that looks like:

# Date: Thu Mar 26 15:17:00 2020
# Number of values: 15
# Sampling frequency: 0.03333333333333333 Hz
# Resolution: 30.0 sec
# Duration (seconds): 450.0
# Stage Uns: -2
# Stage Art: -1
# Stage W: 0
# Stage N1: 1
# Stage N2: 2
# Stage N3: 3
# Stage REM: 4
0
1
2
-1

A loading function could then separately read the header and the values to construct a final point-per-sec hypnogram.
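
As a rough sketch of such a loading function, assuming the header layout shown in the example file (`# Stage <name>: <int>` and `# Resolution: <float> sec`), something like the following could work. The name `read_hypnogram` is hypothetical and is not an existing Visbrain function.

```python
import io


def read_hypnogram(fileobj):
    """Return (stage_map, resolution_sec, values) from a '#'-headed text file."""
    stage_map, resolution, values = {}, None, []
    for raw in fileobj:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: split on the first colon only, so times like
            # "15:17:00" in the Date field stay intact.
            key, _, val = line[1:].partition(":")
            key, val = key.strip(), val.strip()
            if key.startswith("Stage "):
                stage_map[key[len("Stage "):]] = int(val)
            elif key == "Resolution":
                resolution = float(val.split()[0])
        else:
            # Data line: one integer stage code per epoch.
            values.append(int(line))
    return stage_map, resolution, values


example = io.StringIO(
    "# Resolution: 30.0 sec\n"
    "# Stage Art: -1\n"
    "# Stage W: 0\n"
    "# Stage N1: 1\n"
    "# Stage N2: 2\n"
    "0\n1\n2\n-1\n"
)
stage_map, resolution, values = read_hypnogram(example)

# Expand each epoch to one value per second.
per_second = [v for v in values for _ in range(int(resolution))]
```

Unrecognised header fields (date, duration, and so on) are simply skipped here; a real loader might want to validate them against the data instead.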

An alternative that I've seen in some sleep scoring software could be a .csv file with several columns to indicate 1) the epoch number, 2) the time at the start of the epoch (e.g. 22:10:30) and 3) the sleep stage (in string format, e.g. "N1", "N2"), respectively. However, this requires having the start time of the recording, which is not always known, especially when working with a NumPy array.

Epoch Time Stage
0 22:00:00 W
1 22:00:30 W
2 22:01:00 N1
3 22:01:30 N2
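
For illustration, the table above can be generated from a plain list of stages as long as the recording start time and epoch length are known. This is only a sketch; `epoch_table` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def epoch_table(stages, start="22:00:00", epoch_sec=30):
    """Build (epoch, time, stage) rows from a list of stage labels."""
    t0 = datetime.strptime(start, "%H:%M:%S")
    rows = []
    for i, stage in enumerate(stages):
        # Start time of epoch i is the recording start plus i epochs.
        t = t0 + timedelta(seconds=i * epoch_sec)
        rows.append((i, t.strftime("%H:%M:%S"), stage))
    return rows


rows = epoch_table(["W", "W", "N1", "N2"])
for epoch, time, stage in rows:
    print(epoch, time, stage)
```

Since the Epoch and Time columns are fully determined by the start time and the epoch length, they add readability rather than information.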

What do people think? To be clear, I am not saying we should completely change the hypnogram format in Visbrain, but just trying to think of what could be the most convenient format for most users.

Thanks! Raphael

skjerns commented 4 years ago

On the one hand, format 1 has many advantages and is effective, but cumbersome (I have had to deal with this myself at times). On the other hand, most programs work with a format similar to 2, and sleep researchers are quite familiar with it. As we are still mostly scoring in equally spaced epochs (i.e. always 30 seconds), time-based annotations do not add much yet.

I'd vote for format 2 with # annotations as .txt. Dropping the description file will help to unclutter the folders. The multi-column format would only increase human readability, as the epoch and time information is redundant with the information in the #-header, so I don't think it's necessary.

Maybe there could be a header part about scoring procedure (created by visbrain v1.x at %datetime%), in future this could be used to add information about automatic scoring algos.

TomBugnon commented 3 years ago

I (and @grahamfindlay ) have a strong preference for the stage-duration format