Closed hofaflo closed 7 months ago
Recap of our call today:
- `EdfBaseSignal` with subclasses `EdfSignal` and `EdfAnnotationsSignal`. The latter has a property `annotations` instead of `data`. An `EdfAnnotationsSignal` does not allow setting certain header fields, e.g. label and physical dimension. An `EdfSignal` cannot have the label `EDF Annotations`.
- With `Edf.append_signals`, the new signal(s) should be appended after the last ordinary signal, to ensure an annotation signal at the last signal index stays there. This helps to reduce the indexing confusion introduced by annotation signals.

We have not decided on the naming for properties allowing access to signals yet. There are at least two options:

1. `ordinary_signals` (ordinary only) and `signals` (ordinary+annotations)
2. `signals` (ordinary only) and `all_signals` (ordinary+annotations)

While the second one would expose the primary use case under the more obvious name, I prefer the first option, as it does not break the existing API. Also, as we have a method `Edf.drop_signals` and intend to introduce `Edf.insert_signal`, which both allow specifying signal indices, having a property `signals` that produces different indices could easily lead to errors. Maybe @Teuniz or @cbrnr want to chime in?
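To make the proposal above concrete, here is a minimal sketch of how the class split and the append rule could look. It is an illustration only, not edfio's actual implementation: the constructor signatures, the `_label` attribute, and the free function `append_signals` are assumptions made for this example.

```python
ANNOTATIONS_LABEL = "EDF Annotations"


class EdfBaseSignal:
    """Hypothetical common base: exposes the label as read-only."""

    def __init__(self, label):
        self._label = label

    @property
    def label(self):
        return self._label


class EdfSignal(EdfBaseSignal):
    """Ordinary signal: carries `data`, label is settable but validated."""

    def __init__(self, data, label):
        super().__init__("")
        self.data = list(data)
        self.label = label  # goes through the validating setter below

    @EdfBaseSignal.label.setter
    def label(self, value):
        if value == ANNOTATIONS_LABEL:
            raise ValueError("an ordinary signal cannot be labeled 'EDF Annotations'")
        self._label = value


class EdfAnnotationsSignal(EdfBaseSignal):
    """Annotation signal: carries `annotations`, label is fixed (no setter)."""

    def __init__(self, annotations):
        super().__init__(ANNOTATIONS_LABEL)
        self.annotations = tuple(annotations)


def append_signals(signals, new_signals):
    """Insert new signals right after the last ordinary signal."""
    last_ordinary = max(
        (i for i, s in enumerate(signals) if isinstance(s, EdfSignal)),
        default=-1,
    )
    position = last_ordinary + 1
    return signals[:position] + list(new_signals) + signals[position:]


existing = [EdfSignal([1, 2], "EEG"), EdfAnnotationsSignal([])]
combined = append_signals(existing, [EdfSignal([3, 4], "ECG")])
combined_labels = [s.label for s in combined]
```

In this sketch the trailing annotation signal keeps the last index after the append, which is the behavior described in the recap.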
I agree with marcoross. The annotation channel(s) shouldn't be exposed to the user. Just provide a list/collection of annotations.
In other words, I prefer the second option without `all_signals`, thus like:

- `signals` (ordinary only)

and no way to access the annotation channel(s).
I can't think of a valid reason to provide direct access to the Annotations channel(s). It's also not a request I ever received.
@hofaflo can you explain the reasons why you would like to implement access to annotation channels like this? How are annotations currently handled? Are there any limitations that require this kind of raw access? I agree with both @marcoross and @Teuniz that edfio should not expose the raw annotation channels, but only regular signals plus a list of annotations.
Thank you both for the super fast replies!
My reasoning was that adhering to the structure of the actual .edf file and staying consistent with what e.g. Polyman and EDFbrowser do in their info dialogs would avoid unexpected behavior. As an example, opening a file with signals `["EEG", "EDF Annotations", "ECG"]` in EDFbrowser would show signal indices 1 and 3 for EEG and ECG, respectively. Since edfio uses zero-based indexing, one could expect that `edf.signals[2]` returns the ECG signal. This consistency would be broken by skipping annotation signals in edfio's public API.
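The index mismatch can be illustrated with plain lists. This is only a sketch using label strings in place of real signal objects; the variable names are made up for the example.

```python
ANNOTATIONS_LABEL = "EDF Annotations"

# Signal order as stored in the file.
file_labels = ["EEG", "EDF Annotations", "ECG"]

# One-based indices as shown in an info dialog that lists every signal.
dialog_index = {label: i + 1 for i, label in enumerate(file_labels)}

# Zero-based view that includes annotation signals: ECG is at index 2.
with_annotations = list(file_labels)

# Zero-based view that skips annotation signals: ECG moves to index 1.
ordinary_only = [label for label in file_labels if label != ANNOTATIONS_LABEL]
```

Here `with_annotations[2]` and `ordinary_only[1]` both refer to the ECG signal, which is the shift that the paragraph above is concerned about.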
However, I agree that not exposing the annotation signals directly would be preferable and simplify things in edfio. The only situation I can think of that would require accessing individual annotation signals is handling a recording with multiple annotation signals, but that is probably rare enough to postpone until it actually comes up.
So if you think breaking consistency with other software is acceptable in this case, I'm happy to exclude annotation signals from the public API.
How does edfio currently handle annotation channels?
Re multiple annotation channels, why do you think this requires access to the underlying channels? You can keep returning one or multiple lists of annotations instead of the raw channels, or no?
Currently `Edf.signals` contains annotation signals, `Edf.drop_signals` considers them for index-based dropping, and annotations from all annotation signals are aggregated into a single tuple (`Edf.annotations`). The only way to modify them is via `Edf.drop_annotations` (which should eventually be complemented by `Edf.add_annotations`).
For multiple annotation signals, I think the most straightforward way to modify their annotations would be to implement drop and add methods on those signals. But that would not require them to be part of `Edf.signals`, as we could make `Edf._annotation_signals` public instead. Returning multiple tuples for `Edf.annotations` in case of multiple annotation signals would be inconsistent with returning a single tuple for the (probably much more common) case of a single annotation signal.
> My reasoning was that adhering to the structure of the actual .edf file and staying consistent with what e.g. Polyman and EDFbrowser do in their info dialogs would avoid unexpected behavior. As an example, opening a file with signals `["EEG", "EDF Annotations", "ECG"]` in EDFbrowser would show signal indices 1 and 3 for EEG and ECG, respectively.
I see. In EDFbrowser -> File -> Info -> Signals, the annotation channels are listed but only as information. You can't do anything with it. Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.
Regarding multiple annotation channels, EDFbrowser collects all annotations from all annotation channels, sorts them by onset time and presents them in just one list. The possibility of multiple annotation channels is NOT to provide some kind of grouping or differentiating. It's only to provide more storage space for annotations. (Instead of multiple annotation channels, you can also use a higher "samplerate" for the annotation channel in order to increase the storage space.)
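The merging behavior described above (collect from all annotation channels, sort by onset, present one list) could be sketched as follows. The `Annotation` tuple and `merge_annotations` function are hypothetical names for this illustration and are not taken from EDFbrowser or edfio.

```python
from itertools import chain
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    """Hypothetical annotation record: onset and duration in seconds."""

    onset: float
    duration: Optional[float]
    text: str


def merge_annotations(*channels):
    """Combine annotations from any number of channels, sorted by onset."""
    return tuple(sorted(chain.from_iterable(channels), key=lambda a: a.onset))


channel_a = [Annotation(5.0, None, "B"), Annotation(1.0, None, "A")]
channel_b = [Annotation(3.0, 2.0, "C")]
merged = merge_annotations(channel_a, channel_b)
merged_texts = [a.text for a in merged]
```

Because the result is a single flat tuple, callers do not need to know how many annotation channels the file contains.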
I would really consider excluding annotations from regular signals. You are exposing an EDF implementation detail that should never be relevant for users. Usually, people record e.g. 64 EEG channels, and they might be surprised to find 65 channels in their file if they include annotations. Even in the case of multiple annotation signals, I'd just combine all annotations in a single tuple as @Teuniz suggested.
And IMO the internal channel numbering is also an implementation detail. For users, it should not matter if the annotation channel is the first, somewhere in the middle, or the last channel. If they record 64 EEG channels, these should always be numbered from 1 to 64 (OK, I know, this is Python, so we've got to live with channels 0 to 63 I guess, although I'd really prefer if they started at 1).
> Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.
They are not listed, but the shown signal indices do consider them, which is why I'm worried about creating an inconsistency here.
> > Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.
>
> They are not listed, but the shown signal indices do consider them, which is why I'm worried about creating an inconsistency here.
That was an oversight and I just corrected it and pushed it to Gitlab, so it will be in the next version. (Or simply pull from Gitlab and compile from source.)
Great, thank you for the quick answers and fix!
We'll proceed as follows:
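- `Edf.signals` contains only ordinary signals.
- `Edf.append_signals` appends after the last ordinary signal.
- `Edf.drop_signals` considers only ordinary signals.
- `"EDF Annotations"` is not allowed as the label for an ordinary `EdfSignal`.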
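As a rough sketch of index-based dropping that leaves annotation signals alone, assuming the indices passed in count ordinary signals only: the function `drop_ordinary_signals` and its use of label strings instead of signal objects are simplifications for this example, not edfio's API.

```python
ANNOTATIONS_LABEL = "EDF Annotations"


def drop_ordinary_signals(labels, drop):
    """Drop signals by index, where indices count ordinary signals only."""
    # Positions in the full list that hold ordinary signals.
    ordinary_positions = [
        i for i, label in enumerate(labels) if label != ANNOTATIONS_LABEL
    ]
    to_remove = {ordinary_positions[i] for i in drop}
    return [label for i, label in enumerate(labels) if i not in to_remove]


file_labels = ["EEG", "EDF Annotations", "ECG"]
remaining = drop_ordinary_signals(file_labels, [1])
```

With this rule, index 1 refers to ECG regardless of where the annotation signal sits in the file.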
One open question remains: `Edf.num_signals` represents the corresponding EDF header field, which includes the number of annotation signals. Should it return the total number of signals to stay consistent with that, or just the number of ordinary signals to be consistent with `Edf.signals`?
I'd return the number of ordinary signals. If you want to expose that field from the header, you could create another attribute (but I doubt this is necessary).
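A small sketch of that suggestion, with the header count kept as a separate attribute. The `Recording` class and the name `num_signals_in_header` are invented for this illustration; only `signals`, `_annotation_signals`, and `num_signals` echo names used in this thread.

```python
class Recording:
    """Hypothetical stand-in for an EDF container."""

    def __init__(self, signals, annotation_signals):
        self.signals = tuple(signals)  # ordinary signals only
        self._annotation_signals = tuple(annotation_signals)

    @property
    def num_signals(self):
        """Number of ordinary signals, consistent with `signals`."""
        return len(self.signals)

    @property
    def num_signals_in_header(self):
        """Count as stored in the file header (hypothetical attribute name)."""
        return len(self.signals) + len(self._annotation_signals)


recording = Recording(["EEG", "ECG"], ["EDF Annotations"])
```

This keeps `len(recording.signals) == recording.num_signals` while still allowing the raw header value to be inspected if that ever turns out to be needed.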
Superseded by #25, #28
This simplifies ignoring annotation signals while iterating over a recording's signals. The name of the property is based on the language used by the EDF+ specs.