Add property `Edf.ordinary_signals`

hofaflo commented 7 months ago

This simplifies ignoring annotations signals while iterating over a recording's signals. The name of the property is based on the language used by the EDF+ specs.

hofaflo commented 7 months ago

Recap of our call today:

We create a class EdfBaseSignal with subclasses EdfSignal and EdfAnnotationsSignal. The latter has a property annotations instead of data. An EdfAnnotationsSignal does not allow to set certain header fields, e.g. label and physical dimension. An EdfSignal cannot have the label EDF Annotations.
In Edf.append_signals, the new signal(s) should be appended after the last ordinary signal, to ensure an annotation signal at the last signal index stays there. This helps to reduce the indexing confusion introduced by annotation signals.
We have not decided on the naming for properties allowing access to signals yet. There are at least two options:
- ordinary_signals (ordinary only) and signals (ordinary+annotations)
- signals (ordinary only) and all_signals (ordinary+annotations)
While the second one would expose the primary use case under the more obvious name, I prefer the first option, as it does not break the existing API. Also, as we have a method Edf.drop_signals and intend to introduce Edf.insert_signal, which both allow specifying signal indices, having a property signals that produces different indices could easily lead to errors. Maybe @Teuniz or @cbrnr want to chime in?

Teuniz commented 7 months ago

I agree with marcoross. The annotation channel(s) shouldn't be exposed to the user. Just provide a list/collection of annotations.

In other words, I prefer the second option without all_signals, thus like:

signals (ordinary only)

and no way to access the Annotation channel(s).

I can't think of a valid reason to provide direct access to the Annotations channel(s). It's also not a request I ever received.

cbrnr commented 7 months ago

@hofaflo can you explain the reasons why you would like to implement access to annotation channels like this? How are annotations currently handled? Are there any limitations that require this kind of raw access? I agree with both @marcoross and @Teuniz that edfio should not expose the raw annotation channels, but only regular signals plus a list of annotations.

hofaflo commented 7 months ago

Thank you both for the super fast replies!

My reasoning was that adhering to the structure of the actual .edf file and staying consistent with what e.g. Polyman and EDFbrowser do in their info dialogs would avoid unexpected behavior. As an example, opening a file with signals ["EEG", "EDF Annotations", "ECG"] in EDFbrowser would show signal indices 1 and 3 for EEG and ECG, respectively. Since edfio uses zero-based indexing, one could expect that edf.signals[2] returns the ECG signal. This consistency would be broken by skipping annotation signals in edfio's public API.

However, I agree that not exposing the annotation signals directly would be preferable and simplify things in edfio. The only situation I can think of that would require accessing individual annotation signals is handling a recording with multiple annotation signals, but that is probably rare enough to postpone until it actually comes up.

So if you think breaking consistency with other software is acceptable in this case, I'm happy to exclude annotation signals from the public API.

cbrnr commented 7 months ago

How does edfio currently handle annotation channels?

Re multiple annotation channels, why do you think this requires access to the underlying channels? You can keep returning one or multiple lists of annotations instead of the raw channels, or no?

hofaflo commented 7 months ago

Currently Edf.signals contains annotation signals, Edf.drop_signals considers them for index-based dropping and annotations from all annotation signals are aggregated into a single tuple (Edf.annotations). The only way to modify them is via Edf.drop_annotations (which should eventually be complemented by Edf.add_annotations).

For multiple annotation signals, I think the most straightforward way to modify their annotations would be to implement drop and add methods on those signals. But that would not require them to be part of Edf.signals, as we could make Edf._annotation_signals public instead. Returning multiple tuples for Edf.annotations in case of multiple annotation signals would be inconsistent with returning a single tuple for the (probably much more common) case of a single annotation signal.

Teuniz commented 7 months ago

My reasoning was that adhering to the structure of the actual .edf file and staying consistent with what e.g. Polyman and EDFbrowser do in their info dialogs would avoid unexpected behavior. As an example, opening a file with signals ["EEG", "EDF Annotations", "ECG"] in EDFbrowser would show signal indices 1 and 3 for EEG and ECG, respectively.

I see. In EDFbrowser -> File -> Info -> Signals, the annotation channels are listed but only as information. You can't do anything with it. Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.

Regarding multiple annotation channels, EDFbrowser collects all annotations from all annotation channels, sorts them by onset time and presents them in just one list. The possibility of multiple annotation channels is NOT to provide some kind of grouping or differentiating. It's only to provide more storage space for annotations. (Instead of multiple annotation channels, you can also use a higher "samplerate" for the annotation channel in order to increase the storage space.)

cbrnr commented 7 months ago

I would really consider excluding annotations from regular signals. You are exposing an EDF implementation detail that should never be relevant for users. Usually, people record e.g. 64 EEG channels, and they might be surprised to find 65 channels in their file if they include annotations. Even in the case of multiple annotation signals, I'd just combine all annotations in a single tuple as @Teuniz suggested.

cbrnr commented 7 months ago

And IMO the internal channel numbering is also an implementation detail. For users, it should not matter if the annotation channel is the first, somewhere in the middle, or the last channel. If they record 64 EEG channels, these should always be numbered from 1 to 64 (OK, I know, this is Python, so we've got to live with channels 0 to 63 I guess, although I'd really prefer if they started at 1).

hofaflo commented 7 months ago

Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.

They are not listed, but the shown signal indices do consider them, which is why I'm worried about creating an inconsistency here.

Teuniz commented 7 months ago

Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.

They are not listed, but the shown signal indices do consider them, which is why I'm worried about creating an inconsistency here.

That was an oversight and I just corrected it and pushed it to Gitlab, so it will be in the next version. (Or simply pull from Gitlab and compile from source.)

hofaflo commented 7 months ago

Great, thank you for the quick answers and fix!

We'll proceed as follows:

Skip annotation signals in Edf.signals
Let Edf.append_signals append after the last ordinary signal
Ignore annotation signal in Edf.drop_signals
Disallow using "EDF Annotations" as the label for an ordinary EdfSignal

One open question remains: Edf.num_signals represents the corresponding EDF header field, which includes the number of annotation signals. Should it return the total number of signals to stay consistent with that or just the number of ordinary signals to be consistent with Edf.signals?

cbrnr commented 7 months ago

I'd return the number of ordinary signals. If you want to expose that field from the header, you could create another attribute (but I doubt this is necessary).

hofaflo commented 7 months ago

Superseded by #25, #28

the-siesta-group / edfio

Add property `Edf.ordinary_signals` #16