the-siesta-group / edfio

Read and write EDF/EDF+ files.
Apache License 2.0
25 stars 5 forks source link

Add property `Edf.ordinary_signals` #16

Closed hofaflo closed 7 months ago

hofaflo commented 7 months ago

This simplifies ignoring annotations signals while iterating over a recording's signals. The name of the property is based on the language used by the EDF+ specs.

hofaflo commented 7 months ago

Recap of our call today:

Teuniz commented 7 months ago

I agree with marcoross. The annotation channel(s) shouldn't be exposed to the user. Just provide a list/collection of annotations.

In other words, I prefer the second option without all_signals, thus like:

and no way to access the Annotation channel(s).

I can't think of a valid reason to provide direct access to the Annotations channel(s). It's also not a request I ever received.

cbrnr commented 7 months ago

@hofaflo can you explain the reasons why you would like to implement access to annotation channels like this? How are annotations currently handled? Are there any limitations that require this kind of raw access? I agree with both @marcoross and @Teuniz that edfio should not expose the raw annotation channels, but only regular signals plus a list of annotations.

hofaflo commented 7 months ago

Thank you both for the super fast replies!

My reasoning was that adhering to the structure of the actual .edf file and staying consistent with what e.g. Polyman and EDFbrowser do in their info dialogs would avoid unexpected behavior. As an example, opening a file with signals ["EEG", "EDF Annotations", "ECG"] in EDFbrowser would show signal indices 1 and 3 for EEG and ECG, respectively. Since edfio uses zero-based indexing, one could expect that edf.signals[2] returns the ECG signal. This consistency would be broken by skipping annotation signals in edfio's public API.

However, I agree that not exposing the annotation signals directly would be preferable and simplify things in edfio. The only situation I can think of that would require accessing individual annotation signals is handling a recording with multiple annotation signals, but that is probably rare enough to postpone until it actually comes up.

So if you think breaking consistency with other software is acceptable in this case, I'm happy to exclude annotation signals from the public API.

cbrnr commented 7 months ago

How does edfio currently handle annotation channels?

Re multiple annotation channels, why do you think this requires access to the underlying channels? You can keep returning one or multiple lists of annotations instead of the raw channels, or no?

hofaflo commented 7 months ago

Currently Edf.signals contains annotation signals, Edf.drop_signals considers them for index-based dropping and annotations from all annotation signals are aggregated into a single tuple (Edf.annotations). The only way to modify them is via Edf.drop_annotations (which should eventually be complemented by Edf.add_annotations).

For multiple annotation signals, I think the most straightforward way to modify their annotations would be to implement drop and add methods on those signals. But that would not require them to be part of Edf.signals, as we could make Edf._annotation_signals public instead. Returning multiple tuples for Edf.annotations in case of multiple annotation signals would be inconsistent with returning a single tuple for the (probably much more common) case of a single annotation signal.

Teuniz commented 7 months ago

My reasoning was that adhering to the structure of the actual .edf file and staying consistent with what e.g. Polyman and EDFbrowser do in their info dialogs would avoid unexpected behavior. As an example, opening a file with signals ["EEG", "EDF Annotations", "ECG"] in EDFbrowser would show signal indices 1 and 3 for EEG and ECG, respectively.

I see. In EDFbrowser -> File -> Info -> Signals, the annotation channels are listed but only as information. You can't do anything with it. Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.

Regarding multiple annotation channels, EDFbrowser collects all annotations from all annotation channels, sorts them by onset time and presents them in just one list. The possibility of multiple annotation channels is NOT to provide some kind of grouping or differentiating. It's only to provide more storage space for annotations. (Instead of multiple annotation channels, you can also use a higher "samplerate" for the annotation channel in order to increase the storage space.)

cbrnr commented 7 months ago

I would really consider excluding annotations from regular signals. You are exposing an EDF implementation detail that should never be relevant for users. Usually, people record e.g. 64 EEG channels, and they might be surprised to find 65 channels in their file if they include annotations. Even in the case of multiple annotation signals, I'd just combine all annotations in a single tuple as @Teuniz suggested.

cbrnr commented 7 months ago

And IMO the internal channel numbering is also an implementation detail. For users, it should not matter if the annotation channel is the first, somewhere in the middle, or the last channel. If they record 64 EEG channels, these should always be numbered from 1 to 64 (OK, I know, this is Python, so we've got to live with channels 0 to 63 I guess, although I'd really prefer if they started at 1).

hofaflo commented 7 months ago

Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.

They are not listed, but the shown signal indices do consider them, which is why I'm worried about creating an inconsistency here.

Teuniz commented 7 months ago

Instead, when you go to Signals -> Add, the annotation signals are not listed, that wouldn't make sense IMO.

They are not listed, but the shown signal indices do consider them, which is why I'm worried about creating an inconsistency here.

That was an oversight and I just corrected it and pushed it to Gitlab, so it will be in the next version. (Or simply pull from Gitlab and compile from source.)

hofaflo commented 7 months ago

Great, thank you for the quick answers and fix!

We'll proceed as follows:

One open question remains: Edf.num_signals represents the corresponding EDF header field, which includes the number of annotation signals. Should it return the total number of signals to stay consistent with that or just the number of ordinary signals to be consistent with Edf.signals?

cbrnr commented 7 months ago

I'd return the number of ordinary signals. If you want to expose that field from the header, you could create another attribute (but I doubt this is necessary).

hofaflo commented 7 months ago

Superseded by #25, #28