Restructuring of electrode group

ajtritt commented 7 years ago

Originally reported by: Loren Frank (Bitbucket: lmfrnk, GitHub: Unknown)

[Hackathon] Need Need the ability to reference arbitrary subsets of electrodes to allow processing of subsets of electrodes while being able to describe and access the data of the electrodes that were used. Path Andrew/Oliver will create a proposal for restructuring electrode group to address this issue and send it out to decide on a specific solution. Goal This item should be completed before SFN.

[Use case example: Loren Frank] As it stands now, the electrode group does not have a list of individual electrode objects that make it up. It would be much easier to work with if one created an Electrode object for each electrode, grouped those together in an ElectrodeGroup where appropriate, and then saved a link / index / identifier for either the electrode or the electrodegroup in associated LFP or spikewaveform series.

Here's an example use case from our previously collected data: A four wire tetrode consists of four electrodes that are part of one ElectrodeGroup. From this four wire tetrode, one electrode is selected to record LFP data at a sampling rate of 1500 Hz, while spike snippets are saved at 30 KHz from all four channels. It would be helpful to have a single electrode object that could be associated with the LFP data and the same electrode object (in association with the other three electrodes of the tetrode) as part of an ElectrodeGroup that was associated with the spike waveforms.

As for the association, ideally things would be stored so the electrode / electrodegroup could be accessed directly from the data.

Bitbucket: https://bitbucket.org/lblneuro/nwb-schema/issue/6

ajtritt commented 7 years ago

Original comment by Andrew Tritt (Bitbucket: ajtritt, GitHub: ajtritt):

Proposal

Modify ElectrodeGroup to only have the following fields:
- description: a description of the electrode group
- device: a link to the device used to record from this group
- location: a description of the location of the electrode group
Create ElectrodeTable neurodata_type that has the following fields:
- id - a user-specified unique identifier
- coord - the xyz-coordinates of the channels (each coord as a separate column?)
- imp - the impedance of the channel
- location - the location of the channel within the subject e.g. brain region
- filtering - per-channel filtering
- description - per-channel description
- group - the name of the ElectrodeGroup instance that this channel belongs to
  - this could also be a reference, depending on how users want to access ElectrodeGroup information
Require NWBFile have a global ElectrodeTable
Modify ElectricalSeries, FeatureExtraction, and SpikeEventSeries with the following change:
- Remove required link to an ElectrodeGroup
- Require a dataset that stores a region reference to the channels in the global ElectrodeTable that were used to generate the ElectricalSeries data

Advantages

Enable referencing of arbitrary groupings of channels
- e.g. all four channels from a tetrode, or a single channel from a tetrode
Referencing is explicit
- can retrieve data of interest using standard h5py library calls
normalizes channel data
adding channels to an ElectrodeGroup is achieved simply by adding a row to the ElectrodeTable
organizing all channels in a single table enables easier querying across all fields specified for a channel e.g. query by brain region, query by xyz-coordinates, etc

Potential Disadvantages

Each channel can only be in one ElectrodeGroup. Since ElectrodeGroups are used to describe physical organizations of electrodes, supporting only a unique organization appears reasonable and is consistent with the original organization.
Lookup of channles that belong to a particular ElectrodeGroup requires linear search in ElectrodeTable. Since the ElectrodeTable will be relatively small and since we could provide a short-cut function in the API for this, this should only be a minor problem and appears less problematic than adding an "inverse" reference from the ElectrodeGroup to the table, as this would mean denormalization (i.e., duplication) of data, which would lead to problems with bookkeeping and potential conflicting data.
Lookup of ElectrodeGroups that belong to a particular Device requires linear search over the ElectrodeGroups (with potential subsequent search over the ElectrodeTable to identify channels). See previous comment, why this seems acceptable.

Additional Notes

This proposal is in a way a mix between the original and the current structure of ElectrodeGroup
Implementation of this proposal requires support for specification of tables in the specification language. This being addressed in proposal https://github.com/NeurodataWithoutBorders/nwb-schema/issues/13

oruebel commented 7 years ago

Instead of storing ElectrodeGroup data in groups, should the ElectrodeGroups also be converted to a table to ease search and enable simple referencing of multiple ElectrodeGroups?

choldgraf commented 7 years ago

It's hard for me to give really meaningful feedback here since I'm not familiar with the this codebase, but just a quick thought: I can see some semantic confusion between ElectrodeGroup and ChannelTable. I know that channels are not always electrodes, but many scientists use the two interchangeably.

Is the idea that ChannelTable is for metadata on all channels? If so, maybe a better name would be ChannelMetadata or something more semantically descriptive. It'd also make it clear that it's not meant for storing the actual channel data (which I assume would be stored as Electrode objects in an ElectrodeGroup

KrisBouchard commented 7 years ago

Just to confirm: this allows multiple electrodes to be easily referenced as contributing to a given event, e.g., a sorted unit?

Also, am I correct in that ChannelTable will be a compound data type?

ajtritt commented 7 years ago

@KrisBouchard You are correct. Having all channels in a single table will allow efficient access of arbitrary groupings.

@choldgraf Yes, ChannelTable is for storing metadata about channels. ElectrodeGroup is for storing metadata about groupings of channels, such as an ECoG array or probe. The data generated by recording from channels/electrode groups is stored elsewhere

choldgraf commented 7 years ago

So why not call it ChannelGroup then, for naming consistency? E.g., you could group channels that were all analog inputs and not related to electrodes, no?

ajtritt commented 7 years ago

The term "electrode group" was taken from the original specification. The use of the term "channel" was taken from its use in other types that reference electrode groups, such as ElectricalSeries, SpikeEventSeries

This type is specific to electrodes, so I would like to avoid making it a catch-all for any analog signal.

Do you think renaming ChannelTable to ElectrodeTable would avoid confusion?

KrisBouchard commented 7 years ago

I think ElectrodeTable is better. Channels can be assigned to multiple data collection modalities.

choldgraf commented 7 years ago

I don't feel strongly about one or the other, but I think they should be consistent. Users shouldn't have to guess when it's Electrode vs. Channel. I think my other concern is just that it's a somewhat un-descriptive term. If you were a new user who knew nothing about NWB, and you saw the object ElectrodeTable or ElectrodeGroup, you might think that was the actual data (as I did). If the naming of the objects is set for posterity's sake, it should be very clear in the documentation at least.

ajtritt commented 7 years ago

@KrisBouchard I updated the proposal

@choldgraf all NWB types will inherit the help attribute from the NWBContainer base type. This attribute will provide some explanation of what a group or dataset is. We could name it Electrodes or ElectrodeMetadata if that will provide more clarity.

lfrank commented 7 years ago

Hi all. Just for everyone's reference, our data acquisition system and software (SpikeGadgets hardware, trodes software) calls these things "NTrodes." I'm fine leaving things as Electrodes, but I think NTrode is actually a more informative name.

Also, I would amend the requirements to say that if one or more ElectrodeGrous are defined, then there has to be a ElectrodeTable, but since some NWB files will have only behavioral data or potentially only imaging data, I'm not sure it makes sense to require a ElectrodeTable for all files.

tjd2002 commented 7 years ago

I don't think 'filtering' should appear in ElectrodeGroup or ElectrodeTable, since this is a property of the signal processing path, rather than of the electrode, and may be different for different recordings derived from the same physical electrode channel. For instance, in the original use case given by Loren, we save LFP traces (bandpassed 1-400Hz) and spike snippets (bandpassed 600-6000Hz) recorded from overlapping sets of electrodes.

This is not an issue specific to the proposed reorganization, but now that we are being explicit about describing all electrodes in a single location, and about the ElectrodeGroup reflecting only the physical grouping of electrodes (e.g. into a tetrode or polytrode), I think it makes sense to address it here. (I would also be happy to open a separate issue to track this discussion). All the other properties we include in ElectrodeGroup and ElectrodeTable (coordinates, id, impedance, source device, location) are already properties of the physical electrode and independent of the signal processing chain. (For instance, we don't include sampling rate in the ElectrodeTable).

There is of course some filtering applied at the hardware level (e.g. by the headstage, in our use case), and it could be useful to describe that on a per-channel level, but in general it would seem to make more sense for 'filtering' to be a property of an ElectricalSeries.

ajtritt commented 7 years ago

@tjd2002 Thanks for bringing this up Tom. In the original schema, filtering was a top-level field that was not specific to either.

Here is the original documentation:

"Includes filtering type and parameters, frequency fall- off, etc. If this changes between TimeSeries, filter description should be stored as a text attribute for each TimeSeries. If this changes between TimeSeries, filter description should be stored as a text attribute for each TimeSeries."

Our thinking in making filtering at the level of channels/electrodes was we thought it was too restrictive to have a single, top-level field. From what you are saying (which is even acknowledged in the documentation), it makes more sense to have it as a field respective to processing pipelines.

Given this, does it make sense to have it anywhere associated with ElectrodeTable or ElectrodeGroup i.e. can we remove it completely?

neuromusic commented 7 years ago

"From what you are saying (which is even acknowledged in the documentation), it makes more sense to have it as a field respective to processing pipelines."

An example dataset:

an ElectricalSeries from 16 channels at 30kHz with only hardware filtering
an EventWaveform sampled from channels 1-4 in (1)
an EventWaveform sampled from channels 5-8 in (1)
an EventWaveform sampled from channels 9-12 in (1)
an EventWaveform sampled from channels 13-16 in (1)
a FilteredEphys from the same 16 channels as in (1) at 1000Hz after applying a bandpass filter (in software)

Currently, FilteredEphys doesn't offer much for programmatic access to filtering parameters (my own emphasis):

Ephys data from one or more channels that has been subjected to filtering. Examples of filtered data include Theta and Gamma (LFP has its own interface). FilteredEphys modules publish an ElectricalSeries for each filtered channel or set of channels. The name of each ElectricalSeries is arbitrary but should be informative. The source of the filtered data, whether this is from analysis of another time series or as acquired by hardware, should be noted in each’s TimeSeries::description field. There is no assumed 1::1 correspondence between filtered ephys signals and electrodes, as a single signal can apply to many nearby electrodes, and one electrode may have different filtered (e.g., theta and/or gamma) signals represented.

I'm with @tjd2002 that this info should be machine-readable somewhere (say as Dimension Scales on the FilteredEphys datatype? or deprecating FilteredEphys for a single ElectricalSeries type & just using Dimension Scales to determine filtering or other channel-wise processing metadata?)

KrisBouchard commented 7 years ago

My sense is that the term 'NTrodes' is not well known across many groups (I have never heard it before and we are looking to use community accepted, simple terms).

I agree with the sentiment that the FilteredEphys info should be more machine-readable. I will note that 'filtering' will be applicable to many neurophysiology types (e.g., optical physiology and electrophysiology) [as discussed at the hackathon]. It seems that a solution to this that is general enough to cover multiple signal types would be advantageous, and as such, divorcing it from the physical description of the electrodes seems to make sense to me.

tjd2002 commented 7 years ago

I'd like to withdraw my proposal to remove the 'filtering' entry in the ElectrodeTable. I see the logic and value of capturing an instrument's per-channel filter settings (e.g. in neuromusic's example above). I suggest instead that we make explicit that this field refers to 'hardware filtering' (i.e. that done in the recording instrument at acquisition time).

@KrisBouchard @neuromusic I think that making 'filtering' information more machine-readable is desirable, but a big can of worms, and that punting everything into free-form text seems reasonable, esp for the upcoming release. (I think it is properly part of a larger discussion about capturing information about processing pipelines.)

I agree with @neuromusic that specifying filtering/processing metadata on a per-channel basis in an 'ElectricalSeries' (and therefore in classes like 'FilteredEphys', 'LFP', 'SpikeEventSeries' and 'SpikeWaveform') would be valuable, but I think that discussion should happen independently of the changes to ElectrodeGroup proposed here.

@KrisBouchard @loren I think 'ElectrodeGroup' captures the sense well, and is more general (we wouldn't call a group of skull EEG electrodes an 'nTrode', e.g.)

neuromusic commented 7 years ago

I had a conversation with our ephys folks here & one point that was raised was the extent to which the ElectrodeTable will support the arbitrary groupings that ElectrodeGroup currently supports... that is, can the user define additional columns that let them label channels with additional fields?

ajtritt commented 7 years ago

@neuromusic Yes, through the existing means of extension

neuromusic commented 7 years ago

aha. perfect.

Justin

On Mon, Sep 11, 2017 at 11:31 AM, Andrew Tritt notifications@github.com wrote:

@neuromusic https://github.com/neuromusic Yes, through the existing means of extension

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/NeurodataWithoutBorders/nwb-schema/issues/6#issuecomment-328610535, or mute the thread https://github.com/notifications/unsubscribe-auth/ABMBr5gJSbhhBXIbvNRctkMUBEv_Ewxpks5shXaCgaJpZM4OtUhn .

oruebel commented 6 years ago

@ajtritt can this issue be closed?

NeurodataWithoutBorders / nwb-schema

Restructuring of electrode group #6