kymata-atlas / kymata-core

Core Kymata codebase, including statistical analysis and plotting tools
https://kymata.org
MIT License

`ExpressionSet` can handle results from sensor data #63

Closed caiw closed 11 months ago

caiw commented 11 months ago

At the moment, ExpressionSets only deal in source-space (hexel) data. But it's also useful to work with sensor-space data, especially as it's much smaller and faster to work with.

ExpressionSets should therefore expand to incorporate sensor-space data.

Things to discuss

  1. [x] Hemisphere splitting.
    • In source space we explicitly separate left vs right, and this is reflected in the expression plots.
    • In sensor space we don't (necessarily) have left versus right, so will have to deal with this differently.
    • In ExpressionSet, we use xarray.Dataset objects to store the ensemble data, with different layers for .left and .right. The good news is that if we had a different data structure with a single layer, all the same operations would still work (this is the beauty of xarray). However, we would need to incorporate this both into the data structure and into the .nkg file format.
  2. [x] Sensor names for filtering
    • In hexel space, we have canonical integer IDs for each hexel. In sensor space, sensors have specific names, which are strings.
    • In addition to uniquely identifying the sensors, these names include the type of sensor, e.g. EEG vs magnetometers vs gradiometers. This might allow for quick filtering of results by sensor type (e.g. EEG vs MEG). But how much of this kind of interface do we want to include in ExpressionSet?
  3. [x] Bad channels?
  4. [x] How should the ExpressionSet record whether it's sensor- or source-space?
    • Do we want to enforce any kind of naming convention on this, à la MNE?
    • (@caiw doesn't like this idea)
    • We will need to add a test to make sure that ExpressionSets of different channel dimensions can't be added together.
  5. [x] File format
    • The data will be quite different between sensor and hexel space, in terms of the .nkg file format (see #13, #19), which currently stores left- and right-hemisphere data in separate files. We will need to think about this.
    • The final big thing to decide on is: are there just two natural forms of ExpressionSet, sensor and hexel, or is there the possibility for extra versions in the future?
    • Does the move to OPM-MEG or other MEG/EEG models or other hexel decimations relate to this?
    • If there are just two, we can implement them as special cases. If it may expand further in future, it might be worth thinking about a more general structure.
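
To illustrate the single-layer point from item 1, here is a minimal sketch using xarray. The dimension and layer names (`hexel`, `sensor`, `latency`, `function`, `scalp`), the sensor names, and the helper `smallest_value_per_channel` are assumptions made for illustration and are not taken from kymata-core. It only shows that one reduction can run unchanged on a two-layer (left/right) dataset and on a single-layer one.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
latencies = np.linspace(-0.2, 0.8, 5)
functions = ["func_a", "func_b"]

# Source space: one layer per hemisphere (layer and dimension names are assumptions).
source = xr.Dataset(
    {
        "left": (("hexel", "latency", "function"), rng.random((4, 5, 2))),
        "right": (("hexel", "latency", "function"), rng.random((4, 5, 2))),
    },
    coords={"hexel": np.arange(4), "latency": latencies, "function": functions},
)

# Sensor space: a single layer, indexed by string sensor names (made-up names).
sensor = xr.Dataset(
    {"scalp": (("sensor", "latency", "function"), rng.random((3, 5, 2)))},
    coords={
        "sensor": ["MEG0111", "MEG0112", "EEG001"],
        "latency": latencies,
        "function": functions,
    },
)


def smallest_value_per_channel(dataset: xr.Dataset) -> xr.Dataset:
    """Reduce over latency and function, keeping the channel dimension of every layer."""
    return dataset.min(dim=["latency", "function"])


# The same call works regardless of how many layers the dataset has.
best_source = smallest_value_per_channel(source)
best_sensor = smallest_value_per_channel(sensor)
```

The reduction is applied per layer, so the caller does not need to know whether the dataset holds hemispheres or a single sensor layer.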
caiw commented 11 months ago

@neukym Here's the issue for the thing we discussed just now. Please feel free to add extra things-to-decide at the bottom, and we can discuss them and make decisions at a later date.

neukym commented 11 months ago

Great draft. I can answer one of the above now: we won't need to include bad channels - these will have been interpolated during preprocessing, so we should always have the full array.

caiw commented 11 months ago

Decision: We need L/R info for plotting, but this could be supplied in config files or helper functions, not in the .nkg files themselves.
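
A rough sketch of what such a helper could look like, assuming a simple mapping from sensor name to hemisphere that might be loaded from a config file. The mapping contents, the name `SENSOR_HEMISPHERES`, and the function `split_by_hemisphere` are all hypothetical, and the left/right assignments are invented for the example.

```python
from typing import Dict, Iterable, List

# Hypothetical config content: sensor name -> hemisphere label.
# In practice this could be read from a config file rather than hard-coded.
SENSOR_HEMISPHERES: Dict[str, str] = {
    "MEG0111": "left",
    "MEG1411": "right",
    "EEG001": "left",
    "EEG002": "right",
}


def split_by_hemisphere(
    sensor_names: Iterable[str], layout: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group sensor names by the hemisphere given in the layout mapping.

    Raises KeyError for a sensor that is missing from the layout.
    """
    groups: Dict[str, List[str]] = {"left": [], "right": []}
    for name in sensor_names:
        groups[layout[name]].append(name)
    return groups


groups = split_by_hemisphere(["MEG0111", "EEG002", "MEG1411"], SENSOR_HEMISPHERES)
```

Keeping this lookup outside the data file means the stored sensor data stays a single layer, and only the plotting code needs the left/right split.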

caiw commented 11 months ago

Decision: sensor names stored in index, with their "MEG", "EEG" prefixes
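
With the prefixes kept in the names, filtering by sensor type can be a plain string check. A minimal sketch, assuming made-up sensor names and a hypothetical helper `filter_sensors_by_type` (not an existing ExpressionSet method):

```python
from typing import Iterable, List


def filter_sensors_by_type(sensor_names: Iterable[str], prefix: str) -> List[str]:
    """Return the sensor names that start with the given type prefix, in order."""
    return [name for name in sensor_names if name.startswith(prefix)]


# Example index of sensor names (invented for illustration).
sensor_index = ["MEG0111", "MEG0112", "EEG001", "EEG002"]

meg_sensors = filter_sensors_by_type(sensor_index, "MEG")
eeg_sensors = filter_sensors_by_type(sensor_index, "EEG")
```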

caiw commented 11 months ago

Decision: the load function needs to know whether it's loading sensor or source space. This should be stored in the file.
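
One way this could work is sketched below. The container layout here (a zip archive with a JSON member called `metadata.json` holding a `data-space` key) is an assumption made purely for illustration and is not the actual .nkg specification. The function names are hypothetical too.

```python
import io
import json
import zipfile

VALID_SPACES = {"sensor", "source"}


def write_example_file(space: str) -> bytes:
    """Build an in-memory archive whose metadata records the data space."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("metadata.json", json.dumps({"data-space": space}))
    return buffer.getvalue()


def read_data_space(raw: bytes) -> str:
    """Read the data space from the metadata so the loader can pick the right branch."""
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        metadata = json.loads(archive.read("metadata.json"))
    space = metadata.get("data-space")
    if space not in VALID_SPACES:
        raise ValueError(f"Unknown or missing data space: {space!r}")
    return space


space = read_data_space(write_example_file("sensor"))
```

A loader could then dispatch on the returned value, reading two hemisphere layers for `"source"` and a single layer for `"sensor"`.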

caiw commented 11 months ago

Decision: it's just source (with hemispheres) / sensor (without hemispheres). EEG/MEG/OPM are just examples of sensor data. We don't need to future-proof for ECoG etc.
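
A small sketch of the two-case design, together with the check that sets with different channel dimensions can't be added. The class `SketchExpressionSet` and its fields are hypothetical stand-ins, not the real ExpressionSet, and it assumes that adding two sets means merging their functions over the same channels.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SketchExpressionSet:
    """Toy stand-in that records only the data space, channels and function names."""

    space: str  # either "source" or "sensor"
    channels: Tuple
    functions: Tuple[str, ...]

    def __add__(self, other: "SketchExpressionSet") -> "SketchExpressionSet":
        if not isinstance(other, SketchExpressionSet):
            return NotImplemented
        # Refuse to combine sets from different spaces or with different channels.
        if self.space != other.space or self.channels != other.channels:
            raise ValueError("Cannot add sets with different channel dimensions")
        return SketchExpressionSet(
            self.space, self.channels, self.functions + other.functions
        )


sensor_a = SketchExpressionSet("sensor", ("MEG0111", "EEG001"), ("func_a",))
sensor_b = SketchExpressionSet("sensor", ("MEG0111", "EEG001"), ("func_b",))
source_a = SketchExpressionSet("source", (0, 1, 2), ("func_c",))

combined = sensor_a + sensor_b

try:
    sensor_a + source_a
    mixed_addition_rejected = False
except ValueError:
    mixed_addition_rejected = True
```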