mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.66k stars 1.31k forks source link

Inconsistent handling and documentation of pupil size units #12756

Open sappelhoff opened 1 month ago

sappelhoff commented 1 month ago

Proposed documentation enhancement

In https://mne.tools/dev/documentation/implementation.html, it says that gaze and pupil data are stored in arbitrary units.

However, in https://mne.tools/dev/auto_tutorials/io/70_reading_eyetracking_data.html, it says:

When possible, MNE-Python will internally convert and store eyetracking data according to an SI unit (for example radians for position data, and meters for pupil size).

When calling raw.get_data(picks=["pupil"], unit="mm"), an error is raised ... however from inspecting my data when getting them without unit, the values DO seem to be in "m" and I can convert to "mm" manually by * 1e3.

EDIT: one important caveat: This is data from a Tobii Pro Spectrum 600 that I imported manually into mne ... and I set the units using mne.preprocessing.eyetracking.set_channel_types_eyetrack

scott-huberty commented 1 month ago
  1. it says that gaze and pupil data are stored in arbitrary units.

For Eyelink devices (which I guess were our only use-case at the time of writing the doc), eyegaze data are reported in pixels and pupil data are reported in an arbitrary unit (not meters or mm), which, if I recall correctly, is why we wrote that.

  1. However, in https://mne.tools/dev/auto_tutorials/io/70_reading_eyetracking_data.html, it says:

    When possible, MNE-Python will internally convert and store eyetracking data according to an SI unit (for example radians for position data, and meters for pupil size).

You are right that this is incorrect. For example, If a user explicitly converts the eyegaze units from pixels to radians using convert_units, then the eyegaze unit will be SI. But as of now at-least, MNE never tries to convert to an SI unit under the hood on behalf of the user.

And we have no function to convert non-SI pupil data to SI. We wanted to eventually implement this, but have yet to get around to it...

  1. When calling raw.get_data(picks=["pupil"], unit="mm"), an error is raised ... however from inspecting my data when getting them without unit, the values DO seem to be in "m" and I can convert to "mm" manually by * 1e3. ... This is data from a Tobii Pro Spectrum 600 that I imported manually into mne ...

sappelhoff commented 1 month ago

Thanks for your responses, Scott!

tobii stores it in mm, and I have told MNE explicitly via set_channel_types_eyetrack :-)

Not directly related, but maybe it's worth reaching out to the Tobii folks to see if they have or could write a Python reader for their devices (a la mffpy), that we could then use in an official read_raw_tobii reader?

Whenever I have used tobii, it wasn't done in the same way as with the Eyelink. That is, afaik there is no Tobii eyetracking data format. You can get the variables in realtime via the SDK and then usually save it to file yourself (most often CSV / TSV), only including the aspects of the data that you need. πŸ€” but having that said, I am also not too experienced with Tobii. This current data I am working on has been recorded via LSL, so it's in XDF format.

scott-huberty commented 1 month ago

tobii stores it in mm, and I have told MNE explicitly via set_channel_types_eyetrack :-)

ah, so we have a bug to fix? πŸ™‚

What error is thrown when you do raw.get_data(picks=["pupil"], unit="mm")

Because I would expect that

set_channel_types_eyetrack(
   mapping={"pupil_right": ("pupil", "mm", "right")}
)

should convert the pupil channel data from millimeters to meters, and set the unit to FIFF_UNIT_M.

Then, I would expect raw.get_data(picks=["pupil"], unit=["mm"]) to check that the pupil data can be converted to millimeters (i.e., the unit should be FIFF_UNIT_M), and then convert the data without problems.

sappelhoff commented 1 month ago

ah, so we have a bug to fix? πŸ™‚

I think so πŸ‘Ό

Here is a MWE that shows the error. I marked all the bits that I find unexpected.

# %%
import numpy as np
import mne

sfreq = 100
n = sfreq + 0  # 10 seconds
ch_names = ["my_pupil"]
ch_types = "pupil"
info = mne.create_info(ch_names, sfreq, ch_types)
mydata = np.atleast_2d(np.random.randint(2, 8, n))
raw = mne.io.RawArray(mydata, info, verbose=False)
ch_type_mapping_eye = {
    "my_pupil": ("pupil", "mm", "right"),
}
raw = mne.preprocessing.eyetracking.set_channel_types_eyetrack(raw, ch_type_mapping_eye)

# plots units in AU ... although we set mm ... unexpected
raw.plot()

# ValueError: mm is not a valid unit for pupil, use a sub-multiple of AU instead. ... unexpected
raw.get_data(picks="pupil", units="mm")

# True, as defined and expected
assert mydata.mean() > 2 and mydata.mean() < 8

# AssertionError ... unexpected given the ValueError we got when we tried to get the units in mm
mydata_from_raw = raw.get_data()
assert mydata_from_raw.mean() > 2 and mydata_from_raw.mean() < 8

# True: mean is my_data / 1e3 ... unexpected, because it shows that the units WERE converted from mm to m
assert mydata.mean() / 1e3 == mydata_from_raw.mean()

# %%
scott-huberty commented 1 month ago

Issue 1

plots units in AU ... although we set mm ... unexpected raw.plot()

This issue has come up before, We discussed it in https://github.com/mne-tools/mne-python/issues/11879#issue-1850032365 but I have not been able to get to it yet.

The upshot is that until Eyetracking data came along, I think MNE could always hardcode a reasonable scaling for a given channel based on its expected SI unit (e.g. EEG -> volts -> 1e6).

But eye-tracking data can be stored in a standard unit (e.g. Tobii reports pupil size in mm) or in arbitrary units that can't be trivially converted to SI. (e.g. Eyelink reports pupil size in AU).

I guess for pupil channels, we chose a default plot scaling that would be best for data that were in arbitrary units..

Currently, the proposed solution is to create a mapping between each possible unit and a suitable scaling factor, then use this mapping in raw.plot, but we haven't done this yet (it will probably be a lot of work!).


Issue 2

ValueError: mm is not a valid unit for pupil, use a sub-multiple of AU instead. ... unexpected raw.get_data(picks="pupil", units="mm")

I think this suffers from the same issue... Since get_data calls

https://github.com/mne-tools/mne-python/blob/868746b427f8123dd0e6b4594a2bd18c7fa49331/mne/io/base.py#L943-L946

which eventually calls _handle_defaults:

mne.defaults._handle_default("si_units")
output ``` {'mag': 'T', 'grad': 'T/m', 'eeg': 'V', 'eog': 'V', 'ecg': 'V', 'emg': 'V', 'misc': 'AU', 'seeg': 'V', 'dbs': 'V', 'dipole': 'Am', 'gof': 'GOF', 'bio': 'V', 'ecog': 'V', 'hbo': 'M', 'hbr': 'M', 'ref_meg': 'T', 'fnirs_cw_amplitude': 'V', 'fnirs_fd_ac_amplitude': 'V', 'fnirs_fd_phase': 'rad', 'fnirs_od': 'V', 'csd': 'V/mΒ²', 'whitened': 'Z', 'gsr': 'S', 'temperature': 'C', 'eyegaze': 'AU', 'pupil': 'AU'} ```

So even though your pupil channel correctly has a unit FIFF_UNIT_M, MNE won't consider converting the data to any unit that isn't a submultiple of AU... @larsoner does my hypothesis sound right to you?


Issues 3-4

AssertionError ... unexpected given the ValueError we got when we tried to get the units in mm mydata_from_raw = raw.get_data() assert mydata_from_raw.mean() > 2 and mydata_from_raw.mean() < 8

True: mean is my_data / 1e3 ... unexpected, because it shows that the units WERE converted from mm to m assert mydata.mean() / 1e3 == mydata_from_raw.mean()

I agree that this is confusing! The ValueError suggested that your data is in AU, but set_channel_types_eyetrack converted your data to meters (an SI unit):

https://github.com/mne-tools/mne-python/blob/868746b427f8123dd0e6b4594a2bd18c7fa49331/mne/preprocessing/eyetracking/eyetracking.py#L110-L112

I think/hope that by solving the first two issues, this would be solved too?

Side note, we should add a few instances of logger.info to set_channel_types_eyetrack so that it is explicitly clear when and how we are converting data on the behalf of the user.

sappelhoff commented 1 month ago

Thanks for looking into it, Scott. This all makes sense to me and I am aware that fixing some of these things will involve a lot of work. I cannot offer this at this point, unfortunately, but hopefully either you, me, or someone else can pick this up in the future.

I think/hope that by solving the first two issues, this would be solved too?

I think so, too!

larsoner commented 1 month ago

Changing the plotting functions to allow multiple unit types per channel type is doable but I suspect will be (very) hard to get right.

In the meantime, a potential in-between solution could be to change the plotting functions from assuming that the SI unit for eyegaze and pupil are AU to, say, rad and m. Then plotting will work assuming you've done the leg work to convert your AU data to physical units.

This isn't entirely unreasonable, assuming we can settle on the reasonable "standard" SI units. I think m for pupil makes sense. But for eye gaze I guess in theory you could say radians, meters, or pixels (which probably isn't SI) could potentially make sense. Is there a standard in the literature for how people typically report and/or plot eyegaze data? For example if 90% of publications show eye gaze in visual angle then rad makes sense. And we can just standardize around this in MNE-Python as much as possible, showing people how to get their Tobii or Eyelink data to be in these standard SI units. Then when wacky stuff happens like you actually want to plot gaze in pixels you have to do some extra work it's more acceptable...

scott-huberty commented 1 month ago

Thanks both for your input.

In the meantime, a potential in-between solution could be to change the plotting functions from assuming that the SI unit for eyegaze and pupil are AU to, say, rad and m.

I think I am +1 for this:

So to be consistent with ourselves, we can assume the pupil size unit is in meters. The downside to this is that the native values stored by Eyelink are px and au, so plotting won't work well out of the box. They will have to converts the units or passes scalings="auto".

I think m for pupil makes sense. ... Is there a standard in the literature for how people typically report and/or plot eyegaze data?

Most publications report pupil size in m and eyegaze data in degrees. So I agree rad is the preferred SI unit for eyegaze.

We have a function to help users convert their eyegaze data to radians. As long as the visual angle isn't too extreme (<30ΒΊ), the conversion should be approximately linear.

We don't have a function to help Eyelink users convert their pupil size data from AU to meters. It isn't trivial.

And we can just standardize around this in MNE-Python as much as possible, showing people how to get their Tobii or Eyelink data to be in these standard SI units

Agree!! We have a few different tutorials that cover all of our current functionality. But I'm open to suggestions on how we can improve the documentation.

larsoner commented 1 month ago

The downside to this is that the native values stored by Eyelink are px and au, so plotting won't work well out of the box. They will have to converts the units or passes scalings="auto".

This part I'm okay with. People can standardize around one of their first steps being to do this conversion (it probably already is an early step anyway!) then things should work okay.

@scott-huberty WDYT about trying to adapt all existing eye tracking tutorials / examples to do this step as early as possible, and then (maybe in the same PR) try changing mne.defaults._handle_default("si_units") and related stuff to assume that data are in SI units, and see if it simplifies some end-user stuff? I'm hoping it's a < 200 line PR that makes the examples simpler rather than more complicated...

scott-huberty commented 1 month ago

@scott-huberty WDYT about trying to adapt all existing eye tracking tutorials / examples to do this step as early as possible, and then (maybe in the same PR) try changing mne.defaults._handle_default("si_units") and related stuff to assume that data are in SI units, and see if it simplifies some end-user stuff?

Sure!

I'm hoping it's a < 200 line PR that makes the examples simpler rather than more complicated...

Challenge accepted πŸ‹οΈ