mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.73k stars 1.33k forks source link

The duration calculation for display in `BaseRAW._repr_html_` doesn't account for the lenght of the last sample. #12954

Closed leorochael closed 1 week ago

leorochael commented 1 week ago

Description of the problem

The calculation of the duration display in BaseRAW._repr_html_ assumes that the length of the acquisition is the timestamp of the last sample which, for a high sampling rate, is close to the actual duration, but ignores the length of the last sample.

Depending on the acquisition length and the sampling frequency, this might result in a weird display.

Steps to reproduce

In a Jupyter Notebook, open an EDF containing data that is sampled at 500 hz and lasts exactly one hour and then display it. E.g:

import mne.io
raw_edf = mne.io.read_raw_edf("path-to-1h.edf", preload=False)
raw_edf

You'll get an output containing:

  • Acquisition
    • Duration: 00:59:60 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Instead of what would be expected:

  • Acquisition
    • Duration: 01:00:00 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Link to data

No response

Expected results

As mentioned, I expect to get:

  • Acquisition
    • Duration: 01:00:00 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Actual results

Instead I get:

  • Acquisition
    • Duration: 00:59:60 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Additional information

The code that calculates the duration is in BaseRaw._repr_html_:

        # https://stackoverflow.com/a/10981895
        duration = timedelta(seconds=self.times[-1])
        hours, remainder = divmod(duration.seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        seconds += duration.microseconds / 1e6
        seconds = np.ceil(seconds)  # always take full seconds

        duration = f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"

To fix it, instead of calculating the duration based on the time of the last sample, use the total amount of samples and divide it by the sampling frequency. E.g:

        duration = timedelta(
            seconds=len(self.times) / self.info['sfreq'],
        )
        # https://stackoverflow.com/a/10981895
        hours, remainder = divmod(duration.seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        seconds += duration.microseconds / 1e6
        seconds = np.ceil(seconds)  # always take full seconds

        duration = f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"

By the way, Pandas has a very nice representation for time deltas, if you want to remove the hours/minutes/seconds calculation (though that would mean pandas needs to be installed, even though it's an optional dependency for MNE):

>>> str( pd.Timedelta(3597, 's') )
'0 days 00:59:57'

Here is the mne.sys_info() (not that it makes a difference):

Platform Linux-6.1.85+-x86_64-with-glibc2.35 Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] Executable /usr/bin/python3 CPU x86_64 (2 cores) Memory 12.7 GB

Core ├☑ mne 1.8.0 (latest release) ├☑ numpy 1.26.4 (OpenBLAS 0.3.23.dev with 2 threads) ├☑ scipy 1.14.1 └☑ matplotlib 3.8.0 (backend=module://matplotlib_inline.backend_inline)

Numerical (optional) ├☑ sklearn 1.5.2 ├☑ numba 0.60.0 ├☑ nibabel 5.3.2 ├☑ pandas 2.2.3 ├☑ h5py 3.12.1 └☐ unavailable nilearn, dipy, openmeeg, cupy, h5io

Visualization (optional) ├☑ ipywidgets 8.1.5 └☐ unavailable pyvista, pyvistaqt, vtk, qtpy, ipympl, pyqtgraph, mne-qt-browser, trame_client, trame_server, trame_vtk, trame_vuetify

Ecosystem (optional) └☐ unavailable mne-bids, mne-nirs, mne-features, mne-connectivity, mne-icalabel, mne-bids-pipeline, neo, eeglabio, edfio, mffpy, pybv

welcome[bot] commented 1 week ago

Hello! 👋 Thanks for opening your first issue here! ❤️ We will try to get back to you soon. 🚴