raphaelvallat commented 1 year ago

Hi,

This PR introduces a first working implementation of the yasa.Hypnogram class, which will be the new standard way to deal with hypnograms in YASA moving forward, as discussed in #105.

This should greatly simplify the implementation of the performance evaluation pipeline (#78).

Remaining tasks

[x] Unit tests for 3-, 4- and 5-stage hypnograms (see coverage report)
[x] Add Hypnogram.consolidate_stages method
[ ] ~Implement the Hypnogram.plot_hypnogram function. Here I'm not sure if we should re-implement from scratch, or instead use the existing yasa.plot_hypnogram function and add support for 2-, 3- and 4-stage hypnograms. The latter would require adding an n_stages parameter to the function, however.~ @remrama will implement in a separate PR.
[ ] Add compatibility with other YASA functions (e.g. detection). For now, users can simply use Hypnogram.as_int() to get a NumPy array with integer values.

@remrama I would appreciate your review on this (but no rush!). Let me know if you see any ideas for improvements and/or new class methods or properties that may be useful.

Thanks, Raphael

codecov-commenter commented 1 year ago

Codecov Report

Base: 91.64% // Head: 92.37% // Increases project coverage by +0.72% :tada:

Coverage data is based on head (b1cc5f9) compared to base (ca4a834). Patch coverage: 99.33% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #116      +/-   ##
==========================================
+ Coverage   91.64%   92.37%   +0.72%
==========================================
  Files          22       23       +1
  Lines        2753     3054     +301
==========================================
+ Hits         2523     2821     +298
- Misses        230      233       +3
```

| [Impacted Files](https://codecov.io/gh/raphaelvallat/yasa/pull/116?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Raphael+Vallat) | Coverage Δ | |
|---|---|---|
| [yasa/hypno.py](https://codecov.io/gh/raphaelvallat/yasa/pull/116/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Raphael+Vallat#diff-eWFzYS9oeXBuby5weQ==) | `98.19% <98.94%> (+0.97%)` | :arrow_up: |
| [yasa/tests/test\_hypnoclass.py](https://codecov.io/gh/raphaelvallat/yasa/pull/116/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Raphael+Vallat#diff-eWFzYS90ZXN0cy90ZXN0X2h5cG5vY2xhc3MucHk=) | `100.00% <100.00%> (ø)` | |
| [yasa/detection.py](https://codecov.io/gh/raphaelvallat/yasa/pull/116/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Raphael+Vallat#diff-eWFzYS9kZXRlY3Rpb24ucHk=) | `97.73% <0.00%> (-0.12%)` | :arrow_down: |

Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Raphael+Vallat). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Raphael+Vallat)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

remrama commented 1 year ago

Btw, your call, but I would support merging this as-is and then adding "full functionality" in a separate PR. I'm not sure how big of a project that is, but if just the base class was in master I'd start some PRs with it.

remrama commented 1 year ago

Suggested method:

class Hypnogram:
    ...
    def summary(self): # or .get_dataframe
        """
        Return a pandas DataFrame summarizing epoch-level information.

        Column order and names are compliant with BIDS events files [BIDSevents]_
        and MNE events/annotations dataframes [MNEannotations]_.

        Returns
        -------
        summary : :py:class:`pandas.DataFrame`
            A dataframe containing epoch onset, duration, stage, etc.

        References
        ----------
        .. [BIDSevents] https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/05-task-events.html
        .. [MNEannotations] https://mne.tools/stable/glossary.html#term-annotations
        """
        data = {
            "onset": self.timedelta.total_seconds(),
            "duration": 1 / self.sampling_frequency,
            "value": self.as_int().to_numpy(),
            "description": self.hypno.to_numpy(),
            "epoch": 1 + np.arange(self.n_epochs),
        }
        if self.scorer is not None:
            data["scorer"] = self.scorer
        return pd.DataFrame(data)

raphaelvallat commented 1 year ago

@remrama agreed on the new method! What do you think about calling it Hypnogram.as_annotations() or Hypnogram.as_mne_annotations()?

Also, when start is not None, should onset be the actual datetime, or is it better to always have self.timedelta.total_seconds()? I'm guessing the latter is the standard MNE/BIDS format?

I'll wait for your reply, add this new method and then request your final approval before merging 🎉 !

remrama commented 1 year ago

> What do you think about calling it Hypnogram.as_annotations() or Hypnogram.as_mne_annotations()?

Either is cool with me, although it's probably more of a BIDS/events emphasis than an MNE/annotation emphasis.

> Also, when start is not None, should onset be the actual datetime, or is it better to always have self.timedelta.total_seconds()? I'm guessing the latter is the standard MNE/BIDS format?

Ya good question. Definitely want to always have the onset column as seconds from start (i.e., self.timedelta.total_seconds()), because this is the BIDS standard, which I think takes precedence. MNE is a bit more ambiguous (as far as I can tell), and sometimes returns onset as seconds and sometimes as timestamps.

I think in this case, if the Hypnogram includes timestamp info, it should be added as an additional column rather than replacing onset. It's not totally clear to me what this column would be called. According to the BIDS docs, it might be "sample", but I'm not sure. If unclear, I think we could just call it "timestamp".
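
A minimal sketch of the convention described here, assuming fixed-length epochs: onset always stays in seconds from the start of the recording, and wall-clock time goes into an extra column instead of replacing it. The helper `build_events`, its parameters, and the `"timestamp"` column name are hypothetical and are not part of YASA.

```python
from datetime import datetime, timedelta


def build_events(stages, epoch_sec=30.0, start=None):
    """Return one dict per epoch; `onset` is always seconds from recording start.

    Hypothetical helper for illustration only, not YASA's implementation.
    """
    rows = []
    for i, stage in enumerate(stages):
        row = {"onset": i * epoch_sec, "duration": epoch_sec, "description": stage}
        if start is not None:
            # Extra column (assumed name) added alongside `onset`, never replacing it.
            row["timestamp"] = start + timedelta(seconds=row["onset"])
        rows.append(row)
    return rows


events = build_events(
    ["WAKE", "LIGHT", "DEEP"], start=datetime(2022, 12, 1, 23, 0, 0)
)
```

Keeping onset relative means the same rows stay valid whether or not a start time is known.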

raphaelvallat commented 1 year ago

In the future, maybe we could go even one step further and add an output_type="mne" parameter to this method, which could be either "mne" (returns a mne.Annotations object), or "dataframe" (default BIDS-like dataframe, which is also compatible with EDFBrowser).
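
A rough sketch of how such a switch might look, assuming the BIDS-like dataframe has onset, duration and description columns. The function name `to_output` is hypothetical; `mne.Annotations` is the existing MNE class, imported lazily here so that MNE would remain an optional dependency.

```python
import pandas as pd


def to_output(events, output_type="dataframe"):
    """Return `events` unchanged, or converted to an mne.Annotations object.

    Hypothetical helper for illustration only, not YASA's implementation.
    """
    if output_type == "dataframe":
        return events
    if output_type == "mne":
        import mne  # lazy import: only needed for this branch

        return mne.Annotations(
            onset=events["onset"].to_numpy(),
            duration=events["duration"].to_numpy(),
            description=events["description"].to_numpy(),
        )
    raise ValueError(
        f"output_type must be 'dataframe' or 'mne', got {output_type!r}"
    )


events = pd.DataFrame(
    {
        "onset": [0.0, 30.0],
        "duration": [30.0, 30.0],
        "description": ["WAKE", "LIGHT"],
    }
)
```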

Question: why start epoch at 1 and not 0?

{"epoch": 1 + np.arange(self.n_epochs)}

> If unclear, I think we could just call it "timestamp".

On second thought, I think that for the initial implementation I would only include onset in seconds, and not the actual timestamps.

raphaelvallat commented 1 year ago

Maybe epoch could be set as the index of the resulting dataframe? Would that break things w.r.t. the BIDS/MNE annotations format?

>>> from yasa import Hypnogram
>>> hyp = Hypnogram(["W", "W", "LIGHT", "LIGHT", "DEEP", "REM", "WAKE"], n_stages=4)
>>> hyp.as_annotations()
       onset  duration  value description
epoch                                    
1        0.0      30.0      0        WAKE
2       30.0      30.0      0        WAKE
3       60.0      30.0      2       LIGHT
4       90.0      30.0      2       LIGHT
5      120.0      30.0      3        DEEP
6      150.0      30.0      4         REM
7      180.0      30.0      0        WAKE

remrama commented 1 year ago

Love the output_type idea!

> Question: why start epoch at 1 and not 0?

Ya you called me out on this :) Start it at 0. I wasn't sure about that one, it's just that technically it is the 1st epoch, and the Python indexing can be confusing for new users. But you're right, submit to Python indexing.

> I think that for the initial implementation I would only include onset in seconds, and not the actual timestamps.

Yep that's good. If someone really wants timestamps, they can add them trivially with one more line.

> Maybe epoch could be set as the index of the resulting dataframe?

That looks great! It won't break things, since neither BIDS nor MNE expects that column. It's not an incredibly useful column anyways, it's really just a row number... Leave it starting at 0, so basically we're just renaming that axis to epoch.
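
For illustration, a small pandas sketch of "just renaming that axis", using made-up values that follow the column layout shown earlier in the thread. This is an assumption about how it could be done, not the code that was merged.

```python
import pandas as pd

# Toy events table; the default RangeIndex already starts at 0.
events = pd.DataFrame(
    {
        "onset": [0.0, 30.0, 60.0],
        "duration": [30.0, 30.0, 30.0],
        "value": [0, 2, 3],
        "description": ["WAKE", "LIGHT", "DEEP"],
    }
)

# Name the index axis "epoch" without adding a column or changing any values.
events = events.rename_axis("epoch")

# When exporting, the index can be dropped so only the data columns are written.
tsv = events.to_csv(sep="\t", index=False)
```

Because epoch lives in the index rather than in a column, dropping it on export leaves only the onset, duration, value and description columns.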

raphaelvallat commented 1 year ago

Ready for your final review @remrama !

raphaelvallat commented 1 year ago

Merged to master. Thanks so much for your help on this! The next release is going to be 🔥

raphaelvallat / yasa

First implementation of the Hypnogram class #116
