Closed remrama closed 1 year ago
YES!! This will be so helpful also for #105. I also vote for a). I would also add the option to return 2, 3, 4 or 5 stages, i.e. the original hypnogram is always generated with 5 stages but then the function includes a module to combine the sleep stages together.
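To make the 2/3/4/5-stage idea concrete, here is a rough sketch of a post-hoc merging step. The function name `collapse_stages`, the string labels, and the particular groupings (light = N1 + N2, and so on) are assumptions for illustration, not something already in the draft function.

```python
# Hypothetical groupings: each map sends a 5-stage label to its merged label.
STAGE_MAPS = {
    5: {"WAKE": "WAKE", "N1": "N1", "N2": "N2", "N3": "N3", "REM": "REM"},
    4: {"WAKE": "WAKE", "N1": "LIGHT", "N2": "LIGHT", "N3": "DEEP", "REM": "REM"},
    3: {"WAKE": "WAKE", "N1": "NREM", "N2": "NREM", "N3": "NREM", "REM": "REM"},
    2: {"WAKE": "WAKE", "N1": "SLEEP", "N2": "SLEEP", "N3": "SLEEP", "REM": "SLEEP"},
}


def collapse_stages(hypno, n_stages):
    """Merge a 5-stage hypnogram (list of labels) down to n_stages categories."""
    if n_stages not in STAGE_MAPS:
        raise ValueError("n_stages must be 2, 3, 4 or 5")
    mapping = STAGE_MAPS[n_stages]
    return [mapping[stage] for stage in hypno]


five_stage = ["WAKE", "N1", "N2", "N3", "N2", "REM", "WAKE"]
three_stage = collapse_stages(five_stage, n_stages=3)
two_stage = collapse_stages(five_stage, n_stages=2)
```

Simulating with 5 stages first and merging afterwards keeps a single transition matrix as the source of truth, so the coarser hypnograms stay consistent with the fine-grained one.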
This next YASA version should be named version REMRAMA :) We need to find a way to make you an official YASA core maintainer / developer.
I think it would be nice to have the ability to quickly simulate a hypnogram of arbitrary length. I find it useful for testing things out, especially when needing a large number of hypnograms (e.g., to simulate group results). Having a single function to do this is quick, and the randomness might help to hit test cases that wouldn't come up when making more stereotyped repeated arrays.
I've also been thinking this might serve as a good type of permutation testing for evaluating automated-staging performance. It's typical in other fields to test the classifier against shuffled labels, but sleep data is so autocorrelated that shuffling makes little sense. It might be cool to test against a very naive model that only knows standard transition probabilities.
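A rough sketch of how that comparison might look: score the classifier's agreement with the reference hypnogram, then compare it with the agreement reached by many hypnograms drawn from a Markov chain that only knows transition probabilities. Everything here is hypothetical, including the helper names, the toy 3-stage matrix (the numbers are invented), and the use of raw epoch-by-epoch agreement as the metric.

```python
import random

# Toy 3-stage chain (0=Wake, 1=NREM, 2=REM). Probabilities are invented for illustration.
TOY_PROBAS = [
    [0.90, 0.10, 0.00],  # from Wake
    [0.03, 0.92, 0.05],  # from NREM
    [0.05, 0.05, 0.90],  # from REM
]


def markov_hypno(n_epochs, trans_probas, rng, start=0):
    """Draw a first-order Markov chain of stage indices of length n_epochs."""
    stages = list(range(len(trans_probas)))
    hypno = [start]
    while len(hypno) < n_epochs:
        hypno.append(rng.choices(stages, weights=trans_probas[hypno[-1]])[0])
    return hypno


def agreement(a, b):
    """Fraction of epochs on which two equal-length hypnograms agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def naive_baseline_test(reference, predicted, trans_probas, n_sims=200, seed=None):
    """Compare observed agreement with a null built from Markov-simulated hypnograms."""
    rng = random.Random(seed)
    observed = agreement(reference, predicted)
    null = [
        agreement(reference, markov_hypno(len(reference), trans_probas, rng))
        for _ in range(n_sims)
    ]
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (1 + sum(a >= observed for a in null)) / (1 + n_sims)
    return observed, null, p_value


# Stand-in data: a simulated "true" hypnogram and a prediction wrong on every 10th epoch.
reference = markov_hypno(200, TOY_PROBAS, random.Random(0))
predicted = list(reference)
for i in range(0, len(predicted), 10):
    predicted[i] = (predicted[i] + 1) % 3

observed, null, p_value = naive_baseline_test(reference, predicted, TOY_PROBAS, seed=1)
```

Unlike shuffled labels, the simulated hypnograms keep realistic run lengths, so the null reflects what a model with no access to the EEG could achieve.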
I've worked up a `simulate_hypno` function here that generates a hypnogram of arbitrary length. It only needs a transition probability matrix, and while the user could pass one in, the default is to use an open-source one based on 68 overnights available from Metzner et al., 2021, Sleep as a random walk: a super-statistical analysis of EEG data across sleep stages. The Metzner paper has a better model than what I've implemented, including more statistical priors for a Bayesian model. That could also be implemented (the code is also online and it's a short single file). I think that would be nice to add, maybe with a `model="markov"` or `"bayesian"` argument added to the current `simulate_hypno` function.

What do you think? I could see (a) adding the naive markov simulator in after a little more work and then hanging on another PR for the bayesian one, (b) waiting to implement both simultaneously, or (c) implement neither. I prefer (a) because I don't have plans to work on the bayesian one anytime soon, but am cool with whatever.
Simple case of getting a quick hypno.
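A minimal sketch of what the simple case could look like, assuming a first-order Markov chain over five stages. `simulate_hypno_sketch` is a hypothetical stand-in rather than the function from the linked code, and the transition matrix holds made-up placeholder values, not the Metzner et al. estimates.

```python
import random

# Integer stage codes: 0=Wake, 1=N1, 2=N2, 3=N3, 4=REM.
STAGES = [0, 1, 2, 3, 4]

# Placeholder transition probabilities (row = current stage, column = next stage).
# These numbers are invented for illustration and are NOT the Metzner et al. values.
TRANS_PROBAS = [
    [0.90, 0.08, 0.01, 0.00, 0.01],  # from Wake
    [0.10, 0.60, 0.28, 0.00, 0.02],  # from N1
    [0.02, 0.03, 0.88, 0.05, 0.02],  # from N2
    [0.01, 0.00, 0.07, 0.92, 0.00],  # from N3
    [0.03, 0.02, 0.03, 0.00, 0.92],  # from REM
]


def simulate_hypno_sketch(n_epochs, trans_probas=TRANS_PROBAS, start=0, seed=None):
    """Return a list of n_epochs stage codes drawn from a first-order Markov chain."""
    if n_epochs < 1:
        raise ValueError("n_epochs must be at least 1")
    rng = random.Random(seed)  # a local generator keeps the result reproducible
    hypno = [start]
    for _ in range(n_epochs - 1):
        row = trans_probas[hypno[-1]]  # probabilities of leaving the current stage
        hypno.append(rng.choices(STAGES, weights=row)[0])
    return hypno


# 960 epochs of 30 seconds each is 8 hours in bed.
hypno = simulate_hypno_sketch(n_epochs=960, seed=42)
```

Passing a `seed` makes a simulated night reproducible, which matters when a test case needs to be revisited later.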
Basing the fake hypno off a subject's known sleep transitions.
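One way this could work, sketched under assumptions: count the stage-to-stage transitions in the subject's scored hypnogram, normalize each row, and hand the resulting matrix to the simulator in place of the default. `estimate_trans_probas` is a hypothetical helper, and the short hypnogram below is invented example data.

```python
from collections import Counter


def estimate_trans_probas(hypno, stages=(0, 1, 2, 3, 4)):
    """Row-normalized counts of stage-to-stage transitions in a hypnogram."""
    counts = Counter(zip(hypno[:-1], hypno[1:]))  # (from, to) pairs of adjacent epochs
    probas = []
    for s_from in stages:
        row = [counts[(s_from, s_to)] for s_to in stages]
        total = sum(row)
        if total:
            probas.append([c / total for c in row])
        else:
            # Stage never left (or never seen): fall back to staying put,
            # so every row still sums to 1.
            probas.append([1.0 if s_to == s_from else 0.0 for s_to in stages])
    return probas


# Invented example: 0=Wake, 1=N1, 2=N2, 3=N3, 4=REM.
subject_hypno = [0, 0, 1, 2, 2, 2, 3, 3, 2, 4, 4, 0]
subject_probas = estimate_trans_probas(subject_hypno)
```

A single night gives few observations for rare transitions, so some rows will be coarse estimates; pooling several nights from the same subject would likely smooth them.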