saltwater-tensor closed this 3 months ago
So then, does it dramatically alter the original states of the HMM?
This depends on your data. In most scenarios, because you initialise to the group-level means/covariances/alphas, you'll converge to a local optimum close to the group description when you re-estimate for a particular subject. However, it is possible to move away if the data are very different from the group, though I think this is unlikely.
For example, group-level state 1 shows strong frontal delta power. Will individual-level state 1 start showing completely different spatial and spectral characteristics?
I would be surprised if this happens, it's something you would have to try and see.
Then how are we supposed to define a "state"?
If you fine tune on a particular subject, you're learning a set of states for that particular subject.
And what exactly are the parameters that are trained when dual estimation is run?
Dual estimation is the more common approach. Here, only the state means and covariances are learnt, holding the state probabilities (alphas) fixed from the group-level description. The states are shared across all subjects, because you're fitting the group-level alphas to each subject; you're just learning the subject-specific observation model (means, covariances). This is the recommended approach for obtaining subject-specific features.
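Concretely, holding the group-level alphas fixed, the subject-specific estimates reduce to alpha-weighted sample statistics (a standard weighted-Gaussian sketch; the toolbox's exact updates may differ in detail):

$$
\mu_k = \frac{\sum_t \alpha_{t,k}\, x_t}{\sum_t \alpha_{t,k}}, \qquad
\Sigma_k = \frac{\sum_t \alpha_{t,k}\,(x_t - \mu_k)(x_t - \mu_k)^\top}{\sum_t \alpha_{t,k}}
$$

where $x_t$ is the subject's data at time $t$ and $\alpha_{t,k}$ is the (fixed) group-level probability of state $k$ at time $t$.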
For example which layers' weights are set to trainable in TensorFlow?
For the HMM:
the alphas for the subject are calculated with the group-level model and held fixed, and you learn the subject/state-specific means and covariances.
Are the global HMM state covariances not used at all?
Neither fine tuning nor dual estimation uses the group-level means/covariances directly.
Thank you for your comments.
Would the following understanding be correct then:
Estimate a group level model.
Now let us say we have a new time series. We will prepare this new time series exactly the way the original dataset was produced.
Step 1: Calculate the alphas using the group-level model.
Step 2: Perform dual estimation for this time series using the group-level model and the alphas obtained in Step 1.
Repeat for different time series under various experimental conditions.
Then we can say that the states are shared across different experimental conditions and time series. And that the observation model is adjusted for each time series.
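Those steps can be sketched in NumPy (a simplified, hypothetical illustration: here the alphas come from the group observation model alone, whereas real group-level HMM inference would also use the transition probabilities):

```python
import numpy as np

def group_alphas(x, means, covs):
    # Step 1 (simplified): per-sample state probabilities under the
    # group observation model. Real HMM inference would also use the
    # group transition matrix (forward-backward), not just these
    # per-sample responsibilities.
    ll = np.stack(
        [
            -0.5 * (np.einsum("ij,jk,ik->i", x - m, np.linalg.inv(c), x - m)
                    + np.linalg.slogdet(c)[1])
            for m, c in zip(means, covs)
        ],
        axis=1,
    )
    ll -= ll.max(axis=1, keepdims=True)
    a = np.exp(ll)
    return a / a.sum(axis=1, keepdims=True)

def dual_estimate(x, alpha):
    # Step 2: alpha-weighted means/covariances for this time series,
    # with the alphas themselves held fixed.
    means, covs = [], []
    for k in range(alpha.shape[1]):
        w = alpha[:, k:k + 1]
        m = (w * x).sum(axis=0) / w.sum()
        d = x - m
        means.append(m)
        covs.append((w * d).T @ d / w.sum())
    return np.array(means), np.array(covs)

# Toy "group-level" description and one new time series.
rng = np.random.default_rng(0)
group_means = np.array([[-2.0, 0.0], [2.0, 0.0]])
group_covs = np.stack([np.eye(2), np.eye(2)])
x = rng.standard_normal((400, 2)) + np.repeat(group_means, 200, axis=0)

alpha = group_alphas(x, group_means, group_covs)   # Step 1
sub_means, sub_covs = dual_estimate(x, alpha)      # Step 2
# Repeat for each new time series / experimental condition.
```

The states remain the group's (the alphas are fixed); only the observation model is re-estimated per time series.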
Also, during dual estimation, does the initialization use the group-level model?
Thanks for your help.
Your description is correct. The steps you described are what we would typically do and are the currently recommended approach.
Note:
We will prepare this new time series exactly the way the original dataset was produced.
If you do PCA then you should apply the PCA components from the training dataset to your new dataset when you prepare it (rather than calculating a new PCA on the new dataset independently).
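For example (a NumPy sketch using SVD-based PCA; the toolbox's own data preparation stores its components, so in practice you would reuse those rather than these hypothetical ones):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fit PCA on the TRAINING data only.
train = rng.standard_normal((1000, 64))   # training time series
train_mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - train_mean, full_matrices=False)
components = vt[:10].T                    # (64, 10) projection matrix

# Apply the SAME mean and components to the new data -- do not refit.
new = rng.standard_normal((500, 64))      # new time series
new_reduced = (new - train_mean) @ components
```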
Also, during dual estimation, does the initialization use the group-level model?
It does, provided you use the model after training it (calling fit) or have loaded a trained model, i.e. as long as you have not built a new model from scratch. If you're unsure, you can call model.get_means_covariances() and verify that the group-level estimates are returned; then you know it'll use the group-level means/covariances in calculating the alphas for your new data.
In all the examples provided in the toolbox, there are instances where dual estimation is performed.
But effectively, the subject- or session-specific covariances and means are never used.
Furthermore, the temporal properties of HMM states as well as the spectral properties estimated using multitaper are purely dependent upon the alphas.
And since the alphas during dual estimation are estimated using the group-level model, I don't see the point of even performing dual estimation.
Or have I misunderstood some aspect of dual estimation?
Are the subject- or session-specific covariances somehow influencing the temporal and spectral properties during dual estimation? As far as I can see, they are not.
In fine tuning, clearly, the subject-specific tuned model will change alpha, covariances and means. There, I can see how it will affect the final properties one would calculate post-hoc.
How have the dual estimates of covariance and mean been used before?
But effectively, the subject- or session-specific covariances and means are never used.
Furthermore, the temporal properties of HMM states as well as the spectral properties estimated using multitaper are purely dependent upon the alphas.
And since the alphas during dual estimation are estimated using the group-level model, I don't see the point of even performing dual estimation.
These are included as example code in case the user requires them. In fMRI these are all that's available. In MEG, the multitaper is itself a form of dual estimation, which offers higher temporal resolution, so it is generally preferred.
Are the subject- or session-specific covariances somehow influencing the temporal and spectral properties during dual estimation? As far as I can see, they are not.
They are not.
In fine tuning, clearly, the subject-specific tuned model will change alpha, covariances and means. There, I can see how it will affect the final properties one would calculate post-hoc.
Yeah, the fine-tuned alphas could be used instead of the group-level alphas, provided they have not deviated so much that you no longer get a good correspondence between subjects. Knowing the observation model is shared between subjects would be the reason to use the group-level alphas.
How have the dual estimates of covariance and mean been used before?
Normally for downstream tasks such as decoding:
https://github.com/OHBA-analysis/osl-dynamics/tree/main/examples/decoding/hmm.
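As a toy illustration of that idea (synthetic features and a simple nearest-centroid classifier, not the linked example's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Dual-estimated covariances: (n_subjects, n_states, n, n).
n_subjects, n_states, n = 20, 4, 5
covs = rng.standard_normal((n_subjects, n_states, n, n))
covs = covs @ covs.transpose(0, 1, 3, 2)   # make them symmetric PSD
labels = np.arange(n_subjects) % 2         # e.g. patient vs control

# Flatten the upper triangle of each state covariance into features.
iu = np.triu_indices(n)
features = covs[:, :, iu[0], iu[1]].reshape(n_subjects, -1)

# Nearest-centroid "decoder" over the two classes.
centroids = np.stack([features[labels == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(
    np.linalg.norm(features[:, None, :] - centroids[None], axis=-1), axis=1
)
accuracy = (pred == labels).mean()
```

With real dual-estimated covariances, the features would carry subject-specific structure rather than this random noise.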
Thanks a lot for your quick responses.
Hi all,
When one is fine-tuning the HMM, the fine-tuning function says it is adjusting the observation model and estimating alphas.
So then, does it dramatically alter the original states of the HMM?
Will the spectral content that a group level state is capturing completely change after fine tuning?
For example, group-level state 1 shows strong frontal delta power. Will individual-level state 1 start showing completely different spatial and spectral characteristics?
Then how are we supposed to define a "state"?
And what exactly are the parameters that are trained when dual estimation is run? For example which layers' weights are set to trainable in TensorFlow? Are the global HMM state covariances not used at all?
Any comments or interpretations will be helpful.