hesim-dev / hesim

Health economic simulation modeling and decision analysis
https://hesim-dev.github.io/hesim/

Understanding the Model Fitting Process in Hesim #104

Closed swaheera closed 1 year ago

swaheera commented 1 year ago

Hello Dr. Incerti,

I was reading about your R package 'Hesim' and thought it was really interesting!

In particular, I was reading the following link (https://hesim-dev.github.io/hesim/articles/mlogit.html).

I was trying to understand, procedurally, how multinomial logistic regression models are used to estimate the transition probabilities of discrete-time Markov chains.

Suppose there is a dataset with 3 States (State A, State B, State C). Each row in this dataset is an individual medical patient and contains some information on their covariates (e.g. height, weight, blood pressure, etc.), the state they were in and the state they transitioned to - my understanding is the following:

Is my understanding of the above correct?

Your Help Is Greatly Appreciated, Thanks, S

dincerti commented 1 year ago

Hi @swaheera, apologies for the delayed response. Everything you wrote above is correct.

swaheera commented 1 year ago

@dincerti: thank you for your reply! I have been sick lately and have not been able to reply :(

I was just wondering - have you heard of the "msm" package in R? Would you say that your approach is similar to such approaches? I am interested in learning about how covariates can be used to model the transition probabilities of Discrete Markov Chains.

Thank you so much!

dincerti commented 1 year ago

@swaheera the msm package is great. We show how to use it to parameterize a model simulated with hesim in our preprint (see section 4.3 here).

swaheera commented 1 year ago

@dincerti : Thank you for your reply! I think the work you are doing is really great. A lot of people in the world are working with data that could really benefit from these kinds of models, but the documentation/software is either not available or far too technical for the average user.

As an example, here is a problem that I am working on:

Suppose there is a system in which individuals are measured only at discrete time points (e.g. blood pressure and weight are measured once every year) and these individuals can transition back and forth between multiple states (e.g. disease free, disease stage 1, disease stage 2, death by disease of interest, death by comorbidity, did not show up to the hospital that year, etc.) until one of the absorbing states is reached or the study period is over. The patients can also re-enter the system if they have been absent for some time. The goal of the analysis is to understand how different cohorts of patients and patient characteristics contribute to the transitions between these states.

I am trying to understand what types of models are generally used for this type of problem.

At first I thought that a competing risks model (with time-varying covariates) might be suitable, seeing as there are "competing absorbing states" (e.g. death by disease of interest vs death by comorbidity), but I am not sure whether R packages like "discSurv" currently allow for this. This approach typically does not allow for re-entry and assumes all non-initial states are absorbing. I thought perhaps I could modify the research question and only study transitions for patients who begin the study in the healthy state and see which absorbing state they eventually end up in.

I also thought of simply using the MSM R package (i.e. something similar to Cox-PH) and assume that my discrete times are continuous - but I am not sure if this is a good idea. For example, does the concept of a Q(t) rate matrix make sense when you have a discrete time Markov Chain? This will likely produce a "stepwise" hazard function - but I am not sure if this is mathematically logical.
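To make the Q(t) question concrete, here is a minimal sketch of how a constant rate matrix Q implies a discrete-time transition matrix through P(t) = exp(Qt). As far as I understand, this is the relationship that lets a continuous-time model be fitted to data observed only at fixed visits. The rate values and the helper names `mat_mul` and `mat_exp` are made up for illustration and do not come from msm or hesim.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t=1.0, terms=30):
    """Approximate exp(q * t) with a truncated Taylor series.

    Adequate for small matrices with modest rates, as in this example.
    """
    n = len(q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term (q*t)^k / k!, starts at identity
    for k in range(1, terms):
        scaled = [[q[i][j] * t / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, scaled)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Hypothetical yearly transition rates; each row sums to zero.
# State 2 is absorbing, so its row is all zeros.
Q = [
    [-0.3, 0.2, 0.1],
    [0.1, -0.4, 0.3],
    [0.0, 0.0, 0.0],
]

P_one_year = mat_exp(Q, t=1.0)   # transition probabilities over one year
P_two_years = mat_exp(Q, t=2.0)  # equals P_one_year multiplied by itself
```

Under these assumptions, each row of `P_one_year` is a probability distribution over next year's state, and observing the process only once a year does not by itself contradict an underlying continuous-time rate matrix.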

Another approach that I have been considering is to fit several multinomial logistic regressions (e.g. as described in the hesim R package). For example, if there are "n" states and "k" absorbing states (i.e. "n - k" non-absorbing states):

Thus, as a recap:

I think the approach described within "hesim" (Approach 3) is a suitable approach for my problem - do you think this might be reasonable?
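As a rough illustration of the multinomial logistic idea, the sketch below builds a one-step transition matrix for a single patient from one set of coefficients per non-absorbing origin state. It assumes the regressions have already been fitted elsewhere. The coefficient values, the single "age" covariate, and the function names `softmax_row` and `transition_matrix` are hypothetical and are not part of the hesim API.

```python
import math


def softmax_row(linear_predictors):
    """Turn one linear predictor per destination state into probabilities."""
    largest = max(linear_predictors)  # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in linear_predictors]
    total = sum(exps)
    return [e / total for e in exps]


def transition_matrix(coefs, covariates, n_states):
    """Build an n_states x n_states transition matrix for one patient.

    coefs maps each non-absorbing origin state to a list of coefficient
    vectors, one per destination state (the reference destination is all
    zeros). Origin states missing from coefs are treated as absorbing.
    """
    x = [1.0] + list(covariates)  # prepend the intercept term
    matrix = []
    for origin in range(n_states):
        if origin not in coefs:
            # Absorbing state: stay put with probability 1.
            row = [1.0 if dest == origin else 0.0 for dest in range(n_states)]
        else:
            eta = [sum(b * xi for b, xi in zip(beta, x)) for beta in coefs[origin]]
            row = softmax_row(eta)
        matrix.append(row)
    return matrix


# Hypothetical coefficients (intercept, age) for 3 states: 0 = A, 1 = B, 2 = C.
# C is absorbing, so only n - k = 2 regressions are needed.
example_coefs = {
    0: [[0.0, 0.0], [-1.0, 0.02], [-3.0, 0.03]],
    1: [[0.0, 0.0], [0.5, 0.01], [-2.0, 0.04]],
}
P = transition_matrix(example_coefs, covariates=[60.0], n_states=3)
```

Each non-absorbing row of `P` comes from its own regression, so the covariates change the transition probabilities patient by patient, while absorbing states need no model at all.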

Thank you so much for everything!

Note: I had an idea to create an additional time covariate of "total time spent in system" and include this in the multinomial logistic regression (alongside "time spent in a specific state") - but I am not sure if doing this will violate some assumptions.
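For concreteness, this is a small sketch of how both time covariates could be derived from yearly observations before fitting anything. The state labels, the field names, and the function `to_transition_rows` are hypothetical. One caveat I am aware of: once "time in state" enters the model, the process is no longer a plain first-order Markov chain, because the transition probabilities then depend on history.

```python
def to_transition_rows(patient_id, yearly_states):
    """Convert one patient's yearly state sequence into one row per transition.

    time_in_state counts consecutive prior years spent in the origin state;
    time_in_system counts years since the patient's first observation.
    """
    rows = []
    time_in_state = 0
    for year in range(len(yearly_states) - 1):
        origin = yearly_states[year]
        dest = yearly_states[year + 1]
        if year > 0 and yearly_states[year - 1] == origin:
            time_in_state += 1  # still in the same state as last year
        else:
            time_in_state = 0   # just entered this state
        rows.append({
            "patient_id": patient_id,
            "from": origin,
            "to": dest,
            "time_in_state": time_in_state,
            "time_in_system": year,
        })
    return rows


# Hypothetical patient who recovers once and later dies.
rows = to_transition_rows(1, ["healthy", "healthy", "stage1", "healthy", "dead"])
for row in rows:
    print(row)
```

Rows in this long format could then be split by origin state, with one multinomial regression fitted per non-absorbing state.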