Final Theta of cell state

RoyEHanna commented 5 months ago

Hi,

Can we output the final theta for the cell states?

If not, is it better to use the first theta for the cell states, or to run BayesPrism again with the cell states as cell types?

Best regards, Roy

tinyi commented 4 months ago

Hi Roy,

Sorry for the delay.

The short answer is that the final theta was designed to be at the target granularity, i.e. the cell type level. Cell states are used to model cells that share a similar transcriptional state and co-occur at similar ratios across bulk samples, and BayesPrism was developed to sum (marginalize) over these states.
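To make "summing up (marginalizing) across states" concrete, here is a minimal sketch in plain Python. It is not BayesPrism code: the function name `marginalize_states`, the state-to-type mapping, and all the numbers are hypothetical and only illustrate the arithmetic for a single bulk sample.

```python
from collections import defaultdict

def marginalize_states(theta_state, state_to_type):
    """Sum per-state fractions into per-type fractions for one bulk sample."""
    theta_type = defaultdict(float)
    for state, fraction in theta_state.items():
        theta_type[state_to_type[state]] += fraction
    return dict(theta_type)

# Hypothetical state-level fractions for one bulk sample (they sum to 1).
theta_state = {
    "tumor1": 0.25,
    "tumor2": 0.25,
    "tumor3": 0.125,
    "Th1": 0.25,
    "Treg": 0.125,
}

# Hypothetical mapping from each cell state to its parent cell type.
state_to_type = {
    "tumor1": "tumor",
    "tumor2": "tumor",
    "tumor3": "tumor",
    "Th1": "T cell",
    "Treg": "T cell",
}

theta_type = marginalize_states(theta_state, state_to_type)
print(theta_type)  # {'tumor': 0.625, 'T cell': 0.375}
```

The type-level fractions carry no information about how the tumor compartment splits into tumor1, tumor2 and tumor3, which is why a theta reported at the cell type level cannot be turned back into state-level fractions.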

Best,

Tinyi



jnhv commented 2 months ago

Hi, Tinyi. Thank you for your excellent work and timely replies. I've searched and checked some issues that might answer my question, but they didn't resolve it. I need your help. @tinyi

Similar to Roy's question: if I want to calculate the proportion of every cell state rather than every cell type (for example, the type is "tumor" and the states are "tumor1", "tumor2", "tumor3", ...), should I run BayesPrism with the cell states as cell types? In that case, could I set cell.state.labels to NULL?

Best regards,

tinyi commented 2 months ago

Hi jh,

Thank you for your question.

The answer depends on the biological assumptions you make about your data. For normal cells, a cell state observed in scRNA-seq, for example Th1 CD4+ T cells, usually has a counterpart in the bulk data, so "cell type" and "cell state" are interchangeable depending on the granularity of the deconvolution; i.e., it makes sense to derive the proportion of Th1 CD4+ T cells in the bulk data. For tumor states, however, tumor expression is heterogeneous, so we usually want to approximate the unobserved expression of the tumor cells in each bulk sample using a combination of the tumor states observed in scRNA-seq, and then update the cell type fractions by assuming that each bulk sample has a unique tumor expression profile.

The results may become difficult to interpret if you set each tumor state, e.g. tumor1, tumor2, ..., as an individual cell type. Note that you need to specify a tumor cell type using the "key" argument to tell BayesPrism which cell type should be given a sample-specific expression profile. Currently it accepts either a single cell type or NULL (in which case no sample-specific expression is modeled). I believe it does not make sense to specify a single tumor cell type, say tumor1, while ignoring the others. But if you specify key=NULL, no sample-specific tumor expression will be modeled, and you would be assuming that every tumor bulk sample is a linear combination of multiple tumor states. That is a strong assumption that may underestimate the heterogeneity in tumor-specific expression, leading to an underestimate of the tumor cell fraction.

However, if you do choose to model your data under this assumption, a better way is to use the embedding learning module (please also refer to the BayesPrism paper for details), using your tumor states as the prior for the tumor gene programs; this is done after you perform the deconvolution. By doing so, you start with more accurate tumor cell fractions and update your input gene programs during inference, and hence obtain more accurate estimates of the tumor state % (or tumor gene program %).

Alternatively, if you want to fix the cell states without performing any update using the bulk data, you may simply extract the cell state fractions (theta) of tumor1, tumor2, ..., from the estimates derived from the initial Gibbs sampling.
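As an illustration of that last option, the sketch below takes a state-level theta for one bulk sample and pulls out the tumor states. It is plain Python, not BayesPrism code: the helper `tumor_state_fractions` and the numbers are hypothetical, and the renormalization within the tumor compartment is an optional extra step assumed here for readability, not something prescribed in this thread.

```python
def tumor_state_fractions(theta_state, tumor_states):
    """Extract tumor-state fractions from a state-level theta (one bulk sample).

    Returns the raw fractions, the total tumor fraction, and the
    composition of the tumor compartment (raw fractions / total).
    """
    picked = {state: theta_state[state] for state in tumor_states}
    total = sum(picked.values())
    if total == 0:
        # No tumor signal in this sample: avoid dividing by zero.
        within = {state: 0.0 for state in picked}
    else:
        within = {state: frac / total for state, frac in picked.items()}
    return picked, total, within

# Hypothetical state-level theta from the initial round of sampling.
theta_state = {"tumor1": 0.25, "tumor2": 0.125, "tumor3": 0.125, "T cell": 0.5}

raw, tumor_total, within_tumor = tumor_state_fractions(
    theta_state, ["tumor1", "tumor2", "tumor3"]
)
print(tumor_total)   # 0.5
print(within_tumor)  # {'tumor1': 0.5, 'tumor2': 0.25, 'tumor3': 0.25}
```

These fractions inherit the assumption discussed above: the tumor state profiles are held fixed at their scRNA-seq values and are not updated with the bulk data, so they are best read as an approximation.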

Hope this helps.

Best,

Tinyi


