Clarification of tumor subtype strategy

t-carroll commented 3 years ago

Hi there,

In the FAQ there is the following line:

> To obtain the best possible representation for the unobserved tumor expression in bulk RNA-seq, we recommend users to cluster the scRNA-seq data of tumor cells in individual patients, and then mark them using cell.subtype.labels.

I just want to make sure I understood what "them" is referring to in that line. I think it means to mark each specific tumor subcluster with its own unique cell.subtype.label? Or is it trying to say to mark all malignant epithelium from a given patient with the same patient-specific cell.subtype.label, regardless of how many distinct subclusters there are?

For instance, take an scRNA-seq dataset where, for some patients, the tumor seems to be composed of multiple subclones, which appear as distinct subclusters (e.g. patient A in Figure 1B here). What would be the appropriate cell.subtype labelling strategy for BayesPrism: to label the malignant cells by patient ID only, or to assign each distinct malignant subcluster its own cell.subtype.label? Any input would be much appreciated, thanks!

tinyi commented 3 years ago

Hi Tom,

Either approach can be reasonable. The choice of cell.subtype.label depends on the extent of intra-tumoral heterogeneity. When there is substantial intra-tumoral heterogeneity, i.e. the distribution of tumor cells in each patient deviates significantly from a multinomial distribution, it is more appropriate to give each subclone in each patient its own unique label in cell.subtype.label. For example, you can label them as Patient1-clone1, Patient1-clone2, Patient2-clone1, ... When heterogeneity within a patient is limited, labelling the malignant cells by patient ID alone should be adequate.
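To make the two labelling strategies concrete, here is a minimal sketch of how the label vector could be built. It is written in Python purely for illustration (BayesPrism itself is an R package), and the function name `make_subtype_labels`, its `per_clone` flag, and the toy patient and cluster IDs are all hypothetical. It assumes you already have one patient ID and one within-patient cluster ID per malignant cell.

```python
from collections import Counter


def make_subtype_labels(patient_ids, cluster_ids, per_clone=True):
    """Build one subtype label per malignant cell (hypothetical helper).

    patient_ids: patient ID for each cell, e.g. "Patient1".
    cluster_ids: cluster ID for each cell, assumed to come from clustering
                 the tumor cells of each patient separately.
    per_clone:   True  -> one label per patient-specific subclone
                 False -> one label per patient
    """
    if len(patient_ids) != len(cluster_ids):
        raise ValueError("patient_ids and cluster_ids must have the same length")
    if not per_clone:
        # Low intra-tumoral heterogeneity: label by patient ID only.
        return [str(p) for p in patient_ids]
    # Substantial heterogeneity: each subclone in each patient gets its own label.
    return [f"{p}-clone{c}" for p, c in zip(patient_ids, cluster_ids)]


# Toy input (made up for illustration): four malignant cells from two patients.
patients = ["Patient1", "Patient1", "Patient1", "Patient2"]
clusters = [1, 2, 1, 1]

per_clone_labels = make_subtype_labels(patients, clusters)
per_patient_labels = make_subtype_labels(patients, clusters, per_clone=False)

print(per_clone_labels)
print(Counter(per_clone_labels))  # number of cells behind each label
```

Either list would then be supplied, for the malignant cells, as the cell.subtype.labels values. Counting the cells per label is a quick way to spot subclones that are backed by very few cells.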

Best,

Tinyi

t-carroll commented 3 years ago

Hi Tinyi, Appreciate the quick response! That all makes sense, thank you for the further info on this.

All the best, Tom

Danko-Lab / TED, issue #11