Three EMT scoring algorithms #766

Three EMT scoring algorithms, by 生信技能树

I came across a preprint that benchmarks three EMT scoring algorithms, which is quite interesting. Its title is "Comparative study of transcriptomics-based scoring metrics for the epithelial-hybrid-mesenchymal spectrum", and it is available at https://www.biorxiv.org/content/10.1101/2020.01.02.892604v1.full



Weighted linear sum (76-gene EMT signature)

This algorithm puts a heavy weight on CDH1 (E-cadherin), so epithelial samples receive higher EMT scores than mesenchymal ones.

Reference: An epithelial-mesenchymal transition gene signature predicts resistance to EGFR and PI3K inhibitors and identifies Axl as a therapeutic target for overcoming EGFR inhibitor resistance. Clin. Cancer Res.
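A minimal sketch of what a weighted linear sum looks like in practice. The post only says that CDH1 is weighted heavily. Using each gene's Pearson correlation with CDH1 as its weight, and centering the scores on the cohort mean, are assumptions made here for illustration, and the function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def weighted_sum_scores(expr, anchor="CDH1"):
    """expr maps gene -> list of expression values, one per sample.

    Assumption: the weight of each gene is its correlation with the
    anchor gene across samples, so the anchor itself gets weight 1.
    """
    weights = {gene: pearson(values, expr[anchor]) for gene, values in expr.items()}
    n_samples = len(expr[anchor])
    raw = [sum(weights[g] * expr[g][j] for g in expr) for j in range(n_samples)]
    center = mean(raw)  # center so that the cohort mean is zero
    return [score - center for score in raw]


# Toy data: samples 0-1 look epithelial, samples 2-3 look mesenchymal.
toy = {
    "CDH1": [9, 8, 2, 1],
    "EPCAM": [8, 9, 1, 2],
    "VIM": [1, 2, 9, 8],
}
scores = weighted_sum_scores(toy)
```

With these weights, genes that track CDH1 push the score up and genes that move against it push the score down, which is why the epithelial samples end up with the higher scores.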

two-sample Kolmogorov-Smirnov test

This score varies on a scale of −1 to 1, with the higher scores corresponding to more mesenchymal samples (Tan et al., 2014).

Reference: Epithelial-mesenchymal transition spectrum quantification and its efficacy in deciphering survival and drug responses of cancer patients. EMBO Mol. Med.
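To make the −1 to 1 scale concrete, here is a rough sketch of a signed two-sample Kolmogorov-Smirnov statistic computed within one sample. It compares the expression values of an epithelial gene set with those of a mesenchymal gene set. The sign convention and the function names are assumptions for illustration, and the published method may differ in its details.

```python
def ecdf(values, x):
    """Empirical CDF of `values` evaluated at x."""
    return sum(v <= x for v in values) / len(values)


def ks_emt_score(epi_values, mes_values):
    """Signed KS statistic between two gene sets in a single sample.

    Returns the largest-magnitude difference F_epi(x) - F_mes(x).
    When mesenchymal genes are expressed higher than epithelial genes,
    the epithelial CDF rises first, so the score is positive.
    """
    grid = sorted(set(epi_values) | set(mes_values))
    diffs = [ecdf(epi_values, x) - ecdf(mes_values, x) for x in grid]
    return max(diffs, key=abs)


mesenchymal_like = ks_emt_score([1, 2, 3], [7, 8, 9])
epithelial_like = ks_emt_score([7, 8, 9], [1, 2, 3])
```

Because the statistic is a difference of two cumulative distributions, it is bounded by −1 and 1 by construction.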

multinomial logistic regression

Its scores fall mostly between 0 and 2. This method particularly focuses on characterizing a hybrid E/M phenotype using the expression levels of 23 genes (3 predictors and 20 normalizers) identified through NCI-60 gene expression data.

It assigns each sample to one of three categories: epithelial, mesenchymal, or hybrid E/M.

Reference: (2017). Survival outcomes in cancer patients predicted by a partial EMT gene expression scoring metric. Cancer Res. 77, 6415–6428. doi:10.1158/0008-5472.CAN-16-3521.
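A toy illustration of how a multinomial logistic regression can yield both a three-way call and a score on a 0 to 2 scale. Everything numeric here is hypothetical: the coefficients are made up, the three inputs stand in for already-normalized predictor values, and mapping the class probabilities to a score as P(hybrid) + 2 × P(mesenchymal) is an assumption, not the formula from the paper.

```python
import math

CLASSES = ["epithelial", "hybrid E/M", "mesenchymal"]

# Hypothetical (intercept, weights) per class; epithelial is the reference class.
HYPOTHETICAL_COEFS = {
    "epithelial": (0.0, [0.0, 0.0, 0.0]),
    "hybrid E/M": (0.5, [1.0, -0.5, 0.0]),
    "mesenchymal": (-1.0, [2.0, 1.0, -1.0]),
}


def class_probabilities(predictors, coefs=HYPOTHETICAL_COEFS):
    """Softmax over the per-class linear predictors."""
    logits = [
        coefs[c][0] + sum(w * x for w, x in zip(coefs[c][1], predictors))
        for c in CLASSES
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(predictors, coefs=HYPOTHETICAL_COEFS):
    """Return (most probable category, expected-category score in [0, 2])."""
    probs = class_probabilities(predictors, coefs)
    label = CLASSES[probs.index(max(probs))]
    score = sum(k * p for k, p in enumerate(probs))
    return label, score


label_mid, score_mid = classify([0.0, 0.0, 0.0])
label_high, score_high = classify([3.0, 0.0, 0.0])
label_low, score_low = classify([-3.0, 0.0, 0.0])
```

Under this mapping, a sample that is confidently epithelial scores near 0, a confident hybrid near 1, and a confident mesenchymal sample near 2.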


  • GSVA survival analysis tutorial video https://www.bilibili.com/video/av81874183
  • https://mp.weixin.qq.com/s/LJjsdf3X66nJ1KmpvvkHOA


For example, an article published in January 2020, "Gene signatures of tumor inflammation and epithelial-to-mesenchymal transition (EMT) predict responses to immune checkpoint blockade in lung cancer with high accuracy", is available at: https://www.sciencedirect.com/science/article/pii/S0169500219306932

  • There is not yet a validated lung cancer EMT signature, so we prospectively generated a gene list based on previous publications describing “classic” EMT genes in cancer [23,24] with the addition of some genes specifically mentioned in studies evaluating EMT in NSCLC [27–29].
  • Selected genes had levels of expression that were clearly above baseline (using a cutoff of 10 reads in our data set). Supplemental Table 2 shows the list of genes included along with their average expression level.
  • Although the genes SNAI1, TWIST1, TWIST2, CDH2, and ZEB1 are classic mesenchymal markers, their expression levels were very low in our dataset and thus not included.
  • Based on these criteria, we generated the EMT signature by adding the sum of the log2 Z scores of 6 established mesenchymal genes (AGER, FN1, MMP2, SNAI2, VIM, ZEB2) and subtracting the sum of the log2 Z scores of 6 established epithelial genes (CDH1, CDH3, CLDN4, EPCAM, MAL2, and ST14) (Supplemental Table 2).
  • In this signature, the most mesenchymal tumors have the most positive EMT scores and the most epithelial tumors have the most negative scores.

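In summary, it is really very simple: pick EMT genes with high expression, then take the sum of the log2 Z scores of the 6 mesenchymal genes and subtract the sum of the log2 Z scores of the 6 epithelial genes. The higher this EMT score, the more mesenchymal the sample.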
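The calculation described above can be sketched in a few lines. The gene lists come from the quoted paper. Reading "log2 Z scores" as per-gene Z scores of log2 expression across samples, and using the population standard deviation, are assumptions here, as are the function names.

```python
from statistics import mean, pstdev

MESENCHYMAL = ["AGER", "FN1", "MMP2", "SNAI2", "VIM", "ZEB2"]
EPITHELIAL = ["CDH1", "CDH3", "CLDN4", "EPCAM", "MAL2", "ST14"]


def zscores(values):
    """Z-score one gene across samples (the gene must not be constant)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def emt_scores(log2_expr):
    """log2_expr maps gene -> list of log2 expression values, one per sample.

    Score per sample = sum of mesenchymal Z scores - sum of epithelial Z scores.
    """
    z = {g: zscores(log2_expr[g]) for g in MESENCHYMAL + EPITHELIAL}
    n_samples = len(z[MESENCHYMAL[0]])
    return [
        sum(z[g][j] for g in MESENCHYMAL) - sum(z[g][j] for g in EPITHELIAL)
        for j in range(n_samples)
    ]


# Toy cohort of 3 samples, ordered from most epithelial to most mesenchymal.
toy_expr = {g: [1.0, 2.0, 3.0] for g in MESENCHYMAL}
toy_expr.update({g: [3.0, 2.0, 1.0] for g in EPITHELIAL})
toy_scores = emt_scores(toy_expr)
```

In the toy cohort the scores increase from the first sample to the last, matching the rule that the most mesenchymal tumors get the most positive scores.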

EMT background

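EMT, short for epithelial–mesenchymal transition (in Chinese, 上皮间质转换), is the change in which epithelial cells, under certain stimuli, lose their polarity and their tight and adherens junctions, gain invasive and migratory capacity, and become cells with mesenchymal morphology and properties. The process is reversible.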

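The concept of EMT was first proposed by Greenberg and Hay in 1982. For a long time, however, the role of EMT in tumor metastasis has remained controversial, mainly because the EMT process could not be observed in vivo.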

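Changes in cell plasticity during embryonic development and during cancer progression are strikingly similar, **and these changes are regulated by the epithelial-mesenchymal transition (EMT) process.** During embryonic development, cells can switch freely between the epithelial and mesenchymal states.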

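  • Epithelial-mesenchymal transition (EMT) gives cells metastatic and invasive properties.
  • The reverse process, mesenchymal-epithelial transition (MET), changes cell polarity and causes cells to lose their motility.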



I. First, the common marker molecules of EMT.

1. Decreased expression: E-cadherin, Cytokeratin, ZO-1.

2. Increased expression: N-cadherin, Vimentin, Snail1, Snail2, Twist, MMP-2, MMP-3, MMP-9.








III. The EMT process has a complex regulatory network:






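The diversity of these regulatory modes means that EMT regulation is often not linear. So far, most studies of EMT regulation have focused on E-cadherin transcription and on EMT transcription factors such as SNAI1, SNAI2, ZEB1, ZEB2, and TWIST1.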


Compared with normal blood cells, circulating tumor cells (CTCs) carry certain distinctive biomarkers, including the epithelial cell adhesion molecule (EpCAM) on the CTC surface.


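  • In pharyngeal cancer, high Ep-CAM expression is closely associated with lymph node metastasis: the higher the Ep-CAM expression, the more frequent lymph node metastasis is.
  • In head and neck squamous cell carcinoma, Ep-CAM expression is higher in metastatic lesions than in the primary tumor.
  • In gastric cancer, disseminated tumor cells in bone marrow metastases show lower Ep-CAM expression than the primary tumor.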

Cytokeratin (CK) in immunohistochemistry:


A quick search confirms that this is indeed a hot topic:

  • As early as 2010, a PNAS article (https://doi.org/10.1073/pnas.1004900107) defined an EMT core signature gene set: "We identified an EMT core signature consisting of 159 genes that were down-regulated and 87 genes that were up-regulated at least 2-fold by all of these EMT-inducing signals (Table S1)." The data are in GSE9691 and GSE9691.
  • Then a 2012 meta-analysis of gene expression (https://doi.org/10.1371/journal.pone.0051136) also released gene sets, including an "EMT-core gene list of 130 up- or downregulated genes shared between at least 10 GES datasets" and a "List of 365 genes significantly regulated in at least 10 GES datasets".
  • These can also be validated in TCGA: "Pan‐cancer genomic datasets from The Cancer Genome Atlas (TCGA), representing over 10,000 patients and 32 distinct cancer types, provide a rich resource for examining correlative patterns involving EMT mediators in the setting of human cancers." See https://onlinelibrary.wiley.com/doi/pdf/10.1002/dvdy.24485
  • The 2015 database dbEMT has not been cited much (28 citations), but its curated data, "All the 377 human Epithelial-Mesenchymal Transition genes with cancer types", can be downloaded: http://dbemt.bioinfo-minzhao.org/download.cgi
  • Finally, the most recent is from 2020: "The web-based EMTome portal is a resource for primary and metastatic tumour research publicly available at www.emtome.org." Article link: https://www.nature.com/articles/s41416-020-01178-9