ixxmu / mp_duty

Scrapes web articles and saves them to GitHub issues
https://archives.duty-machine.now.sh/

Computing a series of metrics from tumor mutation data files #1275


ixxmu commented 3 years ago

https://mp.weixin.qq.com/s/B1l_-1nsnVmW-uHoqD7Dow

github-actions[bot] commented 3 years ago

Computing a series of metrics from tumor mutation data files, by 生信技能树

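In the paper "Multi-Omics Profiling Reveals Distinct Microenvironment Characterization and Suggests Immune Escape Mechanisms of Triple-Negative Breast Cancer", the researchers divided TNBC into three subgroups according to immune characteristics and then looked for "potential intrinsic immune escape mechanisms of TNBC". That analysis relied on many quantitative metrics derived from mutation data, including: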

  • neoantigens
  • cancer testis antigens (CTAs)
  • homologous recombination deficiency (HRD) scores
  • intratumoral heterogeneity (ITH)
  • tumor mutation burden (TMB)

The results are as follows:

 

Comparison of mutation loads (A), neoantigen load (B), HRD scores (C), CTA numbers (D), necrosis (E), and ITH scores (F) among the three clusters. In the violin plots, the mean values are plotted as red dots, and the boxplot was drawn inside the violin plot.

The calculation methods are all in the supplementary materials: https://clincancerres.aacrjournals.org/content/suppl/2019/03/05/1078-0432.CCR-18-3524.DC1

I have excerpted the English description below; I suspect most readers will be completely lost reading it:

Calculation of neoantigens

With the WES data (.bam) of paired normal samples from TNBC patients, we first used POLYSOLVER tool (8) to infer the 4-digit HLA genotype for each sample (arguments: Asian 1 hg19 STDFQ 0). Then, neoantigens were predicted using NetMHCpan (v4.0) (9), with the somatic mutation data (.maf) and HLA genotype data as the inputs. Neoantigens derived from protein coding single nucleotide variants (SNV) (Variant_Classification = “Missense_Mutation”, and Variant_Type = “SNP”) and small insertions and deletions (Indel) (Variant_Classification = “Frame_Shift_Ins”, “Frame_Shift_Del”, “In_Frame_Ins”, “In_Frame_Del”, and Variant_Type = “INS”, “DEL”) were predicted separately. Mutations which were predicted to produce peptide with affinity < 500 nM and of which the corresponding gene was expressed greater than Combat value 1 (evaluated based on median expression rather than the specific sample) were chosen as neoantigens. We referred to pVAC-seq (10) and made some modifications based on the features of our dataset to construct this algorithm.
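The two filtering steps in this excerpt (selecting variants from the .maf file, then keeping mutations with a strong-binding peptide in an expressed gene) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: it does not run POLYSOLVER or NetMHCpan, and the function names, the `(mutation_id, gene, affinity_nM)` tuple layout, the expression dictionary, and the toy data are all hypothetical. Only the column names, the variant classes, the 500 nM cutoff, and the expression cutoff of 1 come from the text.

```python
import csv
import io

# Variant classes named in the excerpt; SNVs and indels are predicted separately.
SNV_CLASSES = {"Missense_Mutation"}
INDEL_CLASSES = {"Frame_Shift_Ins", "Frame_Shift_Del", "In_Frame_Ins", "In_Frame_Del"}
INDEL_TYPES = {"INS", "DEL"}


def split_maf(rows):
    """Split MAF rows (dicts) into the SNV group and the indel group."""
    snvs, indels = [], []
    for row in rows:
        vclass = row["Variant_Classification"]
        vtype = row["Variant_Type"]
        if vclass in SNV_CLASSES and vtype == "SNP":
            snvs.append(row)
        elif vclass in INDEL_CLASSES and vtype in INDEL_TYPES:
            indels.append(row)
    return snvs, indels


def count_neoantigens(predictions, median_expression,
                      max_affinity_nm=500.0, min_expression=1.0):
    """Count mutations with at least one peptide below the affinity cutoff
    whose gene has a cohort-median expression above the expression cutoff.

    predictions: iterable of (mutation_id, gene, affinity_nM); hypothetical layout.
    median_expression: dict mapping gene -> median expression; hypothetical layout.
    """
    kept = set()
    for mutation_id, gene, affinity in predictions:
        expressed = median_expression.get(gene, 0.0) > min_expression
        if affinity < max_affinity_nm and expressed:
            kept.add(mutation_id)
    return len(kept)


# Toy tab-separated MAF fragment (made-up rows, for illustration only).
MAF_TEXT = (
    "Hugo_Symbol\tVariant_Classification\tVariant_Type\n"
    "TP53\tMissense_Mutation\tSNP\n"
    "BRCA1\tFrame_Shift_Del\tDEL\n"
    "TTN\tSilent\tSNP\n"
)
rows = list(csv.DictReader(io.StringIO(MAF_TEXT), delimiter="\t"))
snvs, indels = split_maf(rows)

# Toy binding predictions and median expression values (made up).
toy_predictions = [
    ("m1", "TP53", 120.0),
    ("m1", "TP53", 800.0),
    ("m2", "BRCA1", 450.0),
    ("m3", "TTN", 50.0),
]
toy_expression = {"TP53": 3.2, "BRCA1": 0.4, "TTN": 5.0}
neoantigen_load = count_neoantigens(toy_predictions, toy_expression)
```

Counting unique mutation IDs rather than peptides follows the wording "mutations which were predicted to produce peptide with affinity < 500 nM"; whether the authors then report mutations or peptides as the load is not stated in the excerpt.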

Calculation of cancer testis antigens (CTA)

The CTDatabase (http://www.cta.lncc.br/) was first queried for CTAs. We then calculated the difference in each candidate CTA between the tumor site and the paired normal site; genes whose expression was at least four times higher in the tumor site than in the paired normal tissue in at least one patient were selected as TNBC-specific CTAs. In all, a total of 177 CTAs were included in our study. The CTA landscape of TNBC is described in Supplementary Figure 8.
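The selection rule above (at least a fourfold tumor-over-normal difference in at least one patient) can be sketched as follows. This is a hedged illustration: the function name, the nested-dictionary layout, and the toy values are hypothetical, and it assumes expression values on a linear (not log) scale, which the excerpt does not specify.

```python
def select_ctas(candidates, tumor_expr, normal_expr, min_fold=4.0):
    """Return candidate genes whose tumor expression is at least `min_fold`
    times the paired normal expression in at least one patient.

    tumor_expr / normal_expr: dict gene -> dict patient -> expression
    (linear scale assumed; hypothetical layout).
    """
    selected = []
    for gene in candidates:
        tumor = tumor_expr.get(gene, {})
        normal = normal_expr.get(gene, {})
        for patient, t_value in tumor.items():
            n_value = normal.get(patient)
            if n_value is None:
                continue  # no paired normal sample for this patient
            # Multiplying instead of dividing avoids division by zero.
            if t_value > 0 and t_value >= min_fold * n_value:
                selected.append(gene)
                break  # one qualifying patient is enough
    return selected


# Toy data: gene names are real CTAs, expression values are made up.
toy_candidates = ["MAGEA4", "CTAG1B", "PRAME"]
toy_tumor = {
    "MAGEA4": {"P1": 8.0, "P2": 1.0},
    "CTAG1B": {"P1": 2.0, "P2": 3.0},
    "PRAME": {"P1": 0.0},
}
toy_normal = {
    "MAGEA4": {"P1": 1.5, "P2": 1.0},
    "CTAG1B": {"P1": 1.0, "P2": 1.0},
    "PRAME": {"P1": 0.0},
}
tnbc_ctas = select_ctas(toy_candidates, toy_tumor, toy_normal)
```

If the expression matrix is log2-transformed, the same rule becomes a difference of at least 2 between tumor and normal rather than a ratio of 4.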

Calculation of homologous recombination deficiency (HRD) scores

The HRD score was calculated as the sum of three scores: allelic imbalance extending to the telomere (NtAI) score, loss of heterozygosity (LOH) score and modified large-scale state transition (LSTm) score. The calculation of these scores was previously described (11). Briefly, the NtAI score was defined as the number of regions with allelic imbalance longer than 11 Mb and extending to one of the subtelomeres but not crossing the centromere. The LOH score was defined as the number of LOH regions longer than 15 Mb but shorter than the whole chromosome. The LST score was defined as the number of break points between regions longer than 10 Mb after filtering out regions shorter than 3 Mb. In order to diminish the effect of ploidy, the LST score was modified using the following formula: LSTm = LST – kP, where P is ploidy, and k is a constant of 15.5.
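The arithmetic in this definition is simple enough to write out. The sketch below assumes the NtAI and LST counts have already been derived from a segmented copy-number profile; it only shows the LOH length filter, the ploidy correction LSTm = LST – kP with k = 15.5, and the final sum. The function names, the `(chrom, start, end, is_loh)` segment layout, and the toy chromosome lengths are hypothetical.

```python
K_PLOIDY = 15.5            # constant k from the excerpt
LOH_MIN_LENGTH = 15_000_000  # 15 Mb


def loh_score(segments, chrom_lengths, min_length=LOH_MIN_LENGTH):
    """Count LOH segments longer than 15 Mb but shorter than the whole chromosome.

    segments: iterable of (chrom, start, end, is_loh); hypothetical layout.
    chrom_lengths: dict chrom -> chromosome length in bp.
    """
    count = 0
    for chrom, start, end, is_loh in segments:
        length = end - start
        if is_loh and min_length < length < chrom_lengths[chrom]:
            count += 1
    return count


def lst_modified(lst, ploidy, k=K_PLOIDY):
    """Ploidy-corrected LST score: LSTm = LST - k * P."""
    return lst - k * ploidy


def hrd_score(ntai, loh, lst, ploidy):
    """HRD score as the sum of the NtAI, LOH and LSTm scores."""
    return ntai + loh + lst_modified(lst, ploidy)


# Toy segments and chromosome lengths (made up, not a real genome build).
toy_lengths = {"chr1": 200_000_000, "chr2": 150_000_000, "chr3": 50_000_000}
toy_segments = [
    ("chr1", 0, 20_000_000, True),           # 20 Mb LOH: counted
    ("chr1", 30_000_000, 40_000_000, True),  # 10 Mb LOH: too short
    ("chr2", 0, 100_000_000, False),         # not LOH
    ("chr3", 0, 50_000_000, True),           # whole chromosome: excluded
]
toy_loh = loh_score(toy_segments, toy_lengths)
toy_hrd = hrd_score(ntai=10, loh=12, lst=40, ploidy=2.0)
```

With the toy numbers, a diploid sample with LST = 40 gets LSTm = 40 - 15.5 × 2 = 9, so the HRD score is 10 + 12 + 9 = 31.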

Estimation of intratumoral heterogeneity (ITH)

ASCAT (12) was used to integrate the copy number data with the data on somatic mutations to estimate the purity and ploidy of each tumor using default parameters. A modified PyClone workflow (13) was then used to estimate the cancer cell fractions of each sample. The fraction of subclonal cancer cells was used as the indicator of ITH.

So is there a shortcut for learning these methods?

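Of course there is: hands-on, small-class teaching from an instructor with more than ten years of bioinformatics project experience who has worked at both BGI and Novogene. You deserve it. As before, places in the "Tumor Genome Bioinformatics Training Course, the only session in 2021" are limited and should, in theory, fill up quickly!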

How to register

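Because this course is dedicated to tumor informatics data analysis, it will not spend two months laying the groundwork in Linux and Python as the earlier "Genome Assembly" course did, nor will it concentrate on honing computing fundamentals, including R-based statistical visualization and Linux-based NGS data processing, as 生信技能树's "Introduction to Bioinformatics" course does.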

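So, in principle, anyone registering for this hands-on tumor informatics class needs to strengthen these fundamentals on their own; please think carefully before registering. That said, if you only collaborate on tumor-related projects and want an overview of this kind of data analysis without doing the analysis yourself, you are also very welcome to register; the course will be well worth the price! What are you waiting for? Scan the QR code below and add us on WeChat to register!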

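(When adding us as a contact, be sure to include a note with your university or employer + name + "tumor", so that we can identify you later.)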

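Because this is a special-price offer, tipping a certain amount under this official account post will help your registration get approved quickly.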
