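I covered this topic earlier when I introduced the TIP database for analyzing the cancer-immunity cycle and immune cell infiltration. TIP has already computed, for every sample in the 33 TCGA cancer types, a score for each step of the cancer-immunity cycle, and these scores can be downloaded directly. But how do you compute the scores for tumor transcriptome data that is not from TCGA? The authors of the paper provide R code for this.
R code: https://github.com/dengchunyu/TIP
The code is not packaged as a standard R package.
Download the code and unzip it.
The repository includes test code: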
# Load the example TPM matrix shipped with the repository
data <- read.table(sprintf("%s/%s", "./Test", "RNA-seq_tpm_example_5.txt"),
                   sep = "\t", stringsAsFactors = FALSE, header = TRUE,
                   check.names = FALSE, na.strings = NULL, row.names = 1)
# Load the main function
source("./1.MainFunction/TIP_integration.R")
test <- TIP_integration(
  codePath = ".",
  filePath = "./Test",
  fileName = "RNA-seq_tpm_example_5.txt",
  saveDir = "./Test",
  sampleNumber = 5,
  permTimes = 100,
  type.of.data = "RNA-seq",
  format.of.file = "TPM",
  sample = "multiple",
  CancerType = "GBM",
  Samples = paste(colnames(data), collapse = "\t"),
  email = "")
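Here data is the TPM expression matrix.
Run the code as shown. The results are saved in the Test folder.
If you run it on your own data, pay attention to the folder paths.
Below I use BLCA data as an example.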
data <- read.table("./BLCA/TCGA-BLCA_RNASeq_TPM.txt",
                   sep = "\t", stringsAsFactors = FALSE, header = TRUE,
                   check.names = FALSE, na.strings = NULL)
head(data)[, 1:3]
source("./1.MainFunction/TIP_integration.R")
dim(data)
BLCA <- TIP_integration(
  codePath = ".",
  filePath = "./BLCA",
  fileName = "TCGA-BLCA_RNASeq_TPM.txt",
  saveDir = "./BLCA",
  sampleNumber = 433,
  permTimes = 100,
  type.of.data = "RNA-seq",
  format.of.file = "TPM",
  sample = "multiple",
  CancerType = "BLCA",
  # The original snippet used an undefined object, expDataTPM; use the table read above.
  # This assumes colnames(data) holds only sample names; if the first column
  # contains gene IDs, read the file with row.names = 1 as in the test example.
  Samples = paste(colnames(data), collapse = "\t"),
  email = "")
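Check the file paths and set sampleNumber to the number of samples in your data. If your data are raw counts, you do not need to convert them yourself; the code converts them to TPM. The data type (type.of.data) is either 'Microarray' or 'RNA-seq'; single-cell data can also be used. The sample argument can be 'multiple' or 'single'.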
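To make the count-to-TPM step concrete, here is a minimal sketch of the standard conversion. It is not the TIP repository's own code, and the gene lengths (`gene_length_bp`) and toy counts are hypothetical values chosen only for illustration.

```r
# Illustrative count-to-TPM conversion; NOT the TIP repository's implementation.
# `gene_length_bp` is a hypothetical named vector of gene lengths in base pairs.
counts_to_tpm <- function(counts, gene_length_bp) {
  stopifnot(nrow(counts) == length(gene_length_bp))
  # Reads per kilobase: divide each gene (row) by its length in kb
  rpk <- counts / (gene_length_bp / 1000)
  # Scale each sample (column) so that its values sum to one million
  sweep(rpk, 2, colSums(rpk), "/") * 1e6
}

# Toy data: 3 genes x 2 samples
counts <- matrix(c(10, 20, 30, 5, 5, 10), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
gene_length_bp <- c(g1 = 1000, g2 = 2000, g3 = 3000)

tpm <- counts_to_tpm(counts, gene_length_bp)
print(colSums(tpm))  # every sample sums to one million by construction
```

Because TPM depends on gene lengths, a conversion done outside TIP may differ slightly from the one the tool applies internally if the length annotations differ.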
The parameters are documented in the source as follows:
#' @description Main function for communication between R and web interfaces.
#' @param codePath Character representing the storage path of import code files and underlying .RData.
#' @param filePath Character representing the storage path of the user's expression profile.
#' @param fileName Character representing the file name of expression profile.
#' @param saveDir Character representing the storage path of results.
#' @param sampleNumber Numeric value indicating number of samples in profile.
#' @param permTimes Numeric value indicating times of permutation, default by 100.
#' @param type.of.data Character indicating source of expression data, 'Microarray' or 'RNA-seq'.
#' @param format.of.file Character indicating format of RNA-seq expression data, 'TPM' or 'Count'.
#' @param sample Character indicating sample number, choose 'multiple' or 'single'.
#' @param CancerType The type of cancer selected by the user.
#' @param Samples A string consisting of all sample names separated by tabs.
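Since sampleNumber and Samples both depend on the expression matrix, one option is to derive them from the data instead of typing them by hand. The helper below, `build_tip_args`, is hypothetical and not part of the TIP repository. It assumes that the matrix columns are the samples and that 'single' corresponds to a one-sample profile.

```r
# Hypothetical helper (not part of the TIP repository): build the argument
# list for TIP_integration() from an expression matrix whose columns are samples.
build_tip_args <- function(expr, filePath, fileName, CancerType,
                           format.of.file = "TPM") {
  stopifnot(ncol(expr) >= 1, !is.null(colnames(expr)))
  list(
    codePath = ".",
    filePath = filePath,
    fileName = fileName,
    saveDir = filePath,                 # write results next to the input
    sampleNumber = ncol(expr),          # derived instead of hard-coded
    permTimes = 100,
    type.of.data = "RNA-seq",
    format.of.file = format.of.file,
    # Assumption: 'single' means a profile with exactly one sample
    sample = if (ncol(expr) > 1) "multiple" else "single",
    CancerType = CancerType,
    Samples = paste(colnames(expr), collapse = "\t"),
    email = ""
  )
}

# Toy matrix: 2 genes x 3 samples
expr <- matrix(1:6, nrow = 2,
               dimnames = list(c("g1", "g2"), c("s1", "s2", "s3")))
args <- build_tip_args(expr, "./BLCA", "TCGA-BLCA_RNASeq_TPM.txt", "BLCA")
str(args)

# With TIP_integration.R sourced, the call would then be:
# BLCA <- do.call(TIP_integration, args)
```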
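The first and second folders of the repository also contain many functions and data files that you can reuse in your own work.
Note, however, that the computation needs a lot of memory on a local computer. You can run it on a server instead, or split the data into chunks, analyze each chunk, and merge the results.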
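The split-and-merge idea can be sketched as follows. `score_chunk` is a stand-in for the real scoring step (here just a column mean), and the expression matrix is simulated. I have not verified that chunked TIP runs give exactly the same scores as a single full run, so treat this only as a pattern for the bookkeeping.

```r
# Split sample names into consecutive chunks of at most `chunk_size`
split_samples <- function(sample_names, chunk_size) {
  split(sample_names, ceiling(seq_along(sample_names) / chunk_size))
}

# Stand-in for the real per-chunk analysis (hypothetical): one row per sample
score_chunk <- function(expr_chunk) {
  data.frame(sample = colnames(expr_chunk),
             score = unname(colMeans(expr_chunk)),
             stringsAsFactors = FALSE)
}

# Simulated expression matrix: 5 genes x 10 samples
set.seed(1)
expr <- matrix(rexp(50), nrow = 5,
               dimnames = list(paste0("g", 1:5), paste0("s", 1:10)))

chunks <- split_samples(colnames(expr), chunk_size = 4)

# Analyze each chunk separately; drop = FALSE keeps one-column chunks as matrices
results <- lapply(chunks, function(ids) score_chunk(expr[, ids, drop = FALSE]))

# Merge the per-chunk results back into one table
merged <- do.call(rbind, results)
rownames(merged) <- NULL
print(merged)
```

With the real tool, each chunk would be written to its own input file and passed to TIP_integration with a matching sampleNumber and Samples string.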
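Once the scores are computed, we can evaluate the anti-cancer immune response. For example, as in my single-cell tutorial, you can split samples into high and low expression groups for a single gene and compare the activation score of each step between the groups. For related reading, see the article below: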
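A minimal sketch of such a comparison is shown below. It uses simulated data, a median split, and a Wilcoxon rank-sum test. The column names `Step1` and `Step2` are placeholders, not the actual column names in the TIP output.

```r
set.seed(42)
n <- 40

# Simulated data for illustration only: expression of one gene and
# activation scores for two placeholder steps
gene_expr <- rexp(n)
scores <- data.frame(Step1 = rnorm(n), Step2 = rnorm(n))

# Median split into low and high expression groups
group <- factor(ifelse(gene_expr > median(gene_expr), "high", "low"),
                levels = c("low", "high"))

# Wilcoxon rank-sum test of each step's score between the two groups
p_values <- sapply(scores, function(s) {
  wilcox.test(s[group == "high"], s[group == "low"])$p.value
})

# Median score per group and step, to see the direction of any difference
medians <- sapply(scores, function(s) tapply(s, group, median))

print(p_values)
print(medians)
```

When many steps are tested at once, adjusting the p-values (for example with `p.adjust(p_values, method = "BH")`) is a reasonable precaution.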
Reference:
TIP: A Web Server for Resolving Tumor Immunophenotype Profiling
https://mp.weixin.qq.com/s/1MUezfxnIlDAzdMkG9YtHA