IMB-Computational-Genomics-Lab / ascend

R package - Analysis of Single Cell Expression, Normalisation and Differential expression (ascend)

NewAEMSet requires the genes matrix to be in order 1: ensembl_id then 2: gene_name #2

Closed quanaibn closed 7 years ago

quanaibn commented 7 years ago

If the column order is changed, NewAEMSet fails with this error: Error in `$<-.data.frame`(`*tmp*`, "control", value = TRUE) : replacement has 1 row, data has 0. I think this could be solved by calling columns by name?
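
The idea of calling columns by name can be sketched in base R as follows. This is only an illustration, not ascend's own code: `order_gene_columns` is a hypothetical helper, the toy identifiers are made up, and the assumption that the column matching the expression matrix row names has to come first is inferred from the behaviour reported in this issue.

```r
# Hypothetical helper (not part of ascend): move the identifier column that
# matches the expression matrix row names to the front, selecting by name.
order_gene_columns <- function(genes, expression.matrix) {
  stopifnot(is.data.frame(genes), !is.null(rownames(expression.matrix)))
  # Flag each column whose values are exactly the matrix row names
  matches <- vapply(genes, function(column) {
    identical(as.character(column), rownames(expression.matrix))
  }, logical(1))
  if (!any(matches)) {
    stop("No column in 'genes' matches the row names of the expression matrix")
  }
  primary <- names(genes)[which(matches)[1]]
  genes[, c(primary, setdiff(names(genes), primary)), drop = FALSE]
}

# Toy data standing in for the real dataset
genes <- data.frame(ensembl_id = c("ID1", "ID2"),
                    gene_name = c("Rpl3", "Cd4"),
                    stringsAsFactors = FALSE)
expression.matrix <- matrix(0, nrow = 2, ncol = 2,
                            dimnames = list(genes$gene_name, c("c1.1", "c2.1")))

gene.information <- order_gene_columns(genes, expression.matrix)
colnames(gene.information)
```

With a helper like this, the position of the columns in the input data frame would no longer matter, because the matching column is found by content and the rest are kept by name.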

asenabouth commented 7 years ago

Could you please post a snippet to show how you came across the error? Thanks

quanaibn commented 7 years ago

exprs_mat <- read.csv(paste0(path, 'mmThymus_scRNA_Aggr_ExpressionMatrix_V4.csv'), header=T)

exprs_mat contains 3 batches, marked by the barcode suffixes .1 to .3. It has a row.names column with Ensembl IDs and a gene_id column with gene symbols.

exprs_matrix <- exprs_mat[, 3:ncol(exprs_mat)]
barcodes <- colnames(exprs_matrix)
batch.information <- lapply(strsplit(barcodes, '\\.'), `[`, 2)
barcodes <- as.data.frame(barcodes)
colnames(barcodes) <- c("cell_barcode")
barcodes$batch <- as.numeric(batch.information)

genes <- exprs_mat[, 1:2]
colnames(genes) <- c("ensembl_id", "gene_name")
gene.names <- make.unique(as.vector(genes$gene_name))
rownames(exprs_matrix) <- gene.names
genes$gene_name <- gene.names

mito.genes <- rownames(exprs_matrix)[grep("^Mt-", rownames(exprs_matrix), ignore.case = TRUE)]
ribo.genes <- rownames(exprs_matrix)[grep("^Rps|^Rpl", rownames(exprs_matrix), ignore.case = TRUE)]
control.list <- list(Mt = mito.genes, Rb = ribo.genes)

expression.matrix <- exprs_matrix
gene.information <- genes[, c(2, 1)]
cell.information <- barcodes

aem.set <- NewAEMSet(ExpressionMatrix = expression.matrix, CellInformation = cell.information, GeneInformation = gene.information, Controls = control.list)

The NewAEMSet error occurs when the columns are in this order: gene.information <- genes[,c(1,2)]
If I swap them, gene.information <- genes[,c(2,1)], NewAEMSet runs without error.

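
One plausible reading of the error message, not confirmed anywhere in this thread, is that the control genes are looked up in the first column of the gene information, so a lookup against Ensembl IDs finds nothing and a later assignment hits a zero-row data frame. The base R sketch below reproduces the same message with made-up toy data; it does not use or mirror ascend's internals.

```r
# Toy gene annotation with the Ensembl-style column first (made-up values)
gene.information <- data.frame(ensembl_id = c("ID1", "ID2"),
                               gene_name = c("mt-Nd1", "Cd4"),
                               stringsAsFactors = FALSE)
control.genes <- c("mt-Nd1")

# Assumption: controls are matched against the FIRST column only.
# Gene symbols never match the IDs, so the subset has zero rows.
hits <- gene.information[gene.information[[1]] %in% control.genes, , drop = FALSE]
nrow(hits)

# Assigning a length-1 value to a column of a zero-row data frame fails
result <- tryCatch({
  hits$control <- TRUE
  "ok"
}, error = function(e) conditionMessage(e))
print(result)
```

If the columns are swapped so that gene_name comes first, the same lookup returns one row and the assignment succeeds, which would match the behaviour described above.
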
asenabouth commented 7 years ago

Please use the inbuilt function ConvertGeneAnnotation to change the annotations. This will update the control names in the object, move the columns around and sync all the slots together.