edroaldo / cellrouter

Reconstruction of complex single-cell trajectories using CellRouter
45 stars 21 forks source link

Whether can I use specific genes instead of PCto construct #14

Open zorrodong opened 6 years ago

zorrodong commented 6 years ago

Hi, Thanks for the powerful tool. I have two questions:

1, whether can I use specific genes instead of PCs to construct the kNN network?

  1. When I used the function findPaths for my data, it generated the following errors, however, it was successful for the myeloid progenitors you provided. Could you please help me ?

Thanks very much.

cellrouter <- findPaths(cellrouter, column='Batch', libdir, paste(getwd(), 're-------------------Transition: B2.B3 ----------------------- 1) Computing flow network Creating flow network... Uncaught exception - null java.lang.NullPointerException at cellrouter.InputOutput.getNetworkData(InputOutput.java:54) at cellrouter.CellRouter.createFlowNetwork(CellRouter.java:83) at cellrouter.CellRouter.runAnalysis(CellRouter.java:63) at cellrouter.CellRouter.main(CellRouter.java:244) -------------------Transition: B2.B1 ----------------------- 1) Computing flow network Creating flow network... Uncaught exception - null java.lang.NullPointerException at cellrouter.InputOutput.getNetworkData(InputOutput.java:54) at cellrouter.CellRouter.createFlowNetwork(CellRouter.java:83) at cellrouter.CellRouter.runAnalysis(CellRouter.java:63) at cellrouter.CellRouter.main(CellRouter.java:244)

edroaldo commented 6 years ago

Hi,

Thanks for trying CellRouter.

Regarding 1: I think I did not undestand your first question. The kNN is performed in "cell space" not at the "gene space". In principle, we can use the parameter genes.use to perform a PCA analysis using selected genes and than build the kNN graph from the PCA analysis generated genes defined in genes.use. You can do this by using, for example:

cellrouter <- computePCA(cellrouter, num.pcs = 30, genes.use=rownames(cellrouter@ndata), seed=42)

where you can change rownames(cellrouter@ndata) to use the gene list you want...

Regarding 2:

Can you send me the script you are using. This usually happens when the file "cell_edge_weighted_network.txt" is not in the right directory. I would recommend use the same directory structure as in the tutorial... Please, send me the lines of code before findPaths, potentially starting at buildKNN, so I can have a better idea of what you are doing...

Thanks!

Em ter, 14 de ago de 2018 às 10:38, zorrodong notifications@github.com escreveu:

Hi, Thanks for the powerful tool. I have two questions:

1, whether can I use specific genes instead of PCs to construct the kNN network?

  1. When I used the function findPaths for my data, it generated the following errors, however, it was successful for the myeloid progenitors you provided. Could you please help me ?

Thanks very much.

cellrouter <- findPaths(cellrouter, column='Batch', libdir, paste(getwd(), 're-------------------Transition: B2.B3 -----------------------

  1. Computing flow network Creating flow network... Uncaught exception - null java.lang.NullPointerException at cellrouter.InputOutput.getNetworkData(InputOutput.java:54) at cellrouter.CellRouter.createFlowNetwork(CellRouter.java:83) at cellrouter.CellRouter.runAnalysis(CellRouter.java:63) at cellrouter.CellRouter.main(CellRouter.java:244) -------------------Transition: B2.B1 -----------------------
  2. Computing flow network Creating flow network... Uncaught exception - null java.lang.NullPointerException at cellrouter.InputOutput.getNetworkData(InputOutput.java:54) at cellrouter.CellRouter.createFlowNetwork(CellRouter.java:83) at cellrouter.CellRouter.runAnalysis(CellRouter.java:63) at cellrouter.CellRouter.main(CellRouter.java:244)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/edroaldo/cellrouter/issues/14#issuecomment-412895015, or mute the thread https://github.com/notifications/unsubscribe-auth/AJqUR8KtZjHTuZBjlrwN9qLfYDj3Pfrvks5uQuDpgaJpZM4V8jta .

-- Edroaldo

zorrodong commented 6 years ago

Thanks for the prompt reply. The second question is solved following your adivece.

However, for the first question, it will generate error if one uses specific genesets. I think it is beacuse of this code: gene.loadings <- pca$v ........ rownames(gene.loadings) <- rownames(object@scale.data).

'cause these two genesets are not the same.

setGeneric("computePCA", function(object, num.pcs, genes.use=NULL, seed=1) standardGeneric("computePCA")) setMethod("computePCA", signature="CellRouter", definition=function(object, num.pcs, genes.use, seed){

computePCA <- function(data, num.pcs, seed=7){

        library(irlba)
        if (!is.null(seed)) {
          set.seed(seed = seed)
        }
        if(is.null(genes.use)){
          #genes.use <- object@var.genes
          genes.use <- rownames(object@ndata)
        }
        pca <- irlba(A = t(object@scale.data[genes.use,]), nv = num.pcs)
        gene.loadings <- pca$v
        #cell.embeddings <- pca$u
        cell.embeddings <- pca$u %*% diag(pca$d)
        sdev <- pca$d/sqrt(max(1, ncol(data) - 1))
        rownames(gene.loadings) <- rownames(object@scale.data)
        colnames(gene.loadings) <- paste0('PC', 1:num.pcs)
        rownames(cell.embeddings) <- colnames(object@scale.data)
        colnames(cell.embeddings) <- colnames(gene.loadings)

        object@pca <- list(gene.loadings = gene.loadings, cell.embeddings=cell.embeddings, sdev=sdev)
        object@rdimension <- as.data.frame(object@pca$cell.embeddings)

        object
      }

)

edroaldo commented 6 years ago

Can you please send me the few lines of code up to computePCA such that I can have some context on how you are doing this? If I remember correctly, I did performed PCA with computePCA using selected gene sets sometime ago... Please, send me relevant lines of your script... it makes easier to debug...

Thanks!

Em qua, 15 de ago de 2018 às 03:58, zorrodong notifications@github.com escreveu:

Thanks for the prompt reply. The second question is solved following your adivece.

However, for the first question, it will generate error if one uses specific genesets. I think it is beacuse of this code: gene.loadings <- pca$v ........ rownames(gene.loadings) <- rownames(object@scale.data).

'cause these two genesets are not the same.

setGeneric("computePCA", function(object, num.pcs, genes.use=NULL, seed=1) standardGeneric("computePCA")) setMethod("computePCA", signature="CellRouter", definition=function(object, num.pcs, genes.use, seed){

computePCA <- function(data, num.pcs, seed=7){

library(irlba) if (!is.null(seed)) { set.seed(seed = seed) } if(is.null(genes.use)){

genes.use <- object@var.genes

genes.use <- rownames(object@ndata) } pca <- irlba(A = t(object@scale.data[genes.use,]), nv = num.pcs) gene.loadings <- pca$v

cell.embeddings <- pca$u

cell.embeddings <- pca$u %*% diag(pca$d) sdev <- pca$d/sqrt(max(1, ncol(data) - 1)) rownames(gene.loadings) <- rownames(object@scale.data) colnames(gene.loadings) <- paste0('PC', 1:num.pcs) rownames(cell.embeddings) <- colnames(object@scale.data) colnames(cell.embeddings) <- colnames(gene.loadings)

    object@pca <- list(gene.loadings = gene.loadings, cell.embeddings=cell.embeddings, sdev=sdev)
    object@rdimension <- as.data.frame(object@pca$cell.embeddings)

    object
  }

)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/edroaldo/cellrouter/issues/14#issuecomment-413123181, or mute the thread https://github.com/notifications/unsubscribe-auth/AJqUR-eU3D5MVWAiPwKp22tueB0tyRG3ks5uQ9SegaJpZM4V8jta .

-- Edroaldo

zorrodong commented 6 years ago

source('/datb1/dongji/project/Wuxinglong/CellRouter/cellrouter-master/CellRouter_Class.R') library('dplyr')

load("/datb1/dongji/project/Wuxinglong/CellRouter/GAS31/GAS_TPM.RData")###TPM data sample <- read.table("GAS29.xls",header=T,row.names = 1)

GAS_sample <- GAS_TPM[,as.character(rownames(sample))] GAS_expr <- log2(GAS_sample/10 + 1)

GAS_expr <- GAS_expr[rowSums(GAS_expr>1)>=3,] cellrouter <- CellRouter(GAS_expr)

cellrouter <- addInfo(cellrouter, metadata = sample, colname = 'Batch', metadata.column='Batch')

cellrouter <- Normalize(cellrouter)

cellrouter <- scaleData(cellrouter)

marker <- as.character(read.table("gene.txt")$V1) marker_use <- marker[marker %in% rownames(GAS_expr)]

cellrouter <- computePCA(cellrouter, genes.use = marker_use, num.pcs = 30, seed=42) ###This will generate the error

zorrodong commented 6 years ago

The error is shown below:

cellrouter <- computePCA(cellrouter,

edroaldo commented 6 years ago

Got it. The PCA analysis is performed using the scaled data. Can you please try: cellrouter <- scaleData(cellrouter, genes.use=marker_use) cellrouter <- computePCA(cellrouter, genes.use = marker_use, num.pcs = 30, seed=42)

I will add some checks to avoid this issue in the future and release with a new version of CellRouter. I apologize. Let me know if that works...

Thanks!

Em qua, 15 de ago de 2018 às 09:29, zorrodong notifications@github.com escreveu:

The error is shown below:

cellrouter <- computePCA(cellrouter,

-

                    genes.use = marker_use,

-

                    num.pcs = 30, seed=42)

Error in rownames<-(tmp, value = c("A1BG", "A1CF", "A2M", "A2M-AS1", : length of 'dimnames' [1] not equal to array extent

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/edroaldo/cellrouter/issues/14#issuecomment-413197527, or mute the thread https://github.com/notifications/unsubscribe-auth/AJqUR05PEwhKq8TN8D1xSz2-gsk7MHkRks5uRCJBgaJpZM4V8jta .

-- Edroaldo

zorrodong commented 6 years ago

That is great, I think it will work.

Before, I used the specific genes to perform the PCA, and then use this PCA to replace the PCA of whole transcriptome data.

Your solution is much better. Thanks very much!

Best,

Zorro

edroaldo commented 6 years ago

Cool! Keep me posted!

Thanks much!

Em qua, 15 de ago de 2018 às 09:52, zorrodong notifications@github.com escreveu:

That is great, I think it will work.

Before, I used the specific genes to perform the PCA, and then use this PCA to replace the PCA of whole transcriptome data.

Your solution is much better. Thanks very much!

Best,

Zorro

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/edroaldo/cellrouter/issues/14#issuecomment-413203902, or mute the thread https://github.com/notifications/unsubscribe-auth/AJqURzFgJSRU6zjpv35MtXLz3D3hWEx2ks5uRCePgaJpZM4V8jta .

-- Edroaldo