digitalcytometry / ecotyper

EcoTyper is a machine learning framework for large-scale identification of cell states and cellular ecosystems from gene expression data.

EcoTyper_recovery_bulk.R #83

Closed chrismahony closed 4 months ago

chrismahony commented 11 months ago

I am trying to run with the following parameters:

    Rscript EcoTyper_recovery_bulk.R -d 'Carcinoma' -m '/home/data_all_samples.txt' \
        -a '/home/Kira_Tue_bulk/annotation_all.txt' -c 'condition' -o '/home/Kira_Tue_bulk/ecotyper'

But I get this error:

    Running cell state recovery on: NK.cells...
    Running cell state recovery on: Mast.cells...
    Running cell state recovery on: PMNs...
    Running cell state recovery on: Dendritic.cells...
    Running cell state recovery on: PCs...
    Running cell state recovery on: B.cells...
    Running cell state recovery on: Monocytes.and.Macrophages...
    Running cell state recovery on: Fibroblasts...
    Running cell state recovery on: CD4.T.cells...
    Running cell state recovery on: CD8.T.cells...
    Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    In addition: Warning messages:
    1: In validityMethod(object) : Dimensions of W and H look strange [ncol(W)= 19 > ncol(H)= 14 ]
    2: In max(target, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
    3: In validityMethod(object) : Dimensions of W and H look strange [ncol(W)= 19 > ncol(H)= 14 ]
    4: In validityMethod(object) : Dimensions of W and H look strange [ncol(W)= 19 > ncol(H)= 14 ]
    5: In min(x, na.rm = TRUE) : no non-missing arguments to min; returning Inf
    6: In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) : no non-missing arguments to min; returning Inf
    7: In validityMethod(object) : Dimensions of W and H look strange [ncol(W)= 19 > ncol(H)= 14 ]
    Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    In addition: Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    In addition: Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    Warning messages: In addition: Timing stopped at: 0.014 0.001 0.018
    Warning messages: In addition: 1: Warning messages: 1: Warning messages:
    In max(target, na.rm = TRUE) :1: In max(target, na.rm = TRUE) :1: In max(target, na.rm = TRUE) : In max(target, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
    no non-missing arguments to max; returning -Inf 2: no non-missing arguments to max; returning -Inf
    2: In min(x, na.rm = TRUE) :2: no non-missing arguments to max; returning -Inf In min(x, na.rm = TRUE) :In min(x, na.rm = TRUE) : 2: no non-missing arguments to min; returning Inf
    In min(x, na.rm = TRUE) : no non-missing arguments to min; returning Inf 3: no non-missing arguments to min; returning Inf
    3: In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) :3: In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) : no non-missing arguments to min; returning Inf In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) : 3: no non-missing arguments to min; returning Inf
    In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) : no non-missing arguments to min; returning Inf no non-missing arguments to min; returning Inf
    no non-missing arguments to min; returning Inf
    Execution halted
    Timing stopped at: 0.017 0 0.076
    Timing stopped at: 0.016 0.001 0.066
    Timing stopped at: 0.016 0.001 0.065
    Timing stopped at: 0.017 0 0.067
    Execution halted
    Execution halted
    Execution halted
    Execution halted
    Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    In addition: Warning messages:
    1: In max(target, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
    2: In min(x, na.rm = TRUE) : no non-missing arguments to min; returning Inf
    3: In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) : no non-missing arguments to min; returning Inf
    Timing stopped at: 0.016 0 0.016
    Execution halted
    Error in NMF:::std.divergence.update.h(v, w, h, nbterms = nb, ncterms = nc, : REAL() can only be applied to a 'numeric', not a 'logical'
    Calls: NMFpredict ... .local -> run -> run -> .local -> updateFun ->
    In addition: Warning messages:
    1: In max(target, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
    2: In min(x, na.rm = TRUE) : no non-missing arguments to min; returning Inf
    3: In min(rowSums(x, na.rm = TRUE), na.rm = TRUE) : no non-missing arguments to min; returning Inf
    Timing stopped at: 0.017 0 0.017
    Execution halted
    Running cell state recovery on: Endothelial.cells...
    Running cell state recovery on: Epithelial.cells...
    Error in RunJobQueue() : EcoTyper failed. Please check the error message above!
    Execution halted

My expression matrix is converted from a DESeq2 object using fpkm() and looks like this:

    head(fpkm_data2)
        Gene             S1             S2             S3             S4             S5              S5           S6
    1    Clu 127.6201328769  36.3023291903   56.591067380   66.228285691  98.2296892403  27.43514276133  78.68055943
    2   Cst3 814.1578653673 636.8331494506  853.526679553 1002.911948547 750.0506474894 541.51674015428 611.70565655
    3   Actb 314.0446927492 348.3957412291  273.532077088  267.704869702 345.6709689601 375.02871525823 264.08462420
    4  Ahnak  31.7004565228  28.6906380297   32.789078313   48.161525613  23.9516445103  26.80028409706  35.03022713
    5   Apoe 686.2945440869 599.9594377606 1232.487811685  467.775376284 584.3852006736 268.30298566135 591.72631156
    6 Tmsb4x 149.0758873088 166.2052483316  124.761647529   84.234563831 146.4433038527 140.08398689242 106.26484590

Thanks for your help, Chris

chrismahony commented 8 months ago

Any chance you could help with this?

I have also tried converting my raw counts to TPM by using:

    count2tpm <- function(count_mx) {
        count_matrix <- count_mx
        # gene_lengths is not defined in this snippet; it is read from the calling
        # environment and its rows must be in the same order as count_mx
        gene_length <- gene_lengths$Length
        reads_per_rpk <- count_matrix / gene_length
        per_mil_scale <- colSums(reads_per_rpk) / 1000000
        tpm_matrix <- t(t(reads_per_rpk) / per_mil_scale)

        # Make sure they match the ENSG and gene order
        gene_ind <- rownames(count_mx)
        tpm_submatrix <- tpm_matrix[gene_ind, ]
        return(tpm_submatrix)
    }
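As a sanity check on the conversion above, here is a minimal sketch that applies the same arithmetic to toy data and confirms that each sample's TPM values sum to one million. All gene names, counts, and lengths are made up for illustration, and the lookup of lengths by gene name is an assumption about how `gene_lengths` might be organised, not something shown in this thread.

```r
# Toy data: every gene name, count, and length here is hypothetical.
counts <- matrix(
  c(10, 20, 30,
    40, 50, 60),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("S1", "S2"))
)
gene_lengths <- data.frame(Length = c(1000, 2000, 3000),
                           row.names = c("geneA", "geneB", "geneC"))

# Look up lengths by gene name so they line up with the rows of the count matrix.
len <- gene_lengths[rownames(counts), "Length"]

# Length-normalise: each row is divided by its own gene length.
rate <- counts / len

# Rescale every sample (column) so that its values sum to one million.
tpm <- t(t(rate) / colSums(rate)) * 1e6

colSums(tpm)
```

If the column sums are not all equal to one million, or if `len` contains `NA`, the gene lengths and the count matrix are probably not aligned.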

But I still get the same message. Any help would be greatly appreciated. Thanks, Chris

chrismahony commented 8 months ago

P.S. Just a quick check on the expression matrix:

    table(is.na(counts2_j_TPM))

     FALSE
    278362

    table(is.infinite(counts2_j_TPM))

     FALSE
    278362

cbsteen commented 4 months ago

Thank you for your interest in EcoTyper, and I apologize for the late response. It seems that you have very few samples; we recommend running EcoTyper on more than 25 samples.
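To check a matrix against that recommendation before launching a run, something like the following sketch could be used. It assumes a tab-delimited file whose first column holds gene names and whose remaining columns are samples. The toy file and the `min_samples` name are made up for illustration, and the value 25 is simply the figure quoted above, not a setting read from EcoTyper.

```r
# Write a small made-up matrix to a temporary file so the sketch runs on its own.
toy <- data.frame(Gene = c("geneA", "geneB"),
                  S1 = c(1.5, 2.0), S2 = c(0.3, 4.1), S3 = c(7.2, 0.0))
path <- tempfile(fileext = ".txt")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back: the first column becomes the row names, the rest are samples.
expr <- read.delim(path, row.names = 1, check.names = FALSE)

n_samples <- ncol(expr)
min_samples <- 25  # hypothetical name; threshold taken from the recommendation above

# Confirm that every sample column was parsed as numeric.
all_numeric <- all(vapply(expr, is.numeric, logical(1)))

if (n_samples <= min_samples) {
  message("Only ", n_samples, " samples; more than ", min_samples, " are recommended.")
}
```

Replacing `path` with the file passed to `-m` would run the same checks on the real input.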