Closed by RK1912 3 years ago
Hi RK,
Please double-check your cell.type.labels. I think there might be NA values in them.
Best,
Tinyi
BayesPrism has been upgraded to v1.2 with new built-in functions to remove ribosomal and mitochondrial genes and genes on sex chromosomes. See the updated vignette and help function for more details. Feel free to check it out.
Hi Tinyi! Thanks for your quick response. I checked my labels and everything seems to be there. But maybe there has been a misunderstanding. My current data for run.Ted() looks like this:
ref.dat is a 2627 x 10652 data frame where row names are unique cell IDs and column names are gene names.
X is the bulk data (a 50 x 10652 data frame) where row names are the names of the bulk samples and column names are gene names, the same as in ref.dat.
cell.type.labels are cell type names from the metadata file, and each cell type name corresponds to a cell ID in ref.dat.
For example: if the row names (unique cell IDs) are "cell_id1", "cell_id2", "cell_id3", "cell_id4", and the first 2 belong to the cell type "cell_type1" and the last 2 belong to "cell_type2", then cell.type.labels = "cell_type1", "cell_type1", "cell_type2", "cell_type2".
I am not sure if I misunderstood the cell.type.names or if I should change the row names to cell type names instead of unique ids.
Please let me know if this is the right implementation.
Thanks, RK
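The layout described in this post can be sketched with toy data in base R. This is only an illustration under stated assumptions: the gene names, sample names, and counts below are made up, and the checks are generic sanity checks, not anything the package itself is confirmed to run.

```r
# Toy data mirroring the layout described above (all names and values are illustrative).
set.seed(1)
genes <- c("geneA", "geneB", "geneC")

# Reference: one row per cell, one column per gene.
ref.dat <- matrix(rpois(4 * 3, lambda = 5), nrow = 4,
                  dimnames = list(paste0("cell_id", 1:4), genes))

# Bulk: one row per sample, one column per gene.
X <- matrix(rpois(2 * 3, lambda = 50), nrow = 2,
            dimnames = list(c("bulk1", "bulk2"), genes))

# One label per row of ref.dat, in the same order as the rows.
cell.type.labels <- c("cell_type1", "cell_type1", "cell_type2", "cell_type2")

# Generic sanity checks before attempting a deconvolution run.
stopifnot(length(cell.type.labels) == nrow(ref.dat))  # one label per cell
stopifnot(!anyNA(cell.type.labels))                   # no missing labels
shared.genes <- intersect(colnames(ref.dat), colnames(X))
stopifnot(length(shared.genes) > 0)                   # reference and bulk share genes

# Number of cells per cell type.
print(table(cell.type.labels))
```

If any of these checks fails on real data, the mismatch is in the inputs rather than in how the labels are meant to be used.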
Hi RK,
Row names of ref.dat can be unique cell barcodes. Could you run table(is.na(cell.type.labels))? Also try converting the data frame to a matrix and see if it works.
Best,
Tinyi
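The two checks suggested above (counting NA labels and converting the data frame to a matrix) might look like the following in base R. The toy inputs are hypothetical, and dropping cells with missing labels is just one possible way to handle them, not something the thread prescribes.

```r
# Hypothetical toy inputs; the real ref.dat and labels come from the user's own data.
ref.df <- data.frame(geneA = c(3, 0, 7, 1),
                     geneB = c(5, 2, 0, 4),
                     row.names = paste0("cell_id", 1:4))
cell.type.labels <- c("cell_type1", NA, "cell_type2", "cell_type2")

# Count missing labels: a TRUE column in this table means some cells have no cell type.
print(table(is.na(cell.type.labels)))

# One option (an assumption, not the only fix): drop unlabeled cells from both objects.
keep <- !is.na(cell.type.labels)
ref.df <- ref.df[keep, , drop = FALSE]
cell.type.labels <- cell.type.labels[keep]

# Convert the data frame to a numeric matrix; row and column names are preserved.
ref.mat <- as.matrix(ref.df)
stopifnot(is.numeric(ref.mat), nrow(ref.mat) == length(cell.type.labels))
```

Note that as.matrix() only yields a numeric matrix when every column of the data frame is numeric; a stray character column would turn the whole matrix into character.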
Hi, I tried both suggestions and I still get the same error. I have the data here: https://github.com/RK1912/Deconv_data
Could you please help me figure this out?
Thanks, RK
Also, another question: can run.Ted() continue to process the remaining bulk samples even if one sample being processed on a core fails? That way we could still get results for the other samples when one fails.
Thanks, RK
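The thread does not say whether the package supports this. As a generic illustration of the idea in base R, each sample can be wrapped in tryCatch so that one failure is recorded instead of aborting the loop. The worker function deconvolve_one below is hypothetical and only stands in for per-sample processing; it is not a TED function.

```r
# Hypothetical per-sample worker; a placeholder for whatever processes one bulk sample.
deconvolve_one <- function(sample.counts) {
  if (anyNA(sample.counts)) stop("sample contains NA values")
  sample.counts / sum(sample.counts)  # placeholder computation, not a real deconvolution
}

# Toy bulk matrix: rows are samples, columns are genes; the second sample is broken.
X <- rbind(bulk1 = c(10, 30, 60),
           bulk2 = c(5, NA, 20),
           bulk3 = c(25, 25, 50))

# Process every sample; an error becomes a recorded message instead of stopping the run.
results <- lapply(rownames(X), function(id) {
  tryCatch(
    list(id = id, ok = TRUE, value = deconvolve_one(X[id, ])),
    error = function(e) list(id = id, ok = FALSE, value = conditionMessage(e))
  )
})
names(results) <- rownames(X)

# Names of the samples that failed.
failed <- names(results)[!vapply(results, function(r) r$ok, logical(1))]
print(failed)
```

One caveat: the console output in this thread includes a "pooling information across samples" step, so processing samples in isolation would not necessarily reproduce a joint run.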
I tried your data and did not see any problems running it. Here is the code:
library(TED)
X <- readRDS("X.rds")
ref.dat <- readRDS("sc.rds")
cell_types <- readRDS("cell_types.rds")
tcga.ted <- run.Ted(ref.dat = t(ref.dat), X = t(X), cell.type.labels = cell_types, input.type = "scRNA", n.cores = 10)
Console output as follows:
[1] "removing non-numeric genes..."
[1] "removing outlier genes..."
Number of outlier genes filtered= 3
[1] "aligning reference and mixture..."
[1] "No tumor reference is speficied. Reference profiles are treated equally."
[1] "run first sampling"
current sample ID:1 2 3 4 5 6 7 8 9 10
[1] "merge subtypes"
        SC_T1 SC_T4 SC_T3 SC_T6 SC_M3 SC_M2 SC_M1 SC_F4 SC_F3 SC_F2 SC_F1 SC_T2
Min.    0.000 0.000 0.000 0.000 0.002 0.000 0.000 0.000 0.000 0.000 0.000 0.000
1st Qu. 0.054 0.001 0.021 0.000 0.004 0.010 0.007 0.002 0.001 0.001 0.104 0.000
Median  0.097 0.008 0.039 0.000 0.012 0.072 0.048 0.033 0.055 0.009 0.126 0.001
Mean    0.126 0.032 0.041 0.019 0.041 0.084 0.058 0.069 0.094 0.037 0.125 0.002
3rd Qu. 0.187 0.039 0.061 0.032 0.049 0.111 0.073 0.136 0.171 0.047 0.164 0.002
Max.    0.341 0.189 0.085 0.097 0.199 0.254 0.188 0.209 0.298 0.166 0.286 0.016
        SC_T5 SC_B4 SC_B2 SC_B1 SC_B3 SC_M4
Min.    0.000 0.001 0.000 0.000 0.001 0.000
1st Qu. 0.009 0.059 0.004 0.001 0.010 0.002
Median  0.019 0.112 0.018 0.009 0.013 0.009
Mean    0.020 0.131 0.031 0.031 0.036 0.025
3rd Qu. 0.023 0.202 0.027 0.051 0.025 0.043
Max.    0.062 0.351 0.109 0.114 0.150 0.080
[1] "pooling information across samples"
[1] "run final sampling"
current sample ID:1 2 3 4 5 6 7 8 9 10
        SC_T1 SC_T4 SC_T3 SC_T6 SC_M3 SC_M2 SC_M1 SC_F4 SC_F3 SC_F2 SC_F1 SC_T2
Min.    0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
1st Qu. 0.006 0.000 0.001 0.000 0.000 0.001 0.001 0.000 0.000 0.000 0.002 0.000
Median  0.044 0.000 0.024 0.001 0.000 0.037 0.012 0.001 0.009 0.000 0.082 0.000
Mean    0.108 0.066 0.039 0.034 0.043 0.061 0.059 0.058 0.105 0.049 0.108 0.016
3rd Qu. 0.188 0.002 0.055 0.038 0.008 0.048 0.048 0.102 0.192 0.027 0.181 0.002
Max.    0.439 0.600 0.115 0.212 0.317 0.363 0.267 0.221 0.429 0.378 0.330 0.151
        SC_T5 SC_B4 SC_B2 SC_B1 SC_B3 SC_M4
Min.    0.000 0.000 0.000 0.000 0.000 0.000
1st Qu. 0.000 0.005 0.000 0.000 0.000 0.000
Median  0.011 0.055 0.004 0.001 0.001 0.007
Mean    0.025 0.096 0.058 0.019 0.033 0.024
3rd Qu. 0.026 0.162 0.008 0.018 0.003 0.033
Let me know if you cannot reproduce it.
Hi Tinyi, thanks for your help. I was able to run it previously when I replaced the unique column names in ref.dat with the cell type labels. But I realized I had input.type = "GEP", so maybe that made the difference. I have now run it with your code and I don't see any problems.
Thanks!
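The remark about input.type = "GEP" suggests that in that mode the reference is indexed by cell type rather than by individual cell. A minimal sketch of collapsing per-cell counts into one profile per cell type is shown below, using base R's rowsum(). That a GEP reference is a per-cell-type sum is an assumption made here for illustration, so the package help should be checked for the exact expectation.

```r
# Toy single-cell counts: rows are cells, columns are genes (illustrative values).
ref.mat <- matrix(c(1, 2,
                    3, 4,
                    5, 6,
                    7, 8),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("cell_id", 1:4), c("geneA", "geneB")))
cell.type.labels <- c("cell_type1", "cell_type1", "cell_type2", "cell_type2")

# Sum counts over all cells sharing a label, giving one row per cell type.
# Assumption: this collapsed matrix is what a "GEP"-style reference looks like.
gep <- rowsum(ref.mat, group = cell.type.labels)
print(gep)
```

With a per-cell reference and input.type = "scRNA", as in the code posted in this thread, no such collapsing is needed and the row names can stay as unique cell IDs.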
Hi, I recently came across TED and I am trying to use it on some synovial fluid bulk RNA-seq data, but I have been getting the following error when I use run.Ted():
Info :
Could you please let me know if I can use this tool for other kinds of data and, if so, how I can make this work?
Thanks, RK