navinlabcode / copykat

pred.test contains only 4925 cells: how did this happen, and can I keep all the cells? #31

Closed lxwang326 closed 2 years ago

lxwang326 commented 3 years ago

In exp.rawdata I have 5312 cells:

> exp.rawdata <- read.table("expFile.txt", header=T, sep='\t', check.names = F)
> dim(exp.rawdata)
[1] 17140  5312

But pred.test contains only 4925 cells. How did this happen, and how can I keep all the cells?

> copykat.test <- copykat(rawmat=exp.rawdata, 
+                         id.type="S", 
+                         cell.line="no", 
+                         ngene.chr=5, 
+                         win.size=25, 
+                         KS.cut=0.15, 
+                         sam.name="TNBC1", 
+                         distance="euclidean", 
+                         n.cores=1)
[1] "running copykat v1.0.4"
[1] "step1: read and filter data ..."
[1] "17140 genes, 5312 cells in raw data"
[1] "8427 genes past LOW.DR filtering"
[1] "step 2: annotations gene coordinates ..."
[1] "start annotation ..."
[1] "step 3: smoothing data with dlm ..."
[1] "step 4: measuring baselines ..."
number of iterations= 189 
number of iterations= 245 
number of iterations= 221 
number of iterations= 205 
number of iterations= 275 
number of iterations= 278 
[1] "low confidence in classification"
[1] "start manual mode"
[1] "copykat failed in locating normal cells; manual adjust performed with 69 immune cells"
[1] "step 5: segmentation..."
[1] "too few breakpoints detected; decreased KS.cut to 50%"
[1] "step 6: convert to genomic bins..."
[1] "step 7: adjust baseline ..."
[1] "step 8: final prediction ..."
[1] "step 9: saving results..."
[1] "step 10: ploting heatmap ..."
Time difference of 8.218909 hours
> saveRDS(copykat.test, file = "copykat.test.rds")
> copykat.test <- readRDS(file = "copykat.test.rds")
> end_time <- Sys.time()
> end_time - start_time
Error: object 'start_time' not found
> pred.test <- data.frame(copykat.test$prediction)
> CNA.test <- data.frame(copykat.test$CNAmat)
> head(pred.test)
                           cell.names copykat.pred
AAACCTGAGCTTCGCG.1 AAACCTGAGCTTCGCG.1    aneuploid
AAACCTGAGTGCTGCC.1 AAACCTGAGTGCTGCC.1    aneuploid
AAACCTGCACGGTAAG.1 AAACCTGCACGGTAAG.1    aneuploid
AAACCTGGTTCTGGTA.1 AAACCTGGTTCTGGTA.1    aneuploid
AAACGGGAGAACAATC.1 AAACGGGAGAACAATC.1    aneuploid
AAACGGGAGTCGCCGT.1 AAACGGGAGTCGCCGT.1      diploid
> dim(pred.test)
[1] 4925    2

ccruizm commented 3 years ago

Good day. I have the same question: how can I keep all cells and have them returned in the copykat result list? Thanks!

gaobio commented 2 years ago

@ccruizm @lxwang326 Sorry for the confusion. The current version adds the filtered cells back as 'not defined' cells in the prediction result file.
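
For anyone on an older version, the following is a minimal sketch of how the dropped cells could be identified and added back by hand. It uses toy data, not real copykat output: `all.cells` stands in for `colnames(exp.rawdata)`, and the names `dropped` and `pred.full` are hypothetical. It also assumes the cell names in the prediction table match the input column names exactly. The output above shows names ending in `.1`, so if the input uses a different separator, the names would need to be harmonised first.

```r
# Toy stand-ins for the objects in this thread (hypothetical values).
all.cells <- c("cellA", "cellB", "cellC", "cellD")  # e.g. colnames(exp.rawdata)
pred.test <- data.frame(
  cell.names   = c("cellA", "cellB", "cellC"),
  copykat.pred = c("aneuploid", "diploid", "aneuploid"),
  stringsAsFactors = FALSE
)

# Cells present in the input but absent from the prediction table.
dropped <- setdiff(all.cells, pred.test$cell.names)

# Append the dropped cells with a placeholder label so every input cell has a row.
pred.full <- rbind(
  pred.test,
  data.frame(
    cell.names   = dropped,
    copykat.pred = rep("not.defined", length(dropped)),
    stringsAsFactors = FALSE
  )
)

# Restore the order of the input columns.
pred.full <- pred.full[match(all.cells, pred.full$cell.names), ]

# Count cells per label, including the placeholder.
table(pred.full$copykat.pred)
```

With real data, `length(dropped)` should equal the difference between the number of input cells and `nrow(pred.test)` if the names match as assumed.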

cp-echo commented 2 years ago

The source code that adds the filtered cells back as 'not defined' cells in the prediction result file may be wrong. This line may contain a mistake:

res <- data.frame(cbind(c(names(com.preN),ndef), c(com.preN, rep("not.defined",length(ndef)))))
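
The comment does not say what the suspected mistake is, so the sketch below does not claim to fix it. It only shows one way the same two-column table could be built without going through `cbind`, which first coerces its inputs to a character matrix. It assumes, based on the quoted line, that `com.preN` is a named character vector of predictions and `ndef` is a character vector of filtered cell names. The values and the column names here are toy examples.

```r
# Hypothetical toy values; in copykat these objects are created internally.
com.preN <- c(cellA = "aneuploid", cellB = "diploid")  # named predictions
ndef     <- c("cellC", "cellD")                        # names of filtered-out cells

# Build the table column by column, with explicit column names.
res <- data.frame(
  cell.names   = c(names(com.preN), ndef),
  copykat.pred = c(unname(com.preN), rep("not.defined", length(ndef))),
  stringsAsFactors = FALSE
)

# Every classified and every filtered cell should have exactly one row.
stopifnot(nrow(res) == length(com.preN) + length(ndef))
```

Naming the columns explicitly avoids relying on the default names that `data.frame` assigns to the columns of an unnamed matrix.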