navinlabcode / copykat

error:dim(X) must have a positive length #26

Closed sunnyyakima closed 2 years ago

sunnyyakima commented 3 years ago

Hi, copykat is a really nice package and returns meaningful results. However, for some of my samples it returned the following error:

> Kydar12
An object of class Seurat
57874 features across 555 samples within 1 assay
Active assay: RNA (57874 features, 0 variable features)

> Kydar12_cn <- run_copykat(Kydar12)
[1] "running copykat v1.0.4"
[1] "step1: read and filter data ..."
[1] "2704 genes, 555 cells in raw data"
[1] "1928 genes past LOW.DR filtering"
[1] "WARNING: low data quality; assigned LOW.DR to UP.DR..."
[1] "step 2: annotations gene coordinates ..."
[1] "start annotation ..."
[1] "step 3: smoothing data with dlm ..."
[1] "step 4: measuring baselines ..."
number of iterations= 212
number of iterations= 28
[1] "low confidence in classification"
Error in apply(rawmat[which(rownames(rawmat) %in% c("PTPRC", "LYZ", "PECAM1")),  :
  dim(X) must have a positive length

gaobio commented 3 years ago

Hi sunnyyakima, according to the log your data quality is very poor. By default, CopyKAT requires at least 5 genes to represent a chromosome, but your data has very low gene coverage. As a last resort, you may try changing ngene.chr=5 to ngene.chr=3 or 4, changing the default LOW.DR=0.05 to LOW.DR=0.01 to retain more genes, and setting KS.cut=0.01. These may help to some extent...

Roger-GOAT commented 3 years ago

@gaobio Hi, same issue here. I changed the call to the following, but still get the error:

> copykat <- copykat(
+   rawmat = counts,
+   id.type = "S",
+   ngene.chr = 3,
+   LOW.DR = 0.01,
+   UP.DR = 0.2,
+   win.size = 25,
+   norm.cell.names = "",
+   KS.cut = 0.01,
+   sam.name = "",
+   distance = "euclidean",
+   n.cores = 16
+ )
[1] "running copykat v1.0.4"
[1] "step1: read and filter data ..."
[1] "22457 genes, 32001 cells in raw data"
[1] "11241 genes past LOW.DR filtering"
[1] "step 2: annotations gene coordinates ..."
[1] "start annotation ..."
[1] "step 3: smoothing data with dlm ..."
[1] "step 4: measuring baselines ..."
number of iterations= 1034 
number of iterations= 437 
number of iterations= 4055 
number of iterations= 693 
number of iterations= 1500 
number of iterations= 854 
[1] "low confidence in classification"
Error in apply(rawmat[which(rownames(rawmat) %in% c("PTPRC", "LYZ", "PECAM1")),  : 
  dim(X) must have a positive length

Thank you. Is my data quality very poor?
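
For reference, the error message itself can be reproduced in base R: when a matrix is subset down to a single row, R drops it to a plain vector with no dim, and apply() then stops with exactly this message. Whether this is what happens inside copykat is an assumption based only on the traceback above; the toy matrix and gene names other than the three markers are made up for illustration.

```r
# Toy genes x cells matrix standing in for rawmat (hypothetical data).
rawmat <- matrix(1:12, nrow = 4,
                 dimnames = list(c("PTPRC", "GAPDH", "ACTB", "EPCAM"),
                                 paste0("cell", 1:3)))
markers <- c("PTPRC", "LYZ", "PECAM1")

# Only one marker is present, so the subset collapses to a vector.
hits <- which(rownames(rawmat) %in% markers)
sub_dropped <- rawmat[hits, ]
print(is.null(dim(sub_dropped)))

# apply() on a vector raises the error seen in the log; capture its message.
err <- tryCatch(apply(sub_dropped, 2, mean),
                error = function(e) conditionMessage(e))
print(err)

# With drop = FALSE the subset stays a 1 x n matrix and apply() works.
sub_kept <- rawmat[hits, , drop = FALSE]
marker_mean <- apply(sub_kept, 2, mean)
print(marker_mean)
```

If this is the cause, the number of marker genes that survive filtering matters more than the overall gene count, which would explain why a run with 11241 genes can fail the same way as one with 1928.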

gaobio commented 2 years ago

Please confirm whether you ran this on mouse data. The current version just added a module to support mouse scRNA-seq data.
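
One quick check before running is to count how many of the three marker symbols from the error message appear in the row names, and whether more would match once case is ignored (mouse symbols are conventionally written like Ptprc). This is a minimal base R sketch; `count_marker_hits` is a hypothetical helper, not part of copykat, and `counts` here is a made-up stand-in for the real matrix.

```r
# Hypothetical helper: count marker symbols found in the gene names,
# both as exact matches and after upper-casing the gene names.
count_marker_hits <- function(gene_names,
                              markers = c("PTPRC", "LYZ", "PECAM1")) {
  c(exact = sum(markers %in% gene_names),
    ignoring_case = sum(markers %in% toupper(gene_names)))
}

# Made-up matrix with mouse-style symbols, standing in for the real counts.
counts <- matrix(0, nrow = 3, ncol = 2,
                 dimnames = list(c("Ptprc", "Lyz2", "Pecam1"), c("c1", "c2")))

hits <- count_marker_hits(rownames(counts))
print(hits)
```

Zero exact matches combined with several case-insensitive matches would suggest mouse-style gene symbols rather than poor data quality.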