AntonioDeFalco / SCEVAN

R package that automatically classifies the cells in scRNA data by segregating non-malignant cells of the tumor microenvironment from the malignant cells. It also infers the copy number profile of malignant cells, identifies subclonal structures, and analyses the specific and shared alterations of each subpopulation.
https://www.nature.com/articles/s41467-023-36790-9
GNU General Public License v3.0
90 stars, 25 forks

error in colnames<-(*tmp*, value = colnames(geData)) : attempt to set 'colnames' on an object with less than two dimensions #75

Closed. Githubxsw closed this issue 3 months ago

Githubxsw commented 12 months ago

Hello,

I ran into an error when analyzing CNVs with SCEVAN.

My code:

count_mtx <- cancer_Epi@assays$RNA@counts
results <- pipelineCNA(count_mtx, sample = "cancer_Epi", par_cores = 8, SUBCLONES = TRUE)

I encountered the following error:

Error in `colnames<-`(`*tmp*`, value = colnames(geData)) :
  attempt to set 'colnames' on an object with less than two dimensions
In addition: Warning message:
In parallel::mclapply(1:ncol(geData), execMww, mc.cores = ncore) :
  scheduled cores 3, 7, 10, 11, 15, 16, 20 did not deliver results, all values of the jobs will be affected
Timing stopped at: 1018 130.3 225.5

How do I solve this error?

Thank you!

AntonioDeFalco commented 12 months ago

Hi @Githubxsw, is the analyzed sample from human?

Regards.

carbui commented 5 months ago

Hi, I had the same problem (with a mouse genome, though). I solved it as you suggested in another issue, by using another environment with a different R version and different package versions. Below is the environment, taken from the session info in RStudio, that worked:

abind 1.4-5
babelgene 22.9
cachem 1.0.8
cli 3.6.2
cluster 2.1.6
codetools 0.2-19
colorspace 2.1-0
cowplot 1.1.3
data.table 1.15.2
deldir 2.0-4
devtools 2.4.5
digest 0.6.35
doParallel 1.0.17
dotCall64 1.1-1
dplyr 1.1.4
ellipsis 0.3.2
fansi 1.0.6
fastDummies 1.7.3
fastmap 1.1.1
fitdistrplus 1.1-11
foreach 1.5.2
fs 1.6.3
future 1.33.1
future.apply 1.11.1
generics 0.1.3
ggplot2 3.5.0
ggrepel 0.9.5
ggridges 0.5.6
globals 0.16.3
glue 1.7.0
goftest 1.2-3
gridExtra 2.3
gtable 0.3.4
htmltools 0.5.7
htmlwidgets 1.6.4
httpuv 1.6.14
httr 1.4.7
ica 1.0-3
igraph 2.0.3
irlba 2.3.5.1
iterators 1.0.14
jsonlite 1.8.8
KernSmoo 2.23-22
later 1.3.2
lattice 0.22-6
lazyeval 0.2.2
leiden 0.4.3.1
lifecycle 1.0.4
listenv 0.9.1
lmtest 0.9-40
magrittr 2.0.3
MASS 7.3-60.0.1
Matrix 1.6-5
matrixStats 1.2.0
memoise 2.0.1
mime 0.12
miniUI 0.1.1.1
msigdbr 7.5.1
munsell 0.5.0
nlme 3.1-164
parallelly 1.37.1
patchwork 1.2.0
pbapply 1.7-2
pillar 1.9.0
pkgbuild 1.4.4
pkgconfig 2.0.3
pkgload 1.3.4
plotly 4.10.4
plyr 1.8.9
png 0.1-8
polyclip 1.10-6
profvis 0.3.8
progressr 0.14.0
promises 1.2.1
purrr 1.0.2
R6 2.5.1
RANN 2.6.1
RColorBrewer 1.1-3
Rcpp 1.0.12
RcppAnnoy 0.0.22
RcppHNSW 0.6.0
remotes 2.5.0
reshape2 1.4.4
reticulate 1.35.0
rlang 1.1.3
ROCR 1.0-11
RSpectra 0.16-1
rstudioapi 0.15.0
Rtsne 0.17
scales 1.3.0
scattermore 1.2
SCEVAN 1.0.1
sctransform 0.4.1
sessioninfo 1.2.2
Seurat 5.0.3
SeuratObject 5.0.1
shiny 1.8.0
sp 2.1-3
spam 2.10-0
spatstat.data 3.0-4
spatstat.explore 3.2-7
spatstat.geom 3.2-9
spatstat.random 3.2-3
spatstat.sparse 3.0-3
spatstat.utils 3.0-4
stringi 1.8.3
stringr 1.5.1
survival 3.5-8
tensor 1.5
tibble 3.2.1
tidyr 1.3.1
tidyselect 1.2.1
urlchecker 1.0.1
usethis 2.2.3
utf8 1.2.4
uwot 0.1.16
vctrs 0.6.5
viridisLite 0.4.2
xtable 1.8-4
zoo 1.8-12

cesimsek commented 5 months ago

I have the same issue with the following code:

results <- SCEVAN::pipelineCNA(count_mtx, 
                                sample = paste0(sample, "_test"),
                                par_cores = 20,
                                SUBCLONES = FALSE, ClonalCN = FALSE,
                                plotTree = FALSE)

And I get:

[1] " raw data - genes: 26298 cells: 8149"
[1] "1) Filter: cells > 200 genes"
[1] "low data quality"
[1] "2) Filter: genes > 5% of cells"
[1] "7843 genes past filtering"
[1] "3) Annotations gene coordinates"
Error in `colnames<-`(`*tmp*`, value = colnames(geData)) : 
  attempt to set 'colnames' on an object with less than two dimensions
Timing stopped at: 0.006 1.194 3.097
sessionInfo()

R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

copykat_1.1.0    doParallel_1.0.17   
iterators_1.0.14    foreach_1.5.2     
Seurat_5.0.2      SeuratObject_5.0.1   sp_2.1-3            
(other packages not shown)

Edit: I increased the memory and decreased par_cores to 10, and it works for now, but a clearer error message would be helpful in the future.
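A minimal sketch of this workaround, for anyone who wants to automate it: retry the run with half as many cores each time it fails. The helper `retry_with_fewer_cores` and the stand-in `fake_run` are hypothetical names for illustration and are not part of SCEVAN. In real use, `run` would be a small wrapper that calls `pipelineCNA` with `par_cores = cores`.

```r
# Hypothetical helper (not part of SCEVAN): call run(cores) and, if it
# throws an error, retry with half as many cores until min_cores is reached.
retry_with_fewer_cores <- function(run, cores, min_cores = 1L) {
  repeat {
    result <- tryCatch(run(cores), error = function(e) e)
    if (!inherits(result, "error")) {
      return(list(result = result, cores = cores))
    }
    if (cores <= min_cores) {
      stop(result)  # give up and re-signal the last error
    }
    message("Run failed with ", cores, " cores: ", conditionMessage(result))
    cores <- max(min_cores, cores %/% 2L)
  }
}

# Stand-in for the real pipeline call; it fails whenever more than 5 cores
# are requested, to imitate workers running out of memory.
fake_run <- function(cores) {
  if (cores > 5) stop("attempt to set 'colnames' on an object with less than two dimensions")
  cores * 2
}

out <- retry_with_fewer_cores(fake_run, cores = 20)
```

Halving is an arbitrary choice. Since the failure here seems to be memory related, lowering the core count reduces how many worker processes hold data at once, which is presumably why it helps.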

zpingfeng commented 3 months ago

I got a similar problem and error message when using the default settings. It works now that I have set organism="mouse": my data are from mouse, and the default organism is human.
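A rough sketch of a sanity check for this case, assuming the row names of the count matrix are gene symbols. Human symbols are almost all upper case (TP53), while mouse symbols are mostly capitalized (Trp53). `guess_organism` is a hypothetical helper, not a SCEVAN function, and the 0.8 threshold is an arbitrary assumption. It will misclassify Ensembl IDs and knows nothing about other organisms.

```r
# Hypothetical helper (not part of SCEVAN): crude guess of human vs mouse
# from the letter case of gene symbols.
guess_organism <- function(gene_symbols, threshold = 0.8) {
  symbols <- gene_symbols[!is.na(gene_symbols) & nzchar(gene_symbols)]
  if (length(symbols) == 0) stop("no gene symbols supplied")
  # Fraction of symbols that are entirely upper case
  frac_upper <- mean(symbols == toupper(symbols))
  if (frac_upper >= threshold) "human" else "mouse"
}

human_like <- guess_organism(c("TP53", "EGFR", "MYC", "GAPDH"))
mouse_like <- guess_organism(c("Trp53", "Egfr", "Myc", "Gapdh"))
```

The result could then be compared with the value passed as `organism` before starting a long run, for example `guess_organism(rownames(count_mtx))`.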

zpingfeng commented 3 months ago

I got the error message below. I have no idea what it means or how to solve the problem.

ctl_results <- pipelineCNA(ctl.df,sample = "Control",par_cores =10,organism="mouse")
[1] " raw data - genes: 27256 cells: 1692"
[1] "1) Filter: cells > 200 genes"
[1] "filtered out 1079 cells past filtering 620 cells"
[1] "low data quality"
[1] "2) Filter: genes > 5% of cells"
[1] "1203 genes past filtering"
[1] "3) Annotations gene coordinates"
[1] "found 0 confident non malignant cells"
[1] "1071 genes annotated"
[1] "4) Filter: genes involved in the cell cycle"
[1] "1037 genes past filtering "
[1] "5) Filter: cells > 5genes per chromosome "
[1] "6) Log Freeman Turkey transformation"
[1] "A total of 48 cells, 1037 genes after preprocessing"
[1] "7) Measuring baselines (pure tumor - synthetic normal cells)"
[1] "8) Smoothing data"
[1] "10) Adjust baseline"
[1] "11) plot heatmap"
[1] "found 48 tumor cells"
[1] "time classify tumor cells: 3.05413103103638"
[1] "found 3 subclones"
percentage_cells_subsclone_1 percentage_cells_subsclone_2 percentage_cells_subsclone_3
                   0.2916667                    0.3125000                    0.3958333
[1] "Segmentation of subclone : 1"
[1] "Segmentation of subclone : 2"
[1] "Segmentation of subclone : 3"
$Control_subclone1
   Chr     Start       End Alteration segm.mean
9    4 116636510 156013610          2  0.223382
12   6   4902917  88828360         -2 -0.206297
17   7 100607410 126800751          2  0.195066
48  19   4192158   9135636          2  0.290378
5    2 126791565 181655660          1  0.121053
7    3  27416162 157316445         -1 -0.163934
21   8 108703100 123404173          1  0.147549
22   9   7931999  22052023          1  0.102454
26  10   5020203  53111622         -1 -0.115894
29  11   3639785 102296631          1  0.144136
42  16  34745210  94469222         -1 -0.134550

$Control_subclone2
   Chr     Start       End Alteration segm.mean
11   5   4803391  36955078         -2 -0.222651
10   4 138454314 156013610         -1 -0.113823
15   6 124931388 148444389         -1 -0.149980

$Control_subclone3
   Chr    Start       End Alteration segm.mean
5    2 59160850 151632593          1  0.099423
9    4  3831334  82705750         -1 -0.133875
13   5  4803391 103855322         -1 -0.165325
35  15 74747852 103498800          1  0.198940
42  19  4192158  32466575          1  0.134320

$Control_clone
   Chr    Start       End Alteration segm.mean
46  18 44425061  89769528          2  0.158137
28  10 91116574 128626506          1  0.134476
40  18  6213913  89769528          1  0.114999

$Control_shareSubclone
                  Chr     Start       End Alteration segm.mean sh_sub Mean
Control_share2.7    3 118562129 157316445         -1 -0.090397    2-3    0
Control_share2.16   7   3332955  25398710         -1 -0.163503    2-3    0
Control_share2.29  12  69157722  99450111          1  0.164326    2-3    0
Control_share2.31  13   9276528 115101964         -1 -0.087791    2-3    0
Control_share2.33  14   8296274  51913488         -1 -0.151322    2-3    0

Error in data.frame(x = tsne$Y[, 1], y = tsne$Y[, 2], Subclones = pred) :
  arguments imply differing number of rows: 0, 48
In addition: There were 17 warnings (use warnings() to see them)

AntonioDeFalco commented 3 months ago

@zpingfeng I think the problem is related to the low quality of the sample: of the 1692 starting cells, only 48 remain after preprocessing, and all of them are classified as tumor cells. You can check with another sample to see whether the problem persists.
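As a quick check before running the pipeline, one can count how many cells would survive the first filter printed in the log ("cells > 200 genes"). This is only a sketch on a toy dense matrix with genes in rows and cells in columns. `cells_passing_filter` is a hypothetical helper, not a SCEVAN function, and it mirrors only that first filter, not the later ones. For a sparse `dgCMatrix`, `Matrix::colSums` would be needed instead of the base version.

```r
# Hypothetical helper (not part of SCEVAN): number of cells with more than
# min_genes detected genes, i.e. genes with a non-zero count in that cell.
cells_passing_filter <- function(count_mtx, min_genes = 200) {
  genes_per_cell <- colSums(count_mtx > 0)
  sum(genes_per_cell > min_genes)
}

# Toy data: 300 genes x 3 cells with 250, 150 and 201 detected genes.
count_mtx <- matrix(0, nrow = 300, ncol = 3,
                    dimnames = list(paste0("gene", 1:300),
                                    c("cellA", "cellB", "cellC")))
count_mtx[1:250, "cellA"] <- 1
count_mtx[1:150, "cellB"] <- 3
count_mtx[1:201, "cellC"] <- 2

n_pass <- cells_passing_filter(count_mtx)  # cellA and cellC pass, so 2
```

If only a small fraction of cells passes, as in the log above (620 of 1692 after the first filter), the later steps may be left with too few cells to work with.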

AntonioDeFalco commented 3 months ago

@cesimsek wrote: "I have increased the memory and decreased the par_cores=10 and it worked for now, but a clear error message could be better in the future."

Yes, this problem is generally solved by reducing the number of cores. I still have to check how to catch this error when running in parallel.
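One possible way to catch it, as a sketch only and not what SCEVAN currently does: when a forked worker dies, `parallel::mclapply` warns that the scheduled cores "did not deliver results" and leaves `NULL` in the affected positions, while a job that throws an error comes back as a `try-error` object. A wrapper can check for both and stop with a clearer message. `safe_mclapply` is a hypothetical name, and it assumes that `NULL` is never a legitimate return value of `FUN`.

```r
library(parallel)

# Hypothetical wrapper (not part of SCEVAN): run mclapply and stop with a
# descriptive message if any job returned NULL or a try-error.
safe_mclapply <- function(X, FUN, ..., mc.cores = 2L) {
  res <- mclapply(X, FUN, ..., mc.cores = mc.cores)
  failed <- vapply(
    res,
    function(r) is.null(r) || inherits(r, "try-error"),
    logical(1)
  )
  if (any(failed)) {
    stop(sprintf(
      "%d of %d parallel jobs did not return a result (indices: %s); try fewer cores or more memory",
      sum(failed), length(res), paste(which(failed), collapse = ", ")
    ))
  }
  res
}
```

A call such as `safe_mclapply(1:ncol(geData), execMww, mc.cores = ncore)` would then fail at the parallel step itself rather than later at `colnames<-`. Note that `mclapply` relies on forking, so values of `mc.cores` above 1 are not supported on Windows.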

zpingfeng commented 3 months ago

Thanks for the reply! It makes sense! I have no problem with other samples.