saeyslab / CytoNorm

R library to normalize cytometry data

testCV error #12

Closed emmanuelaaaaa closed 3 years ago

emmanuelaaaaa commented 4 years ago

Hello,

First of all, thank you for putting this together; batch normalisation in CyTOF data has been a large problem in the community for a while now. I have come across some unexpected behaviour when running testCV. It runs for some values of the cluster_values argument but gives an error for others. For example, `cvs <- testCV(fsom, cluster_values = seq(5,25,by=2))` runs fine, but this does not:

cvs <- testCV(fsom, cluster_values = seq(5,25,by=3))
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
  'data' must be of a vector type, was 'NULL'

I was wondering if you have any insight on what could be causing the problem. Many thanks, Emma


R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.10 (Final)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
 [1] LC_CTYPE=en_GB.ISO-8859-1       LC_NUMERIC=C                    LC_TIME=en_GB.ISO-8859-1        LC_COLLATE=en_GB.ISO-8859-1    
 [5] LC_MONETARY=en_GB.ISO-8859-1    LC_MESSAGES=en_GB.ISO-8859-1    LC_PAPER=en_GB.ISO-8859-1       LC_NAME=C                      
 [9] LC_ADDRESS=C                    LC_TELEPHONE=C                  LC_MEASUREMENT=en_GB.ISO-8859-1 LC_IDENTIFICATION=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] flowCore_1.52.1 dplyr_1.0.0     CytoNorm_0.0.5 

loaded via a namespace (and not attached):
 [1] Biobase_2.46.0              splines_3.6.3               jsonlite_1.6.1              ConsensusClusterPlus_1.50.0 R.utils_2.9.2              
 [6] ellipse_0.4.2               gtools_3.8.2                RcppParallel_5.0.1          stats4_3.6.3                latticeExtra_0.6-29        
[11] RBGL_1.62.1                 flowWorkspace_3.34.1        yaml_2.2.1                  robustbase_0.93-6           pillar_1.4.4               
[16] lattice_0.20-41             glue_1.3.2                  digest_0.6.25               RColorBrewer_1.1-2          colorspace_1.4-1           
[21] ggcyto_1.14.1               Matrix_1.2-18               R.oo_1.23.0                 plyr_1.8.6                  pcaPP_1.9-73               
[26] XML_3.99-0.3                pkgconfig_2.0.3             pheatmap_1.0.12             tsne_0.1-3                  fda_5.1.4                  
[31] zlibbioc_1.32.0             purrr_0.3.4                 corpcor_1.6.9               mvtnorm_1.1-1               scales_1.1.1               
[36] jpeg_0.1-8.1                openCyto_1.24.0             flowStats_3.44.0            tibble_3.0.1                generics_0.0.2             
[41] ggplot2_3.3.1               ellipsis_0.3.1              flowViz_1.50.0              BiocGenerics_0.32.0         hexbin_1.28.1              
[46] mnormt_1.5-6                magrittr_1.5                crayon_1.3.4                IDPmisc_1.1.20              mclust_5.4.6               
[51] ks_1.11.7                   R.methodsS3_1.8.0           MASS_7.3-51.6               graph_1.64.0                tools_3.6.3                
[56] data.table_1.12.8           ncdfFlow_2.32.0             flowClust_3.24.0            lifecycle_0.2.0             matrixStats_0.56.0         
[61] stringr_1.4.0               munsell_0.5.0               cluster_2.1.0               compiler_3.6.3              rlang_0.4.6                
[66] grid_3.6.3                  igraph_1.2.5                base64enc_0.1-3             gtable_0.3.0                rrcov_1.5-2                
[71] R6_2.4.1                    gridExtra_2.3               clue_0.3-57                 FlowSOM_1.18.0              CytoML_1.12.1              
[76] KernSmooth_2.23-17          Rgraphviz_2.30.0            stringi_1.4.6               parallel_3.6.3              Rcpp_1.0.4.6               
[81] vctrs_0.3.0                 png_0.1-7                   DEoptimR_1.0-8              tidyselect_1.1.0

SofieVG commented 4 years ago

Hi Emma,

Thanks for trying out our tool! I have not yet solved the issue in the code, but I can already explain what is happening. If the parameter plot is TRUE, the tool also generates an overview figure: the upper part shows the CV values for the different numbers of clusters, and underneath it shows the metacluster percentages for each of the files, for the number of metaclusters chosen in the model. This can help to investigate in which cluster the variation occurs and what size of percentages is involved. However, this part of the plot fails if the number of clusters used in the model is not among the values you pass to test. So e.g. c(seq(5, 25, by = 3), 25) will work in your case, or you could adapt the FlowSOM parameters so that the number of metaclusters is one of the values in your list (e.g. 20 or 23). If you are only interested in the CV values themselves and not in the plot, you could also set plot to FALSE, in which case it will not run into the problematic code.
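As a minimal sketch of the first workaround, the snippet below makes sure the model's metacluster count is part of the values passed to testCV. The helper `with_model_nclus` and the variable `n_meta_model` are hypothetical names used only for illustration (they are not part of CytoNorm), and the value 25 is taken from the example above. The testCV calls are left commented out because they need a fitted `fsom` object.

```r
# Hypothetical helper, not part of CytoNorm: add the number of metaclusters
# used in the model to the values to test, then sort and de-duplicate.
with_model_nclus <- function(cluster_values, n_meta_model) {
  sort(unique(c(cluster_values, n_meta_model)))
}

# Assumption: the model in this thread was built with 25 metaclusters.
n_meta_model <- 25

vals <- with_model_nclus(seq(5, 25, by = 3), n_meta_model)
print(vals)

# Workaround 1: test a list of values that contains the model's cluster count.
# cvs <- testCV(fsom, cluster_values = vals)

# Workaround 2: skip the overview figure altogether.
# cvs <- testCV(fsom, cluster_values = seq(5, 25, by = 3), plot = FALSE)
```

This would also explain why seq(5, 25, by = 2) ran without error: that sequence already contains 25.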

I know this presents just workarounds for now, but I hope this already helps a bit.

All the best, Sofie

emmanuelaaaaa commented 4 years ago

Thanks Sofie, that's very helpful! Just a follow-up clarification regarding the plot: you mentioned that underneath the CV values you get "the different metacluster percentages for each of the files". Is that the percentage of cells in each metacluster per batch? Best, Emma

ghost commented 3 years ago

Hello @SofieVG @jacobpwagner, I am getting the error "Error in t.default(data) : argument is not a matrix" when I use the testCV function. How do I resolve it?

Thank you

SofieVG commented 3 years ago

@emmanuelaaaaa Hi Emma, apologies for the delay in replying, but this is indeed what I meant. So the percentages over all metaclusters add up to 100% for each batch. This can help to estimate whether a potentially large CV is due to a cluster being very small, or whether it reflects a change you would find problematic.
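To illustrate the point about small clusters, here is a toy example in base R. The cell counts are invented, and the CV is computed as standard deviation divided by mean across batches, which is an assumption about the definition rather than a copy of the testCV code.

```r
# Invented cell counts per metacluster (columns) and batch (rows).
counts <- rbind(
  batch1 = c(mc1 = 480, mc2 = 500, mc3 = 20),
  batch2 = c(mc1 = 500, mc2 = 490, mc3 = 10)
)

# Percentage of cells in each metacluster, computed within each batch,
# so every row adds up to 100.
pct <- 100 * prop.table(counts, margin = 1)
print(rowSums(pct))

# Coefficient of variation per metacluster across batches (sd / mean).
cv <- apply(pct, 2, sd) / colMeans(pct)
print(round(cv, 3))

# The small cluster has the largest CV, although its percentage differs
# by only one percentage point between the two batches.
print(names(which.max(cv)))
```

In this toy data mc3 holds 2% and 1% of the cells, so a one-point shift gives a large CV, while a two-point shift in mc1 (48% versus 50%) barely registers.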

@Rashmi-pixel I think this might be due to you having an updated version of the FlowSOM package but not the latest version of the CytoNorm package. Please ensure you have the latest version installed. If the problem still occurs, feel free to open a new issue.
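A quick way to check which versions are installed is sketched below. The helper `installed_version` is a hypothetical name for illustration, and the commented install line assumes that devtools is available and that the package is installed from the saeyslab/CytoNorm GitHub repository.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

versions <- vapply(c("FlowSOM", "CytoNorm"), installed_version, character(1))
print(versions)

# Assumption: update CytoNorm from its GitHub repository using devtools.
# devtools::install_github("saeyslab/CytoNorm")
```

Restarting the R session after updating helps ensure the new version is the one that gets loaded.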