RGLab / flowCore

Core flow cytometry infrastructure

How to reorder columns for flowSets #222

Closed MayaCyTOFnewbie closed 2 years ago

MayaCyTOFnewbie commented 2 years ago

Hi! First of all, I want to thank you for this great package!

Describe the bug
I am working on a student's flow cytometry FCS files, where the order of the parameters is not the same in all the files. As a result, when I try to load all the files into one flowSet, I get this error: "fcs doesn't have the identical colnames as the other samples!"

I then divided the files into groups with matching colnames.

My question is: how can I reorder the columns so that the colnames match?

To Reproduce
Here is what I did:

fs = read.flowSet(files, transformation = FALSE, truncate_max_range = FALSE)

000003_spleen_1_040.fcs doesn't have the identical colnames as the other samples!
000007_spleen_2_041.fcs doesn't have the identical colnames as the other samples!
000011_spleen_3_042.fcs doesn't have the identical colnames as the other samples!
000015_spleen_4_043.fcs doesn't have the identical colnames as the other samples!
000017_spleen_5_044.fcs doesn't have the identical colnames as the other samples!
Error in validObject(.Object) : invalid class “flowSet” object: Some items identified in the data environment either have the wrong dimension or type.

iter_1 = file.path(raw_data_dir, "itr1")

files <- list.files(iter_1, pattern = "\\.fcs$", full.names = TRUE)

fs1 = read.flowSet(files, transformation = FALSE, truncate_max_range = FALSE)
colnames(fs1)
 [1] "FSC-A" "FSC-H" "FSC-W" "SSC-A" "SSC-H" "SSC-W" "R 780/60-A" "V 450/50-A" "YG 780/60-A" "R 730/45-A" "V 525/50-A"
[12] "B 530/30-A" "YG 586/15-A" "R 670/30-A" "B 685/35-A" "V 710/50-A" "Time"

# add files from the second batch to a new flowSet

iter_2 = file.path(raw_data_dir, "itr2")

# define the full paths to the files

files <- list.files(iter_2, pattern = "\\.fcs$", full.names = TRUE)
fs2 = read.flowSet(files, transformation = FALSE, truncate_max_range = FALSE)
colnames(fs2)
 [1] "FSC-A" "FSC-H" "FSC-W" "SSC-A" "SSC-H" "SSC-W" "V 450/50-A" "R 780/60-A" "YG 780/60-A" "R 730/45-A" "V 525/50-A"
[12] "B 530/30-A" "YG 586/15-A" "R 670/30-A" "B 685/35-A" "V 710/50-A" "Time"

colnames(fs1) == colnames(fs2)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
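To see exactly which channels are out of place, the comparison above can be condensed into the mismatching positions and a reordering index. This is only a minimal base-R sketch built from the two column orders posted in this issue; the variable names `cols1`, `cols2`, `mismatch` and `perm` are illustrative and not part of flowCore.

```r
# Column order copied from colnames(fs1) above.
cols1 <- c("FSC-A", "FSC-H", "FSC-W", "SSC-A", "SSC-H", "SSC-W",
           "R 780/60-A", "V 450/50-A", "YG 780/60-A", "R 730/45-A",
           "V 525/50-A", "B 530/30-A", "YG 586/15-A", "R 670/30-A",
           "B 685/35-A", "V 710/50-A", "Time")

# colnames(fs2) is the same vector with channels 7 and 8 swapped.
cols2 <- cols1
cols2[7:8] <- cols1[8:7]

# Both batches contain the same set of channels, so reordering is enough.
stopifnot(setequal(cols1, cols2))

# Positions where the two orders disagree.
mismatch <- which(cols1 != cols2)
mismatch                       # 7 8

# Index vector that puts cols2 into the order of cols1.
perm <- match(cols1, cols2)
identical(cols2[perm], cols1)  # TRUE
```

If `setequal()` fails on real data, the files differ in which channels they contain and not only in their order, and reordering alone would not be sufficient.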

sessionInfo():
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Matrix products: default

locale:
[1] C

attached base packages:
[1] grid tcltk stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] circlize_0.4.13 ComplexHeatmap_2.9.3 data.table_1.14.2 premessa_0.2.6 pals_1.7
[6] CytoML_2.4.0 flowWorkspace_4.4.0 ggpubr_0.4.0 RColorBrewer_1.1-2 forcats_0.5.1
[11] dplyr_1.0.7 purrr_0.3.4 readr_2.0.2 tidyr_1.1.4 tibble_3.1.5
[16] tidyverse_1.3.1 uwot_0.1.10 Matrix_1.3-4 cytofclean_1.0.3 scales_1.1.1
[21] cowplot_1.1.1 tcltk2_1.2-11 pheatmap_1.0.12 cytutils_0.1.0 stringr_1.4.0
[26] flowCut_1.3.1 flowAI_1.23.0 CytoNorm_0.0.6 remotes_2.4.1 ggplot2_3.3.5
[31] FlowSOM_2.1.24 igraph_1.2.7 flowCore_2.4.0 flowDensity_1.27.2 CATALYST_1.17.3
[36] SingleCellExperiment_1.15.2 SummarizedExperiment_1.23.4 Biobase_2.52.0 GenomicRanges_1.45.0 GenomeInfoDb_1.29.5
[41] IRanges_2.27.2 S4Vectors_0.30.0 BiocGenerics_0.39.2 MatrixGenerics_1.5.4 matrixStats_0.61.0
[46] devtools_2.4.2 usethis_2.1.3 FlowRepositoryR_1.23.0

loaded via a namespace (and not attached):
[1] scattermore_0.7 knitr_1.36 irlba_2.3.3 multcomp_1.4-17 DelayedArray_0.19.1
[6] RCurl_1.98-1.5 doParallel_1.0.16 generics_0.1.1 ScaledMatrix_1.1.0 callr_3.7.0
[11] TH.data_1.1-0 proxy_0.4-26 ggpointdensity_0.1.0 tzdb_0.1.2 lubridate_1.8.0
[16] xml2_1.3.2 assertthat_0.2.1 viridis_0.6.2 xfun_0.27 hms_1.1.1
[21] evaluate_0.14 fansi_0.5.0 dbplyr_2.1.1 caTools_1.18.2 readxl_1.3.1
[26] Rgraphviz_2.36.0 DBI_1.1.1 ellipsis_0.3.2 RSpectra_0.16-0 ggcyto_1.21.0
[31] ggnewscale_0.4.5 backports_1.3.0 cytolib_2.5.3 RSEIS_4.0-3 RcppParallel_5.1.4
[36] sparseMatrixStats_1.5.3 vctrs_0.3.8 Cairo_1.5-12.2 GEOmap_2.4-4 abind_1.4-5
[41] cachem_1.0.6 withr_2.4.2 ggforce_0.3.3 aws.signature_0.6.0 RPMG_2.2-3
[46] prettyunits_1.1.1 splancs_2.01-42 cluster_2.1.2 dotCall64_1.0-1 crayon_1.4.1
[51] drc_3.0-1 labeling_0.4.2 pkgconfig_2.0.3 tweenr_1.0.2 vipor_0.4.5
[56] pkgload_1.2.3 changepoint_2.2.2 rlang_0.4.11 lifecycle_1.0.1 sandwich_3.0-1
[61] modelr_0.1.8 rsvd_1.0.5 dichromat_2.0-0 cellranger_1.1.0 rprojroot_2.0.2
[66] polyclip_1.10-0 graph_1.70.0 carData_3.0-4 zoo_1.8-9 reprex_2.0.1
[71] base64enc_0.1-3 beeswarm_0.4.0 ggridges_0.5.3 GlobalOptions_0.1.2 processx_3.5.2
[76] png_0.1-7 viridisLite_0.4.0 rjson_0.2.20 bitops_1.0-7 ConsensusClusterPlus_1.57.0
[81] KernSmooth_2.23-20 spam_2.7-0 DelayedMatrixStats_1.15.4 shape_1.4.6 jpeg_0.1-9
[86] rstatix_0.7.0 ggsignif_0.6.3 aws.s3_0.3.21 beachmat_2.9.1 memoise_2.0.0
[91] magrittr_2.0.1 plyr_1.8.6 hexbin_1.28.2 gplots_3.1.1 zlibbioc_1.38.0
[96] compiler_4.1.1 RFOC_3.4-6 plotrix_3.8-2 clue_0.3-60 cli_3.0.1
[101] XVector_0.33.0 ncdfFlow_2.38.0 ps_1.6.0 MASS_7.3-54 tidyselect_1.1.1
[106] stringi_1.7.5 RProtoBufLib_2.5.1 yaml_2.2.1 BiocSingular_1.9.1 latticeExtra_0.6-29
[111] ggrepel_0.9.1 tools_4.1.1 parallel_4.1.1 rio_0.5.27 rstudioapi_0.13
[116] foreach_1.5.1 foreign_0.8-81 gridExtra_2.3 MBA_0.0-9 farver_2.1.0
[121] Rtsne_0.15 rgeos_0.5-8 digest_0.6.28 BiocManager_1.30.16 Rcpp_1.0.7
[126] car_3.0-11 broom_0.7.9 scuttle_1.3.1 RcppAnnoy_0.0.19 IDPmisc_1.1.20
[131] httr_1.4.2 colorspace_2.0-2 rvest_1.0.2 XML_3.99-0.8 fs_1.5.0
[136] splines_4.1.1 fields_12.5 RBGL_1.68.0 scater_1.21.3 sp_1.4-5
[141] mapproj_1.2.7 sessioninfo_1.1.1 jsonlite_1.7.2 testthat_3.1.0 R6_2.5.1
[146] pillar_1.6.4 htmltools_0.5.2 nnls_1.4 glue_1.4.2 fastmap_1.1.0
[151] BiocParallel_1.27.4 BiocNeighbors_1.11.0 class_7.3-19 codetools_0.2-18 maps_3.4.0
[156] pkgbuild_1.2.0 mvtnorm_1.1-3 utf8_1.2.2 lattice_0.20-45 flowViz_1.57.2
[161] Rwave_2.6-0 curl_4.3.2 ggbeeswarm_0.6.0 colorRamps_2.3 gtools_3.9.2
[166] zip_2.2.0 openxlsx_4.2.4 survival_3.2-13 rmarkdown_2.11 desc_1.4.0
[171] munsell_0.5.0 e1071_1.7-9 GetoptLong_1.0.5 GenomeInfoDbData_1.2.6 iterators_1.0.13
[176] haven_2.4.3 reshape2_1.4.4 gtable_0.3.0

Many thanks!

SamGG commented 2 years ago

Hi, did you look at the answer in https://github.com/RGLab/flowCore/issues/219? Best

MayaCyTOFnewbie commented 2 years ago

Hi!

I think I managed :) It was much simpler than I thought. I reordered the columns of the second flowSet to match the colnames of the first one, this way:

fs_main_names = colnames(fs1)

fs2b = fs2[,fs_main_names]

I hope this was the right way to go.
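For readers who want to try the same reordering without the original files, here is a small self-contained sketch. It assumes flowCore is installed and uses toy flowFrames built from random matrices in place of the real FCS files; the sample names `batch1_a` and `batch2_a` and the three-channel layout are hypothetical. Combining the two sets with `rbind2()` is an extra step that the post itself does not show.

```r
library(flowCore)

# Toy data: the same three channels, stored in a different order per batch.
set.seed(1)
m1 <- matrix(rnorm(30), ncol = 3,
             dimnames = list(NULL, c("FSC-A", "R 780/60-A", "V 450/50-A")))
m2 <- m1[, c("FSC-A", "V 450/50-A", "R 780/60-A")]

# One flowSet per batch (sample names are made up for this example).
fs1 <- flowSet(list(batch1_a = flowFrame(m1)))
fs2 <- flowSet(list(batch2_a = flowFrame(m2)))

# Use the first batch as the reference column order.
fs_main_names <- colnames(fs1)

# Guard: only reorder when both batches hold the same set of channels.
stopifnot(setequal(fs_main_names, colnames(fs2)))

# Reorder the columns of the second batch by name, as in the post above.
fs2b <- fs2[, fs_main_names]

# With identical colnames the two batches can be combined into one flowSet.
fs_all <- rbind2(fs1, fs2b)
```

With real data, `fs1` and `fs2` would come from `read.flowSet()` as shown earlier in the thread; selecting columns by name keeps each channel's data attached to its own name while changing only the column order.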

MayaCyTOFnewbie commented 2 years ago

OK, sorry, I missed issue #219 before I asked.

It worked!

Thanks again!

Maya

SamGG commented 2 years ago

Thanks for your feedback. Happy you succeeded. Closed issues are not somehow hidden. Best.