RGLab / flowWorkspace

GNU Affero General Public License v3.0

convert_legacy_gs: Not a valid GatingSet archiving folder! #350

Closed gen0mic closed 3 years ago

gen0mic commented 3 years ago

Hello, I've been using these packages for a while, and I have some legacy GatingSets that I'm trying to convert to the new all-C++ structure. However, I'm getting the following error during the save_gs step: Not a valid GatingSet archiving folder!

convert_legacy_gs( from = "path_to_legacy_gs", to = "path_to_new_gs_dir" )
loading legacy archive...
saving to new archive...
Error in save_gs(gs, to, cdf = "skip") :
     Error in .cpp_saveGatingSet(gs@pointer, path = path, cdf = cdf) :
     Not a valid GatingSet archiving folder! path_to_new_gs_dir
File not recognized: path_to_new_gs_dir/long_hash_key

To explain further, it looks like the 'long_hash_key' folder that gets created contains a bunch of .h5 files (with names matching the .fcs files). I expected the command to write the updated GatingSet to the new path I created, so that I wasn't destroying my old legacy files.

Is there a proper way to set up the GatingSet archive folder that I have missed?

Let me know if I should provide you with more context about the issue! Thanks for the fantastic packages!

Here is my sessionInfo():

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] CytoML_2.0.5 flowWorkspace_4.0.6 dplyr_1.0.2

loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 plyr_1.8.6 compiler_4.0.2 pillar_1.4.6 cytolib_2.0.3 RColorBrewer_1.1-2
[7] base64enc_0.1-3 tools_4.0.2 zlibbioc_1.34.0 digest_0.6.25 jsonlite_1.7.1 gtable_0.3.0
[13] lifecycle_0.2.0 tibble_3.0.3 lattice_0.20-41 pkgconfig_2.0.3 png_0.1-7 rlang_0.4.7
[19] graph_1.66.0 rstudioapi_0.11 Rgraphviz_2.32.0 yaml_2.2.1 parallel_4.0.2 hexbin_1.28.1
[25] gridExtra_2.3 xml2_1.3.2 stringr_1.4.0 generics_0.0.2 vctrs_0.3.4 stats4_4.0.2
[31] grid_4.0.2 tidyselect_1.1.0 glue_1.4.2 data.table_1.13.0 Biobase_2.48.0 R6_2.4.1
[37] jpeg_0.1-8.1 RBGL_1.64.0 XML_3.99-0.5 latticeExtra_0.6-29 ggplot2_3.3.2 RProtoBufLib_2.0.0
[43] purrr_0.3.4 magrittr_1.5 scales_1.1.1 matrixStats_0.56.0 ellipsis_0.3.1 BiocGenerics_0.34.0
[49] colorspace_1.4-1 flowCore_2.0.1 ncdfFlow_2.34.0 stringi_1.5.3 RcppParallel_5.0.2 munsell_0.5.0
[55] crayon_1.3.4 ggcyto_1.16.0

mikejiang commented 3 years ago

First, make sure your path_to_legacy_gs contains only three files. Here is an example:

> list.files(legacy)
[1] "c4UL4QJDG7.pb"      "c4UL4QJDG7.rds"     "file56b4d447940.nc"

Second, make sure your path_to_new_gs_dir is an empty or non-existent folder before converting.

Also, try running this reproducible example to see whether it goes through:

dataDir <- system.file("extdata",package="flowWorkspaceData")
legacy <- file.path(dataDir,"/legacy_gs/v1/gs_bcell_auto")
tmp <- tempfile()
convert_legacy_gs(legacy, tmp)

Lastly, it would be helpful to see the actual, original error message.
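
The first two checks above can be sketched as a small base-R helper. This is only an illustration: `check_conversion_paths` is a hypothetical name, not a flowWorkspace function, and the expected extensions (.pb, .rds, .nc) are an assumption taken from the single example listing above.

```r
# Hypothetical helper (not part of flowWorkspace): check the two
# preconditions described above before calling convert_legacy_gs().
check_conversion_paths <- function(from, to) {
  files <- list.files(from)
  ext <- tolower(tools::file_ext(files))
  # Assumption: a legacy archive holds exactly one .pb, one .rds and one .nc file.
  legacy_ok <- length(files) == 3 && setequal(ext, c("pb", "rds", "nc"))
  # The destination must either not exist yet or be an empty directory.
  dest_ok <- !file.exists(to) ||
    (dir.exists(to) && length(list.files(to, all.files = TRUE, no.. = TRUE)) == 0)
  c(legacy_ok = legacy_ok, dest_ok = dest_ok)
}

# Usage on a mock folder that mimics the example listing (empty placeholder files).
legacy <- tempfile("legacy_gs")
dir.create(legacy)
invisible(file.create(file.path(
  legacy, c("c4UL4QJDG7.pb", "c4UL4QJDG7.rds", "file56b4d447940.nc")
)))
result <- check_conversion_paths(legacy, tempfile("new_gs"))
print(result)
```

If either element of the result is FALSE, it is probably worth fixing the folder layout before attempting the conversion.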

gen0mic commented 3 years ago

Thanks for the response.

I can confirm that all three files are present in the legacy gs folder.

I have tried empty and non-existing folders.

Finally, using the provided code produces the same error message:

dataDir <- system.file("extdata",package="flowWorkspaceData")
legacy <- file.path(dataDir,"/legacy_gs/v1/gs_bcell_auto")
tmp <- tempfile()
convert_legacy_gs(legacy, tmp)
loading legacy archive...
saving to new archive...
Error in save_gs(gs, to, cdf = "skip") : 
     Error in .cpp_saveGatingSet(gs@pointer, path = path, cdf = cdf) : 
     Not a valid GatingSet archiving folder! C:\Users\kgillesp\AppData\Local\Temp\RtmpQRqGfK\file32649c714e7
File not recognized: C:\Users\kgillesp\AppData\Local\Temp\RtmpQRqGfK\file32649c714e7\86306293-49d3-4de4-b7e3-e4ca1babcb3f

mikejiang commented 3 years ago

It seems to me that, on your Windows system, the command

    system(paste0("mv ", h5dir, "/* ", to))#mv h5 files to dest

somehow moved the entire h5dir folder (i.e. 86306293-49d3-4de4-b7e3-e4ca1babcb3f) to the destination path, instead of only its contents (i.e. the h5 files), as was supposed to happen. I'm not sure why. You can try this hack:

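tmp <- tempfile()  # scratch path passed to .load_legacy
gs <- flowWorkspace:::.load_legacy(from, tmp)  # from: path to the legacy archive
save_gs(gs, to)  # to: empty or non-existent destination folder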

which may be a little slower, but should at least get you going.
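
For illustration only, here is a sketch of how the contents of a folder could be moved without relying on an external `mv` command, using base R file functions. `move_dir_contents` is a hypothetical helper, not the package's implementation and not a confirmed fix for this issue; it only shows the intended "move the files, not the folder" behaviour.

```r
# Hypothetical helper (not part of flowWorkspace): move the files inside
# `src_dir` into `dest_dir`, leaving the `src_dir` folder itself behind.
move_dir_contents <- function(src_dir, dest_dir) {
  if (!dir.exists(dest_dir)) dir.create(dest_dir, recursive = TRUE)
  src_files <- list.files(src_dir, full.names = TRUE)
  dest_files <- file.path(dest_dir, basename(src_files))
  moved <- file.rename(src_files, dest_files)
  # file.rename can fail (e.g. across drives); fall back to copy + delete.
  for (i in which(!moved)) {
    if (file.copy(src_files[i], dest_files[i])) {
      file.remove(src_files[i])
      moved[i] <- TRUE
    }
  }
  invisible(all(moved))
}

# Usage on mock data: two empty placeholder files standing in for h5 files.
src <- tempfile("h5dir")
dest <- tempfile("dest")
dir.create(src)
invisible(file.create(file.path(src, c("a.h5", "b.h5"))))
ok <- move_dir_contents(src, dest)
print(list.files(dest))
```

After the call, the destination holds the files directly, rather than a nested copy of the source folder.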

gen0mic commented 3 years ago

Yes, I agree; I suspected it had something to do with file path handling. The workaround you provided did work, so that should do for me!

Thank you so much for your help!