baj12 commented 4 years ago

problem:

` library(flowWorkspace)

path <- system.file("extdata",package="flowWorkspaceData") gs_path <- list.files(path, pattern = "gs_manual",full = TRUE) gs <- load_gs(gs_path)

popDat = gh_pop_get_data(gs[[1]], "CD3+") dat = exprs(popDat) mm = matrix(ncol=1, 1:nrow(popDat), dimnames = list(NULL,"idx")) cf_append_cols(popDat, mm) dat2 = exprs(popDat) `

=>

` HDF5-DIAG: Error detected in HDF5 (1.10.6) thread 0:

000: H5Dio.c line 185 in H5Dread(): could not get a validated dataspace from file_space_id

major: Invalid arguments to routine
minor: Bad value

001: H5S.c line 255 in H5S_get_validated_dataspace(): selection + offset not within extent

major: Dataspace
minor: Out of range

Error in cf_getData(object@pointer) : c++ exception (unknown reason)

`

sessionInfo: `

sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Mojave 10.14.6

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] ggcyto_1.16.0 ncdfFlow_2.34.0 BH_1.72.0-3 RcppArmadillo_0.9.900.2.0 flowCore_2.1.2
[6] ggplot2_3.3.2 CytoML_2.1.11 flowWorkspace_4.1.9

loaded via a namespace (and not attached): [1] tidyselect_1.1.0 xfun_0.16 purrr_0.3.4 lattice_0.20-41 colorspace_1.4-1 vctrs_0.3.4
[7] generics_0.0.2 stats4_4.0.2 yaml_2.2.1 base64enc_0.1-3 XML_3.99-0.5 RBGL_1.64.0
[13] rlang_0.4.7 hexbin_1.28.1 pillar_1.4.6 withr_2.2.0 glue_1.4.2 aws.s3_0.3.21
[19] Rgraphviz_2.32.0 BiocGenerics_0.34.0 RColorBrewer_1.1-2 matrixStats_0.56.0 jpeg_0.1-8.1 lifecycle_0.2.0
[25] plyr_1.8.6 zlibbioc_1.34.0 RProtoBufLib_2.1.0 munsell_0.5.0 gtable_0.3.0 cytolib_2.1.18
[31] latticeExtra_0.6-29 Biobase_2.48.0 knitr_1.29 parallel_4.0.2 curl_4.3 Rcpp_1.0.5
[37] scales_1.1.1 S4Vectors_0.26.1 jsonlite_1.7.0 RcppParallel_5.0.2 graph_1.66.0 gridExtra_2.3
[43] png_0.1-7 digest_0.6.25 dplyr_1.0.2 grid_4.0.2 tools_4.0.2 magrittr_1.5
[49] tibble_3.0.3 crayon_1.3.4 aws.signature_0.6.0 pkgconfig_2.0.3 ellipsis_0.3.1 data.table_1.13.0
[55] xml2_1.3.2 httr_1.4.2 rstudioapi_0.11 R6_2.4.1 compiler_4.0.2
`

Description: I am in the process of my old R code around FlowSOM to work with flowWorkspace and the other packages that you guys are developing. So this is probably the first of many such messages.

The data I am looking at has a big gap in the time axis, which creates a "hole" in the plots. I would like to create a new column with simply the index of the rows (ordered by the time variable if possible). To do so I wanted to add a column using cf_append_cols but then I ran into the problem above. I.e. I am creating a simple matrix and would like to add it, pretty much the same way it is described in the Help example.

Any help is very much appreciated.

Best,

Bernd

baj12 commented 4 years ago

It seems I have to do a deep copy using realize_view:

` pop2Dat = realize_view(popDat, tempfile(pattern="popData"))

cf_append_cols(pop2Dat, mm)

exprs(popDat) `

works. Is this the way to go? Thx for the clarification.

B

jacobpwagner commented 4 years ago

Correct, cf_append_cols will directly modify the cytoframe passed to it, so depending on your use it is probably best to make a deep copy first as you mention. That said, I am still looking in to the source of your error and will update this issue with what I find.

jacobpwagner commented 4 years ago

@baj12 , your issue actually exposed an issue with read-only protection of cytoframes with on-disk backends, so thanks for that. That is fixed by https://github.com/RGLab/cytolib/commit/5afbec8cabdf012b08f19857d440ed139f0c5d82, so now your example will appropriately be blocked from modifying the data in flowWorkspaceData.

> library(flowCore)
> library(flowWorkspace)
> 
> path <- system.file("extdata",package="flowWorkspaceData")
> gs_path <- list.files(path, pattern = "gs_manual",full = TRUE)
> gs <- load_gs(gs_path)
> 
> popDat <- gh_pop_get_data(gs[[1]], "CD3+")
> dat <- exprs(popDat)
> mm <- matrix(ncol=1, 1:nrow(popDat), dimnames = list(NULL,"idx"))
> # Now appropriately guarded
> cf_append_cols(popDat, mm)
 Error in append_cols(cf@pointer, colnames(cols), cols) : 
  Can't write to the read-only H5CytoFrame object!

You are right that to append columns to a cytoframe, you will have to do one of two things: 1) realize_view to get a deep copy 2) If you are sure it is okay to directly modify the on-disk backend files for a cytoframe, you can use cf_unlock to remove read-only protection on it or cs_unlock to remove read-only protection for the entire cytoset.

Here is an example of how to use the second approach

> library(flowCore)
> library(flowWorkspace)
As part of improvements to flowWorkspace, some behavior of
GatingSet objects has changed. For details, please read the section
titled "The cytoframe and cytoset classes" in the package vignette:

  vignette("flowWorkspace-Introduction", "flowWorkspace")
> fcs_files <- list.files(system.file("extdata", package = "flowWorkspaceData"), pattern = "CytoTrol", full.names = TRUE)
> 
> cs_tmp <- load_cytoset_from_fcs(fcs_files)
> 
> # Simulate a saved cytoset that you might not want to automatically alter
> tmp <- tempfile()
> save_cytoset(cs_tmp, tmp)
Done
To reload it, use 'load_cytoset' function

> 
> cs <- load_cytoset(tmp)
> mm <- matrix(ncol=1, 1:nrow(cs[[1]]), dimnames = list(NULL,"idx"))
> # This will be guarded and appropriately error out
> cf_append_cols(cs[[1]], mm)
 Error in append_cols(cf@pointer, colnames(cols), cols) : 
  Can't write to the read-only H5CytoFrame object! > 
> # Explicitly remove read-only guards on entire cytoset
> cs_unlock(cs)
> 
> # Now this should succeed
> cf_append_cols(cs[[1]], mm)
> dat <- exprs(cs[[1]])
> head(dat)
         FSC-A  FSC-H    FSC-W     SSC-A   B710-A   R660-A    R780-A    V450-A   V545-A   G560-A   G780-A Time idx
[1,] 140733.05 133376 69150.98  91113.96 22311.24 35576.07  14302.16 16232.649  7644.65  4113.60 12672.00  0.2   1
[2,]  26195.32  26207 65506.79  10115.28     5.04   447.93    682.56    43.700    77.90   -91.20    18.24  0.4   2
[3,]  64294.02  51594 81667.89 174620.03   371.28   851.62    -66.36   335.350   971.85   273.60   271.68  0.6   3
[4,] 128393.87 103613 81210.08 150625.44  1494.36  5672.20   2979.09  1492.450 28790.70   771.84   988.80  0.6   4
[5,] 127717.88 119616 69974.92  76954.91  2545.20  2272.83 124635.93  8608.899  4190.45 14306.88 58977.60  0.7   5
[6,] 134347.02 125651 70071.60  70116.48 23052.96  1758.54   5281.15  4849.750  2859.50  2249.28  1560.96  0.7   6

baj12 commented 4 years ago

This seems to work with a cytoset/cytoframe, but what about the GatingSet and GatingHierachy? Here, I get the following error message:

> cf_unlock(gs2[[1]])
Error in cf_unlock(gs2[[1]]) : is(cf, "cytoframe") is not TRUE

gh_pop_get_data returns a cytoframe, but somehow this pointer is then different from the pointer in the GatingHierachy.

Maybe to understand better what I really want, here my two problems:

the Time variable has a big gap in the data (don't ask why, please), but I still want to plot some markers against the time for quality assurance. Thus, I wanted to create an index variable.
I am going to run FlowSOM on a cell type (e.g. CD3+) from the manual gating and then associate the resulting clusters/SOM nodes with the individual cells. To export them back to FlowJo or to do the comparisons to the manual gating (still need to figure this one out, too).

Thus, I believe I need to work with gatingSet/GatingHierachie and not with the cytoset/frame classes.

Funny enough, when I use lapply, I don't get an error message, but when trying to access the data I get the following error:

> exprs(popDat)
HDF5-DIAG: Error detected in HDF5 (1.10.6) thread 0:
  #000: H5Dio.c line 185 in H5Dread(): could not get a validated dataspace from file_space_id
    major: Invalid arguments to routine
    minor: Bad value
  #001: H5S.c line 255 in H5S_get_validated_dataspace(): selection + offset not within extent
    major: Dataspace
    minor: Out of range
Error in cf_getData(object@pointer) : c++ exception (unknown reason)

(this is all with cytolib_2.1.18 (master, downloaded a few hours after your patch)

DillonHammill commented 4 years ago

Just pointing out that you can bypass the need for exporting to flowJo if you use CytoExploreR to do your manual gating: https://dillonhammill.github.io/CytoExploreR/

jacobpwagner commented 4 years ago

@baj12 , first, you are right about the GatingHierarchy/GatingSet distinction from cytoframe/cytoset. cytoframe/cytoset keep track of the core data matrix and its metadata and are essentially the successors to flowFrame/flowSet. GatingHierarchy and GatingSet add in automatically keeping track of compensation, transformation, and gating in one data class. That cf_unlock/cs_unlock mechanism is there because the cytoframe class carries an internal read-only flag for the on-disk backends to prevent accidental changes being made to the data matrix. Anyway, your call fails because you are making it on a GatingHierarchy (which you can think of as kind of wrapping around a cytoframe):

> cf_unlock(gs2[[1]])
Error in cf_unlock(gs2[[1]]) : is(cf, "cytoframe") is not TRUE

If you want to remove the read-only lock on the cytoframe, you need to call it on the underlying data class (cytoframe or cytoset). These should both succeed.

cf_unlock(gh_pop_get_data(gs[[1]]))
cs_unlock(gs_pop_get_data(gs))

If you want to avoid this entirely, you can pass the backend_readonly=FALSE argument when you call load_gs or load_cytoset to load a dataset from disk.

jacobpwagner commented 4 years ago

Regarding the lack of error when using lapply, can you give me a reproducible example or code snippet so I can see what you mean? Also, keep in mind that if you executed your initial snippet before that permissions patch, your flowWorkspaceData backend files are altered/corrupted, so you should devtools::install_github("RGLab/flowWorkspaceData", force = TRUE) first to get a blank slate there.

jacobpwagner commented 4 years ago

Trying to bypass it with lapply as you mentioned, but I'm still seeing the appropriate guard on the cytoframe:

> library(flowCore)
> library(flowWorkspace)
> fcs_files <- list.files(system.file("extdata", package = "flowWorkspaceData"), pattern = "CytoTrol", full.names = TRUE)
> cs_tmp <- load_cytoset_from_fcs(fcs_files)
> # Simulate a saved cytoset that you might not want to automatically alter
> tmp <- tempfile()
> save_cytoset(cs_tmp, tmp)
Done
To reload it, use 'load_cytoset' function

> cs <- load_cytoset(tmp)
> mm <- matrix(ncol=1, 1:nrow(cs[[1]]), dimnames = list(NULL,"idx"))
> lapply(c(1), function(idx){
+   cf_append_cols(cs[[idx]], mm)
+ })
 Error in append_cols(cf@pointer, colnames(cols), cols) : 
  Can't write to the read-only H5CytoFrame object!

baj12 commented 4 years ago

Here is my min. example: you don't need the unlock to get this error. As you can see I already cloned the data...

devtools::install_github("RGLab/flowWorkspaceData", force = TRUE)
library(flowCore)
library(flowWorkspace)
library(openCyto)
library(ggcyto)
gs <- load_gs(system.file("extdata/gs_DC_auto", package = "flowWorkspaceData"))

gs_get_pop_paths(gs, path = "auto")
gs2 <- gs_clone(gs)

lapply(gs2, FUN = function(x){
  popDat = gh_pop_get_data(x, "Live")
  cf_unlock(popDat)
  # popDat2 = realize_view(popDat)
  mm = matrix(ncol=1, 1:nrow(popDat), dimnames = list(NULL,"idx"))
  cf_append_cols(popDat, mm)
})

exprs(gh_pop_get_data(gs2[[1]], "Live"))

jacobpwagner commented 4 years ago

I see, so the error is not coming from appending columns to a cytoframe itself, but rather from the fact that it is already part of a GatingSet. The issue then arises from gh_pop_get_data because the GatingSet does not know about the newly added column. Let me think about how best to handle your situation and work on a fix.

DillonHammill commented 4 years ago

@jacobpwagner, just wanted to confirm whether the following operations are equivalent (i.e. h5_readonly = TRUE is the same as cs_unlock())?

# 1. Turn off readonly flag upon loading
gs <- load_gs("path_to_gs",
h5_readonly = FALSE)

# 2. Load and then turn off readonly flag
gs <- load_gs("path_to_gs")
cs <- gs_cyto_data(gs)
cs_unlock(cs)
gs_cyto_data(gs) <- cs

Oh I see, looks like we should use backend_readonly = TRUE instead.

mikejiang commented 4 years ago

Yes, they are equivalent. And cs_unlock does unlocking in place, so you don't need to assign it back.

DillonHammill commented 4 years ago

Thanks @mikejiang, just wanted to make sure whether unlocking the cytoset also unlocked it for associated GatingSet or whether these were being treated independently.

mikejiang commented 4 years ago

gs tree structure is not controlled by cs lock. And since gs itself is not disk based, so it is safe to do any mutable operations to it unless you explicitly sync to disk by save_gs

DillonHammill commented 4 years ago

Thanks for the clarification @mikejiang.

baj12 commented 4 years ago

Thanks for the effort you put into this.

My first problem of adding a new data column to a GatingSet can be "solved" like this:

save_cytoset(gs, "testcyto")
cs = load_cytoset("testcyto")
cs_unlock(cs)
lapply(1:length(cs), FUN = function(x){
  mm = matrix(ncol=1, 1:nrow(cs[[x]]), dimnames = list(NULL,"idx"))
  cf_append_cols(cs[[x]], mm)
})

Kind of defeats the purpose of the gating set, but I am not modifying the original data, I still get the plots I want to get.

jacobpwagner commented 4 years ago

@baj12, just FYI, for the purposes of just getting indices for each row/event, there are already gh_pop_get_indices and gh_pop_set_indices.

> library(flowCore)
> library(flowWorkspace)
> 
> gs <- load_gs(system.file("extdata", "gs_manual", package = "flowWorkspaceData"))
> 
> # boolean mask of events in any given gate, including ungated ("root")
> head(gh_pop_get_indices(gs[[1]], "root"))
[1] TRUE TRUE TRUE TRUE TRUE TRUE
> 
> # Which can be easily converted to numerical indices
> head(which(gh_pop_get_indices(gs[[1]], "root")))
[1] 1 2 3 4 5 6
> 
> # You can pull out the indices for any subpopulation
> head(which(gh_pop_get_indices(gs[[1]], "CD3+")))
[1]  1  5  6  9 10 15

For adding a column labeling by cluster assignment from FlowSOM, you'll probably just have to make sure every row in the underlying cytoframe has some index assigned. So for the rows not in the target population, you'd need a dummy index for no assignment or something.

baj12 commented 4 years ago

gh_pop_set_indices modifies an existing population. I am not allowed to create one. This is how I think of the internal representation of the GH. Though this might also be calculated on the fly (not sure which it is).

jacobpwagner commented 4 years ago

@baj12 . First, my apologies, I found and patched one other issue, so please re-install cytolib before trying the following.

By pointing to gh_pop_get_indices, I just meant that you can use the row indices to build your new index column by assigning the appropriate values to the appropriate subpopulation rows. See the example below:

library(flowCore)
library(flowWorkspace)
gs <- load_gs(system.file("extdata/gs_manual", package = "flowWorkspaceData"))

gs_get_pop_paths(gs, path = "auto")
gs2 <- gs_clone(gs)

cs <- cytoset(lapply(gs2, FUN = function(x){
  # This cytoframe is really a view into the original cytoframe (using the subpopulation as index)
  cf_subpop <- gh_pop_get_data(x, "CD3+")
  # The key is that you need to make sure there is a value for every row
  # in the entire cytoframe, not just the subpopulation
  cd3_pop_rows = gh_pop_get_indices(x, "CD3+")
  # Make sure every row gets some value (even if you won't end up using it)
  idx_vec <- rep(-1, nrow(x))
  # Then add the values you need for your subpopulation (time index or cluster assignment)
  idx_vec[cd3_pop_rows] <- 1:nrow(cf_subpop)
  mm = matrix(ncol=1, idx_vec, dimnames = list(NULL,"idx"))
  # This cytoframe has all of the rows
  cf_whole <- gh_pop_get_data(x)
  cf_append_cols(cf_whole, mm)
}))

gs_cyto_data(gs2) <- cs

# The subpopulation has whatever indices you gave it
head(exprs(gh_pop_get_data(gs2[[1]], "CD3+")))

# While the other rows just have the filler value
head(exprs(gh_pop_get_data(gs2[[1]])))

With output included:

> library(flowCore)
> library(flowWorkspace)
> gs <- load_gs(system.file("extdata/gs_manual", package = "flowWorkspaceData"))
> 
> gs_get_pop_paths(gs, path = "auto")
 [1] "root"            "not debris"      "singlets"        "CD3+"            "CD4"             "CD4/38- DR+"    
 [7] "CD4/38+ DR+"     "CD4/38+ DR-"     "CD4/38- DR-"     "CD4/CCR7- 45RA+" "CD4/CCR7+ 45RA+" "CD4/CCR7+ 45RA-"
[13] "CD4/CCR7- 45RA-" "CD8"             "CD8/38- DR+"     "CD8/38+ DR+"     "CD8/38+ DR-"     "CD8/38- DR-"    
[19] "CD8/CCR7- 45RA+" "CD8/CCR7+ 45RA+" "CD8/CCR7+ 45RA-" "CD8/CCR7- 45RA-" "DNT"             "DPT"            
> gs2 <- gs_clone(gs)
> 
> cs <- cytoset(lapply(gs2, FUN = function(x){
+     cf_subpop <- gh_pop_get_data(x, "CD3+")
+     cd3_pop_rows = gh_pop_get_indices(x, "CD3+")
+     idx_vec <- rep(-1, nrow(x))
+     idx_vec[cd3_pop_rows] <- 1:nrow(cf_subpop)
+     mm = matrix(ncol=1, idx_vec, dimnames = list(NULL,"idx"))
+     cf_whole <- gh_pop_get_data(x)
+     cf_append_cols(cf_whole, mm)
+ }))
> 
> gs_cyto_data(gs2) <- cs
> 
> head(exprs(gh_pop_get_data(gs2[[1]], "CD3+")))
         FSC-A  FSC-H    FSC-W    SSC-A <B710-A> <R660-A>  <R780-A> <V450-A> <V545-A> <G560-A> <G780-A>  Time idx
[1,] 140733.05 133376 69150.98 91113.96 3106.004 3302.719 2073.3540 2996.536 1795.198 2397.642 2860.906 0.002   1
[2,] 127717.88 119616 69974.92 76954.91 1296.811 1931.995 3769.8376 2749.284 1660.914 2883.802 3480.662 0.007   2
[3,] 134347.02 125651 70071.60 70116.48 3128.845 1834.073 1607.8027 2507.708 1727.784 2194.188 1001.111 0.007   3
[4,]  84330.34  77826 71013.20 66052.55 1465.256 1523.407 3702.2285 2657.550 1612.924 2800.220 3415.069 0.015   4
[5,] 101553.95  96602 68895.48 44620.80 2902.931 2458.440  482.8756 2700.848 1561.626 2759.033 3263.364 0.015   5
[6,]  87721.18  79682 72147.98 66181.08 2928.725 1382.240 1510.5111 2500.461 1501.881 2223.328 1044.804 0.020   6
> 
> head(exprs(gh_pop_get_data(gs2[[1]])))
         FSC-A  FSC-H    FSC-W     SSC-A  <B710-A> <R660-A>  <R780-A>  <V450-A>  <V545-A>  <G560-A>  <G780-A>  Time
[1,] 140733.05 133376 69150.98  91113.96 3106.0037 3302.719 2073.3540 2996.5356 1795.1975 2397.6416 2860.9060 0.002
[2,]  26195.32  26207 65506.79  10115.28  696.8729 1394.616 1401.4686  637.6926  749.2054  173.6164  672.4402 0.004
[3,]  64294.02  51594 81667.89 174620.03 1285.8862 1710.609  720.2574 1413.2173 1766.4376 1258.1320 1191.3597 0.006
[4,] 128393.87 103613 81210.08 150625.44 1925.9648 2555.242 1872.6464 1602.2124 3219.9639 1725.0957 1711.5504 0.006
[5,] 127717.88 119616 69974.92  76954.91 1296.8107 1931.995 3769.8376 2749.2837 1660.9139 2883.8020 3480.6624 0.007
[6,] 134347.02 125651 70071.60  70116.48 3128.8455 1834.073 1607.8027 2507.7080 1727.7843 2194.1882 1001.1113 0.007
     idx
[1,]   1
[2,]  -1
[3,]  -1
[4,]  -1
[5,]   2
[6,]   3

This workaround is necessary because each GatingHierarchy uses one underlying data matrix. Subpopulations determined by gating are then represented by indexed views in to that data matrix. So, in order to properly add a column that will be seen by the GatingHierarchy (or its containing GatingSet) moving forward, you need to add a complete column to the data matrix.

cf_append_columns was added with the main use case in mind being appending a column to the entire data matrix, usually before any gating operations have been performed. This subpopulation column addition within the structure of the original GatingSet is sort of a new use case.

I should also mention that you could just make a copy of the cytoframe or cytoset for your subopulation and append to that. If you will not need the GatingSet for future steps and only care about the target subpopulation, that's a fine option:

> library(flowCore)
> library(flowWorkspace)
> gs <- load_gs(system.file("extdata/gs_manual", package = "flowWorkspaceData"))
> 
> gs_get_pop_paths(gs, path = "auto")
 [1] "root"            "not debris"      "singlets"        "CD3+"            "CD4"             "CD4/38- DR+"    
 [7] "CD4/38+ DR+"     "CD4/38+ DR-"     "CD4/38- DR-"     "CD4/CCR7- 45RA+" "CD4/CCR7+ 45RA+" "CD4/CCR7+ 45RA-"
[13] "CD4/CCR7- 45RA-" "CD8"             "CD8/38- DR+"     "CD8/38+ DR+"     "CD8/38+ DR-"     "CD8/38- DR-"    
[19] "CD8/CCR7- 45RA+" "CD8/CCR7+ 45RA+" "CD8/CCR7+ 45RA-" "CD8/CCR7- 45RA-" "DNT"             "DPT"            
> gs2 <- gs_clone(gs)
> 
> # Just pull the data you need
> cf <- realize_view(gh_pop_get_data(gs2[[1]]))
> 
> # And append to that
> mm = matrix(ncol=1, 1:nrow(cf), dimnames = list(NULL,"idx"))
> cf_append_cols(cf, mm)
> head(exprs(cf))
         FSC-A  FSC-H    FSC-W     SSC-A  <B710-A> <R660-A>  <R780-A>  <V450-A>  <V545-A>  <G560-A>  <G780-A>  Time
[1,] 140733.05 133376 69150.98  91113.96 3106.0037 3302.719 2073.3540 2996.5356 1795.1975 2397.6416 2860.9060 0.002
[2,]  26195.32  26207 65506.79  10115.28  696.8729 1394.616 1401.4686  637.6926  749.2054  173.6164  672.4402 0.004
[3,]  64294.02  51594 81667.89 174620.03 1285.8862 1710.609  720.2574 1413.2173 1766.4376 1258.1320 1191.3597 0.006
[4,] 128393.87 103613 81210.08 150625.44 1925.9648 2555.242 1872.6464 1602.2124 3219.9639 1725.0957 1711.5504 0.006
[5,] 127717.88 119616 69974.92  76954.91 1296.8107 1931.995 3769.8376 2749.2837 1660.9139 2883.8020 3480.6624 0.007
[6,] 134347.02 125651 70071.60  70116.48 3128.8455 1834.073 1607.8027 2507.7080 1727.7843 2194.1882 1001.1113 0.007
     idx
[1,]   1
[2,]   2
[3,]   3
[4,]   4
[5,]   5
[6,]   6

That way you can use the GatingSet for the transformation and gating operations, but you can always pull out a copy of whatever subset of data, which is similar to getting a legacy R-level flowFrame out from a subpopulation.

jacobpwagner commented 4 years ago

@baj12 , with https://github.com/RGLab/flowWorkspace/commit/b48c8820f86328fd640271b5ab9db32a5b89c274 I added a guard to force the user to explicitly call realize_view on a subsetted cytoframe before appending columns. This means that for the workaround I provided you, this line:

cf_whole <- gh_pop_get_data(x)

will need to change to this

cf_whole <- realize_view(gh_pop_get_data(x))

RGLab / flowWorkspace

cf_append_cols HDF5-DIAG error #346

000: H5Dio.c line 185 in H5Dread(): could not get a validated dataspace from file_space_id

001: H5S.c line 255 in H5S_get_validated_dataspace(): selection + offset not within extent