Closed schienstockd closed 2 years ago
Not familiar with "image based gating" and "Anndata", so cannot comment too much, would you be able to provide an concrete example data and work flow?
We have microscopy images like these and segment them. The idea is to apply flow cytometry gating to gate cells from these images. This is also called 'Histocytometry'. The idea is to link up gating and the images. We currently do that with a combination of R/shiny, plotly and python/napari.
I've put the example data into a zip file: gating_example.zip
In this example, there are two cell types (yellow and blue) stained with two separate fluorophores. The properties of the segmented labels are saved in an Anndata
file (im.h5ad). There is an R version for this. All properties and the mean intensities of the channels are stored in X
of the Anndata
object. mean_intensity_0
is blue and mean_intensity_1
is yellow in this example.
library(anndata)
# read adata file
adata <- read_h5ad("im.h5ad")
adata$X
mean_intensity_0 mean_intensity_1 mean_intensity_2 mean_intensity_3 mean_intensity_4 bbox_min_y bbox_min_x
0 72.32076 9.358491 66.64151 0.000000000 69.188683 0 550
1 77.27631 12.013158 65.34210 0.065789476 74.210526 0 675
2 109.15758 11.278788 34.43636 0.054545455 105.078789 6 655
We convert this file into an FCS file:
library(flowCore)
# convert to FCS file
x <- adata$X
# add metadata
metadata <- list(
name = dimnames(x)[[2]],
desc = paste(dimnames(x)[[2]]))
# Create FCS file metadata - ranges, min, and max settings
metadata$minRange <- apply(x, 2, min)
metadata$maxRange <- apply(x, 2, max)
# create flowframe
x.ff <- new(
"flowFrame",
exprs = as.matrix(x),
description = metadata
)
# save FCS
write.FCS(x.ff, "im.fcs", what = "double")
Then we create a GatingSet
with a biexponential transformation:
library(flowWorkspace)
cs <- load_cytoset_from_fcs("im.fcs", alter.names = TRUE)
# transform cytoframe
transFun <- biexponentialTransform()
transList <- transformList(colnames(adata$X), transFun)
csTrans <- transform(cs, transList)
# create gating set
gs <- GatingSet(csTrans)
Then we utilise plotly
to draw shapes. The coordinates get passed on to the GatingSet
. The following is a simplified example of a rectangle. The coordinates would actually come from plotly
:
# simple gate
gateCoords <- list(
c(4, 5.5), c(2.5, 4)
)
# create matrix
mat <- matrix(
unlist(gateCoords), ncol = 2,
dimnames = list(
c("min", "max"),
c("mean_intensity_0", "mean_intensity_1")))
# make new rectangle gate
rg1 <- rectangleGate(.gate = mat)
flist <- list(rg1)
names(flist) <- sampleNames(gs)
# add pop to set
gs_pop_add(gs, flist,
name = "Blue+",
parent = "root")
recompute(gs)
Then we plot them back onto plotly
and show these populations in the image viewer napari
.
library(ggcyto)
autoplot(gs, "Blue+", "mean_intensity_0", "mean_intensity_1")
My question is whether we could eliminate the FCS file and directly work on the Anndata
file with the GatingSet
. We have a framework to add further measurements to this file such as cell interactions and spatial properties. Ideally we would like to also make these available for gating. At the moment we would have to generate a new FCS file and GatingSet
?
Thanks for your help!
No need to convert FCS, just wrap the matrix data into flowFrame/flowSet, e.g.
> adata <- read_h5ad("../../gating_example/im.h5ad")
> mat = adata$X
> fr = flowFrame(mat)#use flowFrame constructor instead of "new" method
> fr
flowFrame object 'file14fc4a9793fe'
with 184 cells and 26 observables:
name desc range minRange maxRange
$P1 mean_intensity_0 mean_intensity_0 111 0 110
$P2 mean_intensity_1 mean_intensity_1 172 0 171
$P3 mean_intensity_2 mean_intensity_2 131 0 130
$P4 mean_intensity_3 mean_intensity_3 37 0 36
$P5 mean_intensity_4 mean_intensity_4 216 0 215
... ... ... ... ... ...
$P22 oblate oblate 2 0 1
$P23 prolate prolate 4 0 3
$P24 perimeter_to_area perimeter_to_area 20 0 19
$P25 aspect_ratio aspect_ratio 3 0 2
$P26 fill fill 2 0 1
198 keywords are stored in the 'description' slot
> # No need to convert FCS
> fr_list = list(fr)
> sample_names = "im"
> names(fr_list) <- sample_names
> fs = as(fr_list, "flowSet")# convert to flowset
> # Construct gating set directly from flowset
> gs = GatingSet(fs)
> # Data is automatically converted to cytoframe (h5 backend) under the hood
> cs = gs_cyto_data(gs)
> cs
A cytoset with 1 samples.
column names:
mean_intensity_0, mean_intensity_1, mean_intensity_2, mean_intensity_3, mean_intensity_4, bbox_min_y, bbox_min_x, bbox_max_y, bbox_max_x, bbox_area, eccentricity, orientation, perimeter, area, convex_area, equivalent_diameter, extent, feret_diameter_max, major_axis_length, minor_axis_length, solidity, oblate, prolate, perimeter_to_area, aspect_ratio, fill
> list.files(cs_get_uri(cs))
[1] "im.h5"
Ok makes sense - thanks!
Thanks for putting together this framework. We are currently trying to adopt flowWorkspace for image based gating.
Let me know if the following is technically not possible or a stupid idea.
At the moment we are converting an Anndata H5 file to a CytoFrame and then use a GatingSet to create populations.
In theory, would it be possible to use gating directly on the H5 backed Anndata file? I was thinking to adopt the CytoFrame class to make something compatible to working with Anndata.
If that was in theory possible, would you have a suggestion where to start?