save_gs and load_gs do not give exactly the same numerical results #258

Closed mlamarin closed 6 years ago

mlamarin commented 6 years ago

Dear RGLab team,

Thanks for your great work, it makes my daily work so much easier.
I recently came across something that is possibly minor and likely caused by the serialization routine.

I hope that the reproducible example will work the same for you. It shows that stored and restored objects are not numerically identical.

Thanks, Marc


data_dir <- system.file("extdata",package="flowWorkspaceData")

# locate the manual gating workspace shipped with flowWorkspaceData
wsfile <- list.files(data_dir, pattern = "manual.xml", full.names = TRUE)
ws <- flowWorkspace::openWorkspace(wsfile)
gs <- flowWorkspace::parseWorkspace(ws, path = data_dir, name = 4,
  subset = c("CytoTrol_CytoTrol_1.fcs", "CytoTrol_CytoTrol_2.fcs"))

# resave and reload the gated object.
flowWorkspace::save_gs(gs, 'tmp2')
gs_loaded <- flowWorkspace::load_gs('tmp2')

# original gates values
ga_1_cd38mDRm <- flowWorkspace::getGate(gs[[1]], "/not debris/singlets/CD3+/CD8/38- DR-" )

# reloaded gates values
ga_loaded_1_cd38mDRm <- flowWorkspace::getGate(gs_loaded[[1]], "/not debris/singlets/CD3+/CD8/38- DR-" )

# not equal
ga_loaded_1_cd38mDRm@boundaries == ga_1_cd38mDRm@boundaries
#      <R660-A> <V545-A>
# [1,]    FALSE    FALSE
# [2,]    FALSE    FALSE
# [3,]    FALSE    FALSE
# [4,]    FALSE    FALSE

# small differences between the original object and the reloaded one
ga_loaded_1_cd38mDRm@boundaries[, 1] - ga_1_cd38mDRm@boundaries[, 1]
# [1] -4.835697e-03 -4.835697e-03 -4.388509e-06 -4.388509e-06

ga_loaded_1_cd38mDRm@boundaries[, 2] - ga_1_cd38mDRm@boundaries[, 2]
# [1] -8.111688e-06 -1.261400e-02 -1.261400e-02 -8.111688e-06

mikejiang commented 6 years ago

Thanks for the good question. The difference comes from a loss of numeric precision (from 64 to 32 bits) during serialization. We decided to store the values as 32-bit floats to save space, and I don't think this has any impact on gating given such a small numeric error.

> all.equal(ga_loaded_1_cd38mDRm@boundaries, ga_1_cd38mDRm@boundaries, tol = 3e-8)
[1] TRUE
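
For readers who want to see the effect in isolation, here is a minimal sketch that pushes doubles through a 4-byte float encoding using only base R. The helper `round_trip_float32` and the sample values are hypothetical, made up for illustration; this is not the actual serialization code used by `save_gs`, only an assumption-level imitation of a 64-bit to 32-bit round trip.

```r
# Hypothetical helper (not part of flowWorkspace): write doubles as
# 4-byte floats into a raw vector, then read them back as doubles.
round_trip_float32 <- function(x) {
  raw_bytes <- writeBin(as.double(x), raw(), size = 4)
  readBin(raw_bytes, what = "numeric", n = length(x), size = 4)
}

# Made-up coordinates, chosen so that none is exactly representable
# as a 32-bit float.
original <- c(81234.56789, 211000.123456, 1234.56789, 0.1)
restored <- round_trip_float32(original)

# Exact comparison: each element picks up a small rounding error.
exact_match <- restored == original

# Relative error per element; for 32-bit floats it stays within 2^-24.
relative_error <- abs(restored - original) / abs(original)

# Tolerance-based comparison treats the two vectors as equal.
close_enough <- isTRUE(all.equal(restored, original, tolerance = 1e-7))

print(exact_match)
print(relative_error)
print(close_enough)
```

The absolute error grows with the magnitude of the value, which would be consistent with the larger differences reported above for the larger boundary coordinates.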

mlamarin commented 6 years ago

Hi Mike, thanks for the answer. I agree, that's absolutely sufficient. I will add "tol = 3e-8" to my tests.

Best, Marc
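
A tolerance-based check of this kind could be wrapped as follows. This is only a sketch: `expect_boundaries_close` is a hypothetical helper name, and the matrices are made-up stand-ins for the `@boundaries` slots compared above. Note that `all.equal` normally compares the mean relative difference, and that 3e-8 is below the worst-case rounding bound of a 32-bit float (2^-24, about 6e-8), so other data may need a slightly larger tolerance.

```r
# Hypothetical test helper: stop with a message unless the two boundary
# matrices agree within the given tolerance.
expect_boundaries_close <- function(actual, expected, tolerance = 3e-8) {
  result <- all.equal(actual, expected, tolerance = tolerance)
  if (!isTRUE(result)) {
    stop("gate boundaries differ: ", paste(result, collapse = "; "))
  }
  invisible(TRUE)
}

# Made-up boundary matrix and two perturbed copies.
expected <- matrix(c(100000, 200000, 1000, 2000), nrow = 2)
slightly_off <- expected * (1 + 1e-8)  # relative difference of about 1e-8
clearly_off <- expected * (1 + 1e-6)   # relative difference of about 1e-6

# Passes: the difference is below the tolerance.
passed <- expect_boundaries_close(slightly_off, expected)

# Fails: try() captures the error so the script keeps running.
failed <- try(expect_boundaries_close(clearly_off, expected), silent = TRUE)

print(passed)
print(inherits(failed, "try-error"))
```

Unlike `==` or `identical()`, this check accepts differences below the tolerance while still flagging larger ones.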