Closed jacobpwagner closed 4 years ago
Just quickly summarizing the approximate speed gains (old mean / new mean) for this single-cytoframe example:
- `cf_keyword_insert`: mem: 9.35, h5: 5.99, tile: 6.25
- `cf_keyword_delete`: mem: 9.28, h5: 6.23, tile: 6.13
- `cf_keyword_rename`: mem: 8.49, h5: 6.7, tile: 6.16
- `cf_keyword_set`: mem: 61.4, h5: 51.1, tile: 53.44
So, clearly the biggest gains are in `cf_keyword_set`, measured against the best approach that was available before it was added in https://github.com/RGLab/flowWorkspace/commit/79b4bf0f057c35611b21e1a56f8613af80c69e71. But a roughly 6-9x speedup isn't too bad for the other methods either.
If this all looks good, I'll go ahead and merge it in, and then I can incorporate it into the analogous methods for `GatingHierarchy`, `cytoset`, and `GatingSet`.
The `cf_keyword_` methods currently rely on altering individual keywords at the R level in `cytoframe` objects before re-assigning the full set of keywords. For example, `cf_keyword_delete`:
1) Pulls the keywords to an R-level list
2) Removes the appropriate entry and reassigns the full list
The replacement is done by constructing an entirely new `cytolib::KW_PAIR` object and then replacing the full set of keywords using `cytolib::CytoFrame::set_keywords`.
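To make the cost difference concrete, here is a minimal C++ sketch of the two patterns: a full copy/modify/replace round trip versus erasing entries in place. `ToyFrame`, `KeywordList`, `remove_keywords`, and `delete_by_round_trip` are hypothetical stand-ins made up for illustration; they are not cytolib's actual `KW_PAIR` or `CytoFrame` API, and the real methods may well look different.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a keyword store; NOT cytolib's actual KW_PAIR.
using KeywordList = std::vector<std::pair<std::string, std::string>>;

// Hypothetical stand-in for a frame holding keywords; NOT cytolib::CytoFrame.
struct ToyFrame {
    KeywordList keywords;

    // Round-trip pattern, step 1: hand the caller a full copy of the keywords.
    KeywordList get_keywords() const { return keywords; }

    // Round-trip pattern, step 2: overwrite the whole keyword set.
    void set_keywords(const KeywordList& kw) { keywords = kw; }

    // In-place pattern: erase every requested key directly in the store.
    // Taking a vector of keys keeps the loop on the C++ side.
    void remove_keywords(const std::vector<std::string>& keys) {
        for (const std::string& key : keys) {
            keywords.erase(
                std::remove_if(keywords.begin(), keywords.end(),
                               [&key](const KeywordList::value_type& kv) {
                                   return kv.first == key;
                               }),
                keywords.end());
        }
    }
};

// Round-trip deletion: copy everything out, drop one entry, copy everything back.
void delete_by_round_trip(ToyFrame& frame, const std::string& key) {
    KeywordList copy = frame.get_keywords();
    copy.erase(std::remove_if(copy.begin(), copy.end(),
                              [&key](const KeywordList::value_type& kv) {
                                  return kv.first == key;
                              }),
               copy.end());
    frame.set_keywords(copy);
}
```

Both paths leave the frame with the same keywords. The round trip copies the full set twice per call, while the in-place version only touches the stored entries, and in the R-to-C++ setting the round trip would additionally pay for converting the whole list across the language boundary.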
These changes rely on additional direct keyword manipulation methods added by https://github.com/RGLab/cytolib/pull/43 and aim to:
1) Improve efficiency by manipulating individual key-value pairs at the `cytolib` level to avoid a full construction and replacement via `cytolib::CytoFrame::set_keywords`.
2) Allow the `cf_keyword_` methods to take vectors of keys and values, both to make the functions more convenient/flexible and to push the loop iterations down to the `cytolib` level.

A simple example testing/benchmarking script and its output follow:
The benchmarking output: