Closed mikejiang closed 11 years ago
Works for me.
Greg Finak greg.finak@gmail.com
On Mar 26, 2013, at 5:29 PM, Mike Jiang notifications@github.com wrote:
In order to merge multiple GatingSets that result from parsing different XML files, the cdf files need to be merged, which can take a long time (e.g. 1500 samples involve 60G of disk IO; it took almost an entire day to merge the different GatingSets into groups, then drop empty channels for each group, and finally merge everything into one).
Since we are chunking the data at the sample level in ncdfFlow, maybe we should consider heterogeneous storage for GatingSet, which is currently available through the ncdfFlowSetList object.
We just need to implement ncdfFlowSetList versions of:
getData
xyplot
This requires some modification of the current methods (mainly looking up the sample and grabbing the flowFrame from the ncdfFlowSetList instead of the ncdfFlowSet).
Therefore, the new rbind2 method for GatingSetList will only involve manipulating the in-memory C structure. The cdf files can be used as they are (there is no need even to drop non-universal channels).
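To make the idea concrete, here is a minimal sketch in base R of the lookup-instead-of-merge approach. It does not use the ncdfFlow API: the stores are plain named lists of matrices standing in for ncdfFlowSet objects, and `make_store_list`, `get_sample` and `combine_store_lists` are hypothetical names invented for this illustration.

```r
# Toy stand-ins: each "store" is a named list of matrices (one per sample),
# playing the role of one ncdfFlowSet backed by its own cdf file.
store_a <- list(
  s1 = matrix(1:6,  ncol = 3, dimnames = list(NULL, c("FSC", "SSC", "CD4"))),
  s2 = matrix(7:12, ncol = 3, dimnames = list(NULL, c("FSC", "SSC", "CD4")))
)
store_b <- list(
  s3 = matrix(1:4, ncol = 2, dimnames = list(NULL, c("FSC", "CD8")))
)

# Hypothetical container: keeps the stores as they are; channels may differ.
make_store_list <- function(...) {
  stores <- list(...)
  samples <- unlist(lapply(stores, names))
  # Sample names must be unique across stores for the lookup to be unambiguous.
  stopifnot(!anyDuplicated(samples))
  stores
}

# Find the store that owns a sample, then fetch the data from that store.
get_sample <- function(store_list, sample) {
  for (store in store_list) {
    if (sample %in% names(store)) return(store[[sample]])
  }
  stop("sample not found: ", sample)
}

# Combining two containers only concatenates the lists; no data is rewritten.
combine_store_lists <- function(x, y) do.call(make_store_list, c(x, y))

sl <- combine_store_lists(make_store_list(store_a), make_store_list(store_b))
colnames(get_sample(sl, "s3"))
```

In this toy version, `s3` keeps its own channel set because nothing is ever copied into a common layout, which is the property the proposal relies on.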
Changing the underlying data storage from 'ncdfFlowSet' to 'ncdfFlowList' affects the 'GatingSet' more than we originally thought. Besides 'getData', we need to make changes directly to
[
archive/unarchive
getGate
pData
clone
getSamples
in order to get the new 'GatingSet' object to work properly. Extending the class in this intrusive way is risky and may easily break these APIs, which are otherwise already 'mature'.
So we may be better off overloading these existing methods for 'GatingSetList', even though it means we need to write some other methods, like
[[
plotGate
getQAStats
but at least we have things under control and it will keep the original API intact. Also, most of the methods operate at the sample level, like 'getQAStats', so a simple wrapper around the existing GatingSet method should do the job.
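A rough sketch of the wrapper approach, using toy S4 classes rather than the real flowWorkspace ones: `ToySet` stands in for 'GatingSet' and `ToySetList` for 'GatingSetList', and the generics `toySampleNames` and `getCounts` are hypothetical names made up for this example. The single-set methods are left untouched and the list methods only delegate to them.

```r
library(methods)

# Toy stand-in for GatingSet: per-sample event counts, named by sample.
setClass("ToySet", representation(counts = "numeric"))
# Toy stand-in for GatingSetList: a plain list of ToySet objects.
setClass("ToySetList", representation(sets = "list"))

setGeneric("toySampleNames", function(x) standardGeneric("toySampleNames"))
setGeneric("getCounts", function(x, ...) standardGeneric("getCounts"))

# Existing methods on the single-set class are not modified.
setMethod("toySampleNames", "ToySet", function(x) names(x@counts))
setMethod("getCounts", "ToySet", function(x, ...) x@counts)

# List versions are thin wrappers: apply the existing method to each
# member and combine the per-set results.
setMethod("toySampleNames", "ToySetList", function(x) {
  unlist(lapply(x@sets, toySampleNames))
})
setMethod("getCounts", "ToySetList", function(x, ...) {
  do.call(c, lapply(x@sets, getCounts, ...))
})

# "[[" looks up the member that owns the sample and delegates to it.
setMethod("[[", "ToySetList", function(x, i, j, ...) {
  for (s in x@sets) {
    if (i %in% toySampleNames(s)) return(getCounts(s)[[i]])
  }
  stop("sample not found: ", i)
})

gsl <- new("ToySetList", sets = list(
  new("ToySet", counts = c(s1 = 100, s2 = 250)),
  new("ToySet", counts = c(s3 = 80))
))
getCounts(gsl)
gsl[["s3"]]
```

The trade-off is a second set of methods to maintain, but a bug in a wrapper cannot change the behaviour of the single-set methods.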
closed by #6