In general, platforms may incur initialization overhead (Spark, for example, does). This overhead cannot be attributed to individual `Operator`s and therefore cannot be expressed in our current cost model. However, it is an important criterion for deciding whether to use only Java (no overhead) or some "heavy-weight" framework. Thus, we should model this overhead explicitly:
- [ ] model the overhead as part of `PlanImplementation`s' `TimeEstimate`s
- [ ] in `LatentOperatorPruningStrategy`, treat the usage of a `Platform` as an interesting property (the initial overhead incurred for single `Operator`s might be amortized over the complete `PlanImplementation`)
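
To make the two items concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `OverheadSketch`, `OperatorChoice`, `platformsUsed`, `estimatePlanMillis`, and all the numbers are made up for illustration and are not the existing classes or measured values. The idea is only that a plan's estimate is the sum of its operator estimates plus a one-time start-up cost per distinct platform, and that the set of platforms used is the property pruning would have to keep apart.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these types or numbers come from the actual code base.
final class OverheadSketch {

    /** One operator bound to a platform, with its own runtime estimate in milliseconds. */
    static final class OperatorChoice {
        final String operator;
        final String platform;
        final long estimatedMillis;

        OperatorChoice(String operator, String platform, long estimatedMillis) {
            this.operator = operator;
            this.platform = platform;
            this.estimatedMillis = estimatedMillis;
        }
    }

    /** The "interesting property": the set of platforms a (partial) plan uses. */
    static Set<String> platformsUsed(List<OperatorChoice> plan) {
        Set<String> platforms = new TreeSet<>();
        for (OperatorChoice choice : plan) {
            platforms.add(choice.platform);
        }
        return platforms;
    }

    /** Sums the operator estimates and adds each used platform's start-up cost exactly once. */
    static long estimatePlanMillis(List<OperatorChoice> plan, Map<String, Long> initOverheadMillis) {
        long total = 0L;
        for (OperatorChoice choice : plan) {
            total += choice.estimatedMillis;
        }
        for (String platform : platformsUsed(plan)) {
            // Platforms without an entry (e.g. plain Java) are assumed to start for free.
            total += initOverheadMillis.getOrDefault(platform, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> overhead = new HashMap<>();
        overhead.put("spark", 5000L); // made-up start-up cost

        List<OperatorChoice> javaPlan = Arrays.asList(
                new OperatorChoice("map", "java", 800L),
                new OperatorChoice("reduce", "java", 700L));
        List<OperatorChoice> sparkPlan = Arrays.asList(
                new OperatorChoice("map", "spark", 100L),
                new OperatorChoice("reduce", "spark", 100L));

        System.out.println("java only:  " + estimatePlanMillis(javaPlan, overhead) + " ms");
        System.out.println("spark only: " + estimatePlanMillis(sparkPlan, overhead) + " ms");
    }
}
```

With these made-up numbers, every single Spark operator is cheaper than its Java counterpart, yet the Java-only plan wins once the start-up cost is charged per plan. That is also why pruning should not compare partial plans with different `platformsUsed` sets on operator cost alone: the overhead paid for the first operator on a platform may pay off as more operators are added.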
From @sekruse on July 10, 2016 11:20
Copied from original issue: daqcri/rheem#4