Euphoria is an open-source Java API for creating unified big-data processing flows. It provides an engine-independent programming model that can express both batch and stream transformations.
Sometimes, specifically on inputs, it makes a lot of sense to enable stateful processing without an additional shuffle. This is most valuable on inputs because, after the input, partitioning is no longer defined or guaranteed. We will enhance the FlatMap operator so that it accepts a StatefulUnaryFunctor, a kind of rich function with additional life-cycle methods (setup(Context), cleanup(Context)) whose apply method takes a StatefulContext with access to a StorageProvider. We will also consider allowing this special version of the FlatMap operator only when it is attached directly to inputs. This is optional, though.
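To make the proposal more concrete, here is a minimal sketch of what such a functor could look like. The interface names StatefulUnaryFunctor, StatefulContext, Context and StorageProvider come from the description above, but every method signature below (collect, getStorageProvider, getValueStorage) is an assumption made for illustration, not Euphoria's actual API. ValueStorage is reduced to a hypothetical get/set pair, and InMemoryStorageProvider and runPartition are hypothetical stand-ins for whatever an executor would do for a single input partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StatefulFlatMapSketch {

  /** Hypothetical life-cycle context passed to setup/cleanup. */
  public interface Context {}

  /** Hypothetical, simplified single-value state cell. */
  public interface ValueStorage<T> {
    T get();
    void set(T value);
  }

  /** Hypothetical, simplified storage provider (signature is an assumption). */
  public interface StorageProvider {
    <T> ValueStorage<T> getValueStorage(String name, T defaultValue);
  }

  /** Context handed to apply(): collects output and exposes state storage. */
  public interface StatefulContext<OUT> extends Context {
    void collect(OUT element);
    StorageProvider getStorageProvider();
  }

  /** Rich function with life-cycle hooks, as outlined in the proposal. */
  public interface StatefulUnaryFunctor<IN, OUT> {
    default void setup(Context context) {}
    void apply(IN element, StatefulContext<OUT> context);
    default void cleanup(Context context) {}
  }

  /** Example functor: drops elements equal to the previous one in the same partition. */
  public static class DropConsecutiveDuplicates
      implements StatefulUnaryFunctor<String, String> {
    @Override
    public void apply(String element, StatefulContext<String> context) {
      ValueStorage<String> last =
          context.getStorageProvider().getValueStorage("last-seen", null);
      if (!element.equals(last.get())) {
        context.collect(element);
        last.set(element);
      }
    }
  }

  /** In-memory state cell used only by this sketch. */
  static class InMemoryValue<T> implements ValueStorage<T> {
    private T value;
    InMemoryValue(T initial) { this.value = initial; }
    @Override public T get() { return value; }
    @Override public void set(T value) { this.value = value; }
  }

  /** In-memory provider: one state cell per name, created on first access. */
  static class InMemoryStorageProvider implements StorageProvider {
    private final Map<String, ValueStorage<?>> cells = new HashMap<>();

    @Override
    @SuppressWarnings("unchecked")
    public <T> ValueStorage<T> getValueStorage(String name, T defaultValue) {
      return (ValueStorage<T>)
          cells.computeIfAbsent(name, k -> new InMemoryValue<T>(defaultValue));
    }
  }

  /** Simulates processing one input partition, in order, with no shuffle. */
  public static <IN, OUT> List<OUT> runPartition(
      List<IN> partition, StatefulUnaryFunctor<IN, OUT> functor) {
    List<OUT> output = new ArrayList<>();
    StorageProvider provider = new InMemoryStorageProvider();
    StatefulContext<OUT> context = new StatefulContext<OUT>() {
      @Override public void collect(OUT element) { output.add(element); }
      @Override public StorageProvider getStorageProvider() { return provider; }
    };
    functor.setup(context);
    for (IN element : partition) {
      functor.apply(element, context);
    }
    functor.cleanup(context);
    return output;
  }

  public static void main(String[] args) {
    List<String> partition = Arrays.asList("a", "a", "b", "b", "a");
    // prints [a, b, a]
    System.out.println(runPartition(partition, new DropConsecutiveDuplicates()));
  }
}
```

The state in this sketch is scoped to one partition, which is why it only makes sense where partitioning is known, that is, directly on an input. After any later operator the same functor would see an arbitrary subset of elements and the "last seen" value would lose its meaning.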