seznam / euphoria

Euphoria is an open-source Java API for creating unified big-data processing flows. It provides an engine-independent programming model that can express both batch and stream transformations.
Apache License 2.0

Add stateful mapping #192

je-ik closed this 6 years ago

je-ik commented 7 years ago

Sometimes, specifically on inputs, it makes a lot of sense to enable stateful processing without an additional shuffle. This is most valuable on inputs because, after the input, partitioning is no longer defined or guaranteed. We will enhance the FlatMap operator to accept a StatefulUnaryFunctor, a kind of rich function with additional life-cycle methods (setup(Context), cleanup(Context)) whose apply method takes a StatefulContext with access to a StorageProvider. We will consider offering this special version of the FlatMap operator only when it is attached directly to inputs. This is optional, though.
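To make the proposal more concrete, here is a rough sketch of what such a functor could look like. Only the names StatefulUnaryFunctor, StatefulContext, StorageProvider, Context, setup, cleanup and apply come from the description above; every signature, the `getMap` and `collect` methods, the `RunningCount` example and the `runOnPartition` driver are assumptions made for illustration and are not part of the Euphoria API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in Euphoria in this form.
class StatefulMappingSketch {

  /** Assumed life-cycle context passed to setup() and cleanup(). */
  interface Context {}

  /** Assumed storage access; here it simply hands out named in-memory maps. */
  interface StorageProvider {
    Map<String, Long> getMap(String name);
  }

  /** Assumed per-element context: state access plus an output collector. */
  interface StatefulContext<OUT> extends Context {
    StorageProvider storage();
    void collect(OUT element);
  }

  /** Assumed shape of the functor named in the proposal. */
  interface StatefulUnaryFunctor<IN, OUT> {
    void setup(Context ctx);
    void apply(IN element, StatefulContext<OUT> ctx);
    void cleanup(Context ctx);
  }

  /** Example functor: emits each word with a running count kept in local state. */
  static class RunningCount implements StatefulUnaryFunctor<String, String> {
    @Override public void setup(Context ctx) {}

    @Override public void apply(String word, StatefulContext<String> ctx) {
      Map<String, Long> counts = ctx.storage().getMap("counts");
      long n = counts.merge(word, 1L, Long::sum);
      ctx.collect(word + "=" + n);
    }

    @Override public void cleanup(Context ctx) {}
  }

  /** Toy driver: runs the functor over a single input partition, with no shuffle. */
  static <IN, OUT> List<OUT> runOnPartition(
      List<IN> partition, StatefulUnaryFunctor<IN, OUT> functor) {
    Map<String, Map<String, Long>> stores = new HashMap<>();
    List<OUT> out = new ArrayList<>();
    StatefulContext<OUT> ctx = new StatefulContext<OUT>() {
      @Override public StorageProvider storage() {
        return name -> stores.computeIfAbsent(name, k -> new HashMap<>());
      }
      @Override public void collect(OUT element) { out.add(element); }
    };
    functor.setup(ctx);
    for (IN element : partition) {
      functor.apply(element, ctx);
    }
    functor.cleanup(ctx);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(runOnPartition(Arrays.asList("a", "b", "a"), new RunningCount()));
  }
}
```

The state here lives only as long as one call to `runOnPartition`, which mirrors the idea that the functor sees a single input partition and keeps its state locally instead of relying on a shuffle to group elements by key.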

je-ik commented 6 years ago

Inputs are handled by Beam IOs, and we have found no useful application of stateful mapping.