seznam / euphoria

Euphoria is an open-source Java API for creating unified big-data processing flows. It provides an engine-independent programming model that can express both batch and stream transformations.
Apache License 2.0

Add parallelism control #228

Open je-ik opened 6 years ago

je-ik commented 6 years ago

After removing explicit partitioning, we currently have no explicit control over the parallelism of executing operators. This affects both batch and stream processing. There must be a way to give the translator a hint that a certain operation should be parallelized more or less than its input. Options are:

dmvk commented 6 years ago

I think we should never set explicit parallelism. Instead, we should give the operator a hint with a percentage estimate of the increase or decrease in data size, so that parallelism can be decided based on the input data.
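
To make the size-hint idea concrete, here is a minimal sketch of how a translator might turn such a hint into a parallelism value. It is not part of the Euphoria API: the class `ParallelismHint`, the method `deriveParallelism`, and the `maxParallelism` cap are all hypothetical names introduced for illustration. It assumes the hint is a percentage change in data size relative to the input (for example `-75` for a filter that keeps a quarter of the records).

```java
// Hypothetical helper, not an existing Euphoria class.
final class ParallelismHint {

    private ParallelismHint() {}

    /**
     * Derives an operator's parallelism from the parallelism of its input
     * and a hinted percentage change in data size.
     *
     * @param inputParallelism parallelism of the upstream operator (>= 1)
     * @param percentChange    estimated change in data size, e.g. -75 or +50;
     *                         must be >= -100
     * @param maxParallelism   assumed upper bound imposed by the executor (>= 1)
     */
    static int deriveParallelism(int inputParallelism, double percentChange, int maxParallelism) {
        if (inputParallelism < 1 || maxParallelism < 1) {
            throw new IllegalArgumentException("parallelism values must be >= 1");
        }
        if (Double.isNaN(percentChange) || percentChange < -100.0) {
            throw new IllegalArgumentException("percentChange must be >= -100");
        }
        // Scale the input parallelism by the estimated output/input size ratio.
        double ratio = 1.0 + percentChange / 100.0;
        long scaled = (long) Math.ceil(inputParallelism * ratio);
        // Never go below one task and never exceed the executor's cap.
        return (int) Math.max(1L, Math.min((long) maxParallelism, scaled));
    }
}
```

With an input parallelism of 8 and a cap of 64, a hint of `-75` yields 2 and a hint of `+50` yields 12. Rounding up and clamping to at least 1 are design choices of this sketch, not something the discussion above prescribes.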