rheem-ecosystem / rheem

Rheem - a cross-platform data processing system
https://rheem-ecosystem.github.io

LoopOperator is missing an API implementation #1

Open jonasrk opened 7 years ago

jonasrk commented 7 years ago

An API for the LoopOperator would make it much easier to work with.

JorgeQuiane commented 7 years ago

Hi Jonas, first of all welcome to the dRHEEMing community!

Now, could you please give a few more details about this issue? What do you mean by an API for the Loop operator? In other words, what would you expect Rheem to provide to make your life easier?

jonasrk commented 7 years ago

Hi Jorge, nice meeting you!

What I mean is this: the repeat operator, for example, has its Java and Scala APIs defined in the package rheem-api, in the file DataQuanta.scala, as the functions repeatJava and repeat.

The LoopOperator does not have APIs like this defined (yet) and therefore can only be used in a more low-level fashion.

sekruse commented 7 years ago

The API should look something like this:

val myLoopResult = planBuilder.load(...)
  .loop { (convergence, data) => (convergence.map(...), data.filter(...).map(...)) }
  .collect()

JorgeQuiane commented 7 years ago

OK, it is clear now. I think this issue is a bug, as each operator should be properly exposed to users. What do you think?

zkaoudi commented 7 years ago

How is this different from doWhile(...)?

sekruse commented 7 years ago

The DoWhileOperator allows only a single dataset to change from iteration to iteration. The LoopOperator, in contrast, permits two such mutable datasets.
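
To illustrate the distinction, here is a minimal sketch in plain Scala collections. It does not use any Rheem classes: the names LoopSemanticsSketch, doWhileSketch and loopSketch are hypothetical, and the way the stopping condition is checked is a simplifying assumption rather than the actual operator semantics. It only models the idea that one style carries a single dataset across iterations while the other carries two (convergence and data).

```scala
// Plain-Scala model of the two iteration styles; no Rheem classes are used.
// All names here are hypothetical and exist only for illustration.
object LoopSemanticsSketch {

  // doWhile-style: a single dataset is carried from one iteration to the next.
  // The body runs at least once; the loop stops once `converged` holds.
  def doWhileSketch[A](initial: Seq[A])(step: Seq[A] => Seq[A])(
      converged: Seq[A] => Boolean): Seq[A] = {
    var data = step(initial)
    while (!converged(data)) data = step(data)
    data
  }

  // loop-style: two datasets (convergence and data) both evolve per iteration.
  // The stopping condition is evaluated on the convergence dataset only.
  def loopSketch[C, A](initConv: Seq[C], initData: Seq[A])(
      body: (Seq[C], Seq[A]) => (Seq[C], Seq[A]))(
      done: Seq[C] => Boolean): (Seq[C], Seq[A]) = {
    var state = (initConv, initData)
    while (!done(state._1)) state = body(state._1, state._2)
    state
  }

  def main(args: Array[String]): Unit = {
    // Single mutable dataset: double every element until the sum exceeds 20.
    val single = doWhileSketch(Seq(1, 2, 3))(_.map(_ * 2))(_.sum > 20)
    println(single) // List(4, 8, 12)

    // Two mutable datasets: a counter (convergence) and the data itself.
    val (conv, data) = loopSketch(Seq(0), Seq(1, 2, 3)) { (c, d) =>
      (c.map(_ + 1), d.filter(_ % 2 == 1).map(_ * 3))
    }(_.head >= 3)
    println((conv, data)) // (List(3),List(27, 81))
  }
}
```

The body passed to loopSketch has the same shape as the closure in the proposed API above: it receives both datasets and returns the next version of each.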