rheem-ecosystem / rheem

Rheem - a cross-platform data processing system
https://rheem-ecosystem.github.io
5 stars 0 forks source link

Remove Tuple class and replace its usage with Tuple2 #38

Open luckyasser opened 7 years ago

sekruse commented 7 years ago

These two classes are slightly different. The Tuple class is used within rheem-core, where it mostly employed to return more than one result value from a function. Tuple2 comes from the rheem-basic module and is thought to provide a standard data type for Rheem. So, the two classes are intentionally different, but also is Tuple2 not visible in rheem-core.

Another consideration could be to use Scala's Tuple2 rather than Rheem's Tuple2. This comes at the expense of bringing the Scala library into rheem-java and also scala.Tuple2 is immutable. On the other hand, it is nicer to work with in the Scala API.

luckyasser commented 7 years ago

-1 for brining the Scala library into rheem-java. I still don't understand why couldn't we keep Tuple2 (or Tuple) in rheem-core, and remove the other from rheem-basic. They're used interchangeably in several packages: for example according to the distinction you made above Tuple should've been used in ZipWithIdMapping class, but it's not the case. The code is the same in both classes (99%), and it is confusing to have the 2 IMO (specially if we have them in the same class). We can have a "data" package in rheem-core and move it there (instead of keeping it in util). Also, semantically, the intended usage for Tuple can still be correctly seen as a very standard data type, so standard(almost primitive) that we put it in rheem-core :)

sekruse commented 7 years ago

I see your point that both classes are virtually identical. Still, let me list some arguments for having this distinction:

My proposal would be to keep this issue open and revisit it, once we have to rewrite the datatypes anyway (potentially in a new major release); e.g., due to mappings to platform-native datatypes or due to consolidations with other datatypes, such as Records.