As of now, Rheem supports one version of a specific platform per Rheem distribution. Although supporting multiple running versions of a platform in a single JVM would probably be of little value and high complexity (due to runtime conflicts, etc.), it is still useful to support (and maintain) different versions of execution platforms. For example, not all users can immediately migrate their clusters to Spark 2.0.
This issue is to discuss how we should approach platform versioning in general. One approach is to make the Platform (or the Plugin?) class aware of the version of its underlying backend, and to allow the user to specify the version of the platform at the application level.
This approach has the following advantages:
- It allows multiple configurations for different versions of the platform within one deployment of Rheem.
- Users can easily specify the version of the platform when creating a RheemContext.
- It avoids having one distribution per platform version (see the second approach).
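To make the first approach more concrete, here is a rough sketch of what a version-aware platform handle and a per-application selection could look like. All names here (`VersionedPlatform`, `PlatformRegistry`, `use`, `lookup`, and the `rheem.<name>.<version>` configuration prefix) are hypothetical and are not part of Rheem's current API; the sketch only assumes that a single JVM runs at most one version of each platform.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not Rheem's actual API: a platform handle that
// knows which version of its backend it targets.
final class VersionedPlatform {
    private final String name;     // e.g. "spark"
    private final String version;  // e.g. "2.1.0"

    VersionedPlatform(String name, String version) {
        this.name = name;
        this.version = version;
    }

    String getName() { return name; }

    String getVersion() { return version; }

    // Assumed convention: each platform version gets its own configuration
    // namespace, so one deployment can hold settings for several versions.
    String configPrefix() { return "rheem." + name + "." + version; }
}

// Hypothetical per-application registry: the user picks one version per
// platform, and conflicting picks are rejected.
final class PlatformRegistry {
    private final Map<String, VersionedPlatform> selected = new HashMap<>();

    void use(String name, String version) {
        VersionedPlatform previous =
                selected.putIfAbsent(name, new VersionedPlatform(name, version));
        if (previous != null && !previous.getVersion().equals(version)) {
            throw new IllegalStateException(
                    name + " " + previous.getVersion() + " is already selected");
        }
    }

    Optional<VersionedPlatform> lookup(String name) {
        return Optional.ofNullable(selected.get(name));
    }
}
```

With something like this, an application would call `registry.use("spark", "2.1.0")` once at startup, and the rest of the stack could resolve version-specific configuration through `configPrefix()`. Whether this state belongs on the Platform or on the Plugin class is exactly the open question of this issue.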
The second approach is to use some Maven tricks: specifying the version of the platform in the pom file would include only the modules that relate to that platform version in the build. This is probably a lot faster to implement.
There's currently a branch for Rheem compiled for Spark 2.1.0:
https://github.com/rheem-ecosystem/rheem/tree/Rheem-Spark2.0