Open utdemir opened 5 years ago
I spent a bit of time experimenting today. My first idea was to use inline-java
to interface directly with the JVM. However, it turns out that it adds considerable complexity to the build process.
I've decided on a simpler approach: a wrapper Java application responsible for interfacing with YARN and communicating with the Haskell executable. Since we only need one type of message (spawn an executor and return the result), I believe the interface between Java and Haskell will be quite small. Initially, I will probably create a simple protocol using UNIX pipes.
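To make the pipe idea a bit more concrete, here is a minimal sketch of one possible framing on the Java side: every message is a 4-byte big-endian length followed by that many payload bytes. The class name `PipeProtocol`, the methods `writeFrame` and `readFrame`, and the framing itself are assumptions for illustration only; nothing here is decided yet, and the payload is treated as opaque bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: length-prefixed frames over a pair of byte streams.
final class PipeProtocol {

    // Write one frame: a 4-byte big-endian length, then the payload bytes.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length); // DataOutputStream writes ints big-endian
        data.write(payload);
        data.flush();
    }

    // Read one frame written by writeFrame; blocks until it is complete.
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0) {
            throw new IOException("negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload); // throws EOFException if the pipe closes early
        return payload;
    }

    // Round trip through an in-memory buffer standing in for the pipe.
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream pipe = new ByteArrayOutputStream();
        writeFrame(pipe, "serialized request".getBytes(StandardCharsets.UTF_8));
        byte[] received = readFrame(new ByteArrayInputStream(pipe.toByteArray()));
        System.out.println(new String(received, StandardCharsets.UTF_8));
    }
}
```

In the actual wrapper, the streams would presumably come from the child process started with `ProcessBuilder` (`Process.getOutputStream()` for the request, `Process.getInputStream()` for the result), and the Haskell side would decode the same 4-byte big-endian length before reading the payload.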
YARN is the most common way to schedule Spark & Hadoop on a cluster.
Supporting it as an executor will enable us to run side-by-side with existing data processing pipelines.