Netflix / genie

Distributed Big Data Orchestration Service
https://netflix.github.io/genie
Apache License 2.0

Is it possible to integrate genie with another netflix project polynote? #927

Closed yuhuali1989 closed 4 years ago

yuhuali1989 commented 4 years ago

@tgianos Hi Tom! Genie is a great project! We have adopted Genie as our production job orchestration service at Shareit. Shareit has 1.8 billion users all over the world and a ten-petabyte data lake built on AWS S3.

I am wondering how spark-submit uses Genie to enable interactive analysis at Netflix. I suppose Netflix has its own Spark branch, so that Polynote/Jupyter users can run spark-submit/pyspark to submit Genie jobs without being aware of Genie.

tgianos commented 4 years ago

Hi @yuhuali1989. Are you asking if Genie is used to launch the spark kernel within polynote or if you can submit a job from polynote to the Genie API via some client?

For the former, I took a quick glance at the Polynote documentation, and it says Polynote will use the spark-submit command in order to start isolated kernels, so it needs the spark-submit command to be working properly and available on the PATH of the environment you used to launch the server.

In this case (though I'm not sure whether the Polynote team does this), the data platform organization ships a Python CLI tool that wraps instantiation of a Genie Agent. So when a user types spark-submit, what's really happening in the background is that the command is translated into the CLI arguments the Genie Agent accepts, and the Agent launches the job locally while communicating state back to the Genie server. This means the Agent takes care of downloading all the Spark dependencies, setting up the process working directory, and monitoring state changes of that Spark process.
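
To make that translation step concrete, here is a minimal sketch of what such a wrapper might look like. It is not the internal Netflix tool: the launcher name `my-agent-launcher` and every flag shown (`exec`, `--job-name`, `--command-tag`) are hypothetical placeholders, not the real Genie Agent CLI. The only idea taken from the description above is that the user's spark-submit arguments are passed through unchanged behind an agent invocation.

```python
import shlex
from typing import List, Sequence

# Hypothetical launcher name; NOT the real Genie Agent executable.
AGENT_LAUNCHER = "my-agent-launcher"


def build_agent_command(
    spark_submit_args: Sequence[str],
    job_name: str = "notebook-kernel",
) -> List[str]:
    """Wrap spark-submit arguments in an (assumed) agent launcher invocation.

    All flags below are illustrative placeholders. The user's original
    arguments are appended untouched after the "--" separator.
    """
    return [
        AGENT_LAUNCHER,
        "exec",                                  # hypothetical sub-command
        "--job-name", job_name,                  # hypothetical flag
        "--command-tag", "type:spark-submit",    # hypothetical flag
        "--",
        *spark_submit_args,
    ]


if __name__ == "__main__":
    # What a shim installed on PATH as "spark-submit" could build and then exec.
    cmd = build_agent_command(["--master", "yarn", "app.py"])
    print(shlex.join(cmd))
```

A shim like this would be installed on the PATH under the name spark-submit, so that tools such as Polynote call it without knowing an agent is involved.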

For the latter, since you're working in Scala or Python within Polynote, you would be able to use either the Genie Java client or pygenie to submit and monitor jobs from notebooks as you would with any other client library.

Let us know if that helps.
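
As a rough illustration of what a client library does under the hood, the sketch below builds a job request and posts it to a Genie server using only the Python standard library. It deliberately avoids pygenie so that nothing about that library's API is guessed. The endpoint path `/api/v3/jobs` and the field names (`clusterCriterias`, `commandCriteria`, `commandArgs`) are assumptions based on the Genie 3 REST API and should be checked against the documentation for your server version. The tag values are placeholders.

```python
import json
import urllib.request
from typing import Dict, List


def build_job_request(
    name: str,
    user: str,
    command_args: str,
    cluster_tags: List[str],
    command_tags: List[str],
    version: str = "1.0",
) -> Dict:
    """Build a job request body (field names assumed from the Genie 3 API)."""
    return {
        "name": name,
        "user": user,
        "version": version,
        "commandArgs": command_args,
        # One criterion entry; the server picks a cluster matching these tags.
        "clusterCriterias": [{"tags": cluster_tags}],
        "commandCriteria": command_tags,
    }


def submit_job(genie_url: str, payload: Dict) -> str:
    """POST the request and return the Location header, if the server sets one."""
    request = urllib.request.Request(
        genie_url.rstrip("/") + "/api/v3/jobs",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Location", "")
```

From a notebook cell you would call `build_job_request(...)` with tags that match your own cluster and command registrations, then pass the result to `submit_job` with your server's base URL. In practice the Genie Java client or pygenie is the better choice, since they also handle monitoring the job.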

Tom

yuhuali1989 commented 4 years ago

@tgianos Thank you very much! I got it~