acroz / pylivy

A Python client for Apache Livy, enabling use of remote Apache Spark clusters.
MIT License

How to run code and download result properly? #108

Open makrushin-evgenii opened 12 months ago

makrushin-evgenii commented 12 months ago

I create a session, run some code in it, and download the resulting dataframe. The code being run is not important:

with LivySession.create(self.LIVY_URL, kind=SessionKind.PYSPARK, requests_session=self.requests_session, spark_conf=conf) as session:
    session.run(code)
    return session.download(download_dataframe_name)

It works fine in the staging environment on small amounts of data, but fails with an error on large amounts of data in the production environment:

requests.exceptions.HTTPError: 500 Server Error: Server Error for url: https://***:443/gateway/production/livy/sessions/655/statements/1
{"msg":"Session '655' not found."}

Meanwhile, the YARN application finished with SUCCEEDED status.

The Livy logs look like this:

23/11/15 14:06:14 INFO InteractiveSession: Interactive session 656 created [appid: application_1698181251761_0043, owner: knox, proxyUser: Some(e.makrushin), state: idle, kind: pyspark, info: {driverLogUrl=http://***:8042/node/containerlogs/container_e58_1698181251761_0043_01_000001/e.makrushin, sparkUiUrl=http://***/proxy/application_1698181251761_0043/}]
23/11/15 14:09:54 INFO InteractiveSessionManager: Deleting session 656
23/11/15 14:09:54 INFO InteractiveSession: Stopping InteractiveSession 656...
23/11/15 14:09:54 WARN Rpc: [Rpc] Closing RPC channel with 2 outstanding RPCs.
23/11/15 14:09:54 ERROR SessionServlet$: internal error
java.util.concurrent.CancellationException
        at io.netty.util.concurrent.DefaultPromise.cancel(...)(Unknown Source)
23/11/15 14:09:54 INFO InteractiveSession: Stopped InteractiveSession 656.
23/11/15 14:09:54 INFO InteractiveSessionManager: Deleted session 656

It seems the session is deleted before I can download the result. Why might this happen, and how can I fix it?

I also tried processing the downloaded dataframe inside the with scope, and not using with at all. Neither changes anything: I get the same error at the moment of the download call.
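
For reference, the two variants mentioned above can be sketched roughly as follows. This is only an illustration, not the exact production code: the function names, the `create_session` factory, and the `handle` callback are hypothetical, and the sketch assumes pylivy's `LivySession` exposes `wait()`, `run()`, `download()` and `close()` and can be used as a context manager.

```python
from typing import Any, Callable


def make_pyspark_session(livy_url: str) -> Any:
    """Hypothetical factory: builds a PySpark session for the given Livy URL."""
    # Imported lazily so the helpers below can be exercised without pylivy.
    from livy import LivySession, SessionKind

    return LivySession.create(livy_url, kind=SessionKind.PYSPARK)


def download_inside_with(
    create_session: Callable[[], Any],
    code: str,
    dataframe_name: str,
    handle: Callable[[Any], Any],
) -> Any:
    """Variant 1: process the downloaded dataframe while the session is open."""
    with create_session() as session:
        session.run(code)
        frame = session.download(dataframe_name)
        # The dataframe is handled before the with block closes the session.
        return handle(frame)


def download_without_with(
    create_session: Callable[[], Any],
    code: str,
    dataframe_name: str,
) -> Any:
    """Variant 2: no context manager; wait for and close the session by hand."""
    session = create_session()
    session.wait()  # block until the session is ready to accept statements
    try:
        session.run(code)
        return session.download(dataframe_name)
    finally:
        session.close()  # always release the remote session
```

Both variants issue the same sequence of calls (wait, run, download, close), which is consistent with the error appearing at the download step in either case.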

makrushin-evgenii commented 12 months ago

A few more details. The calculation takes from 5 to 15 minutes, and the result in CSV format is about a gigabyte, often less. I don't need LivySession's ability to transfer a session between threads/instances; I use it because of its convenient interface: it is easy to run code without uploading its source to HDFS, and easy to get a result without downloading it from HDFS.