Open makrushin-evgenii opened 12 months ago
A few more details. The calculation takes 5 to 15 minutes, and the result in CSV format is about a gigabyte, often less. I don't need LivySession's
ability to transfer a session between threads/instances, but I use it because of its convenient interface: it is easy to run code without uploading its source to HDFS, and easy to get a result without downloading it from HDFS.
I create a session and download the resulting dataframe. The code being run is not important:
It works fine in the staging environment on small amounts of data. On large amounts of data in the production environment, it fails with this error:
Meanwhile, the YARN application finished with a successful status:
The Livy logs look like this:
It seems the session is deleted before I can download the result. Why might this happen, and how can I fix it?
I also tried handling the downloaded dataframe inside the `with` scope, and not using `with` at all. It doesn't change anything: I get the same error at the moment of the `download` call.
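
To make the two variants above concrete, here is a minimal sketch of their shape. It is not the actual code from my job: `open_session` is a hypothetical callable that returns a ready session object, and the only assumption about that object is that it has `run` and `download` methods and can be used as a context manager.

```python
from typing import Any, Callable


def download_inside_with(open_session: Callable[[], Any], code: str, name: str) -> Any:
    """Variant 1: run the code and download the result inside the `with` scope."""
    with open_session() as session:
        session.run(code)
        # The session is still open at this point; this is the call that fails.
        return session.download(name)


def download_without_with(open_session: Callable[[], Any], code: str, name: str) -> Any:
    """Variant 2: no `with` at all, so this code never closes the session itself."""
    session = open_session()
    session.run(code)
    # Same call, same error, even though nothing here closes the session.
    return session.download(name)
```

In both variants the session is never closed by the client before `download` is called, which is why I suspect it is being deleted on the server side.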