jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

livy session timeout #465

Closed RoelantStegmann closed 4 years ago

RoelantStegmann commented 6 years ago

Hi,

After a cell has been running for 1 hour, my Livy session ends with a 400 error.

An error was encountered:
Invalid status code '400' from https://the-new-solvinity.azurehdinsight.net/livy/sessions/3/statements/17 with error payload: "requirement failed: Session isn't active"

After some digging I discovered that the cause is the following config: livy.server.session.timeout=3600000 (or, less likely, livy.server.session.state-retain.sec=3600000). I can change it in the Ambari UI of my HDInsight cluster, but of course I would like to configure it with sparkmagic. Is that possible? This is what I tried:

%%configure -f
{"livy.server.session.timeout-check": "false", "livy.server.session.timeout": 7200000}

A side question: how can I see the values of Livy and Spark parameters while running my kernel? For example, intuitively I expected one of these to work. It would be nice to check whether config.json settings are being applied, or to look up defaults.

%%info executorCores
%%configure -i executorCores

tripuranenis commented 5 years ago

I also have a similar issue - I am trying to get all the rows from executing a sql query, as opposed to just the 2500 rows.

%%sql -o bookratingslist -n 1000000
select a.userid, a.isbn, a.rating, b.booktitle, c.location, c.age
from bookratings.bx_book_ratings_csv a
join bookratings.bx_books_csv b on upper(rtrim(a.isbn)) = upper(rtrim(b.isbn))
join bookratings.bx_users_csv c on c.userid = a.userid

An error was encountered: Invalid status code '400' from http://:8998/sessions/4/statements/1 with error payload: "requirement failed: Session isn't active."

Any thoughts on resolving this issue?

julioasotodv commented 5 years ago

Same issue here. It would be nice to be able to configure Livy timeouts from sparkmagic's %%configure command.

shaikmanu797 commented 4 years ago

Sparkmagic is a client that talks to Livy over its REST API using the requests library, and it only lets you configure properties that are part of the POST /sessions request payload.

https://livy.apache.org/docs/latest/rest-api.html

Sparkmagic creates the session by sending an HTTP POST request to the /sessions endpoint.

https://github.com/jupyter-incubator/sparkmagic/blob/9b5e30cab7fd7f98efac692b233c71a6c0b9f8a3/sparkmagic/sparkmagic/livyclientlib/livysession.py#L137

https://github.com/jupyter-incubator/sparkmagic/blob/9b5e30cab7fd7f98efac692b233c71a6c0b9f8a3/sparkmagic/sparkmagic/livyclientlib/livyreliablehttpclient.py#L35-L36

Unfortunately, Livy's server-level properties cannot be overridden from a client request. Also, a change in server configuration requires a restart of the Livy server in most cases.
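To illustrate the distinction, here is a minimal sketch (not sparkmagic's actual code) that separates the keys from a %%configure body into those that are top-level fields of the POST /sessions payload and those that are not. The field list reflects my reading of the Livy REST API docs linked above, `split_session_properties` is a hypothetical helper, and the `executorCores` value of 4 is only illustrative.

```python
# Top-level fields of the POST /sessions request body, as I understand
# the Livy REST API docs. Treat this list as an assumption and check it
# against the docs for your Livy version.
SESSION_FIELDS = {
    "kind", "proxyUser", "jars", "pyFiles", "files", "driverMemory",
    "driverCores", "executorMemory", "executorCores", "numExecutors",
    "archives", "queue", "name", "conf", "heartbeatTimeoutInSecond",
}


def split_session_properties(properties):
    """Hypothetical helper: split a dict into session-level fields and the rest."""
    accepted = {k: v for k, v in properties.items() if k in SESSION_FIELDS}
    not_session_level = {
        k: v for k, v in properties.items() if k not in SESSION_FIELDS
    }
    return accepted, not_session_level


# The two livy.server.* keys come from the %%configure attempt in the
# original question; executorCores is an illustrative session-level field.
requested = {
    "executorCores": 4,
    "livy.server.session.timeout-check": "false",
    "livy.server.session.timeout": 7200000,
}
accepted, not_session_level = split_session_properties(requested)
print(accepted)
print(sorted(not_session_level))
```

The livy.server.* keys end up in the second group: they are settings of the Livy server process, not of an individual session, so putting them in a session request does not change the timeout.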

You need to modify your livy.conf file in order to fix this: https://github.com/apache/incubator-livy/blob/v0.7.0-incubating/conf/livy.conf.template#L51-L62

See also: https://stackoverflow.com/questions/54220381/how-to-set-livy-server-session-timeout-on-emr-cluster-boostrap
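As a rough sketch of what the server-side change amounts to, the snippet below formats the two settings from the original question as `key = value` lines, the style used in livy.conf.template. `render_livy_conf` is a hypothetical helper, not part of Livy or sparkmagic. The values are the ones from the %%configure attempt above; I am assuming a plain integer is read as milliseconds, which matches the 3600000 value and the one-hour cutoff reported in the question.

```python
def render_livy_conf(settings):
    """Hypothetical helper: format settings as `key = value` lines for livy.conf."""
    lines = []
    for key, value in settings.items():
        # Write booleans in lowercase, as the template file does.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key} = {value}")
    return "\n".join(lines)


# Values taken from the %%configure attempt in the original question.
server_settings = {
    "livy.server.session.timeout-check": False,
    "livy.server.session.timeout": 7200000,
}
print(render_livy_conf(server_settings))
```

The resulting lines belong in livy.conf on the machine running the Livy server (or in the equivalent Ambari setting), followed by a Livy restart. Whether to raise the timeout or turn the check off entirely depends on your cluster.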

MatKurianski commented 4 years ago

Same problem here, anyone figured out how to solve this?

PedroRossi commented 4 years ago

Hi everyone. Up to Livy 0.6.0 there was a bug (https://issues.apache.org/jira/browse/LIVY-547) where the session timeout would kill a session even while it was active. This was only fixed in version 0.7.0 (which was released recently), so you need to upgrade to 0.7.0 or apply the patch from that issue to your Livy build in order to avoid the problem.

@RoelantStegmann @shaikmanu797 regarding retrieving Livy configuration through sparkmagic: Livy would first have to expose it through its REST API. I would recommend searching Livy's Jira board to see whether there is anything on this topic; if they eventually add it, we can add it to sparkmagic.

Going to close this issue for now since it is a problem with Livy, but if you have any problems or extra comments, feel free to add them here. \o/

sanderklijsen commented 2 years ago

Same problem, any solutions?