Open Consultante-yr opened 1 year ago
How do you perform the HTTP request? Do you use curl? If you use curl, do you use the --negotiate option?
I use the commands as mentioned in the README.
Create a session:
curl -k -u : --negotiate -X POST https://livy_server:8998/sessions \
  -d '{"kind": "pyspark"}' -H 'Content-Type: application/json'
Get the session state:
curl -k -u : --negotiate -X GET https://livy_server:8998/sessions
Do you have the proxyuser for livy in kms-site.xml?
hadoop.kms.proxyuser.livy.groups: '*'
hadoop.kms.proxyuser.livy.hosts: '*'
I added these two lines and now the state is dead. This is the full message:
{"from":0,"total":2,"sessions":[{"id":0,"name":null,"appId":null,"owner":"livy","proxyUser":"livy","state":"dead","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["\tat java.security.AccessController.doPrivileged(Native Method)","\tat javax.security.auth.Subject.doAs(Subject.java:422)","\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)","\tat org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:172)","\tat org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:217)","\tat org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)","\tat org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)","2023-03-06 19:04:30,194 INFO util.ShutdownHookManager: Shutdown hook called","2023-03-06 19:04:30,194 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7dd347a1-18b9-419a-b15b-ec33d0f463ae","\nYARN Diagnostics: "]},{"id":1,"name":null,"appId":null,"owner":"spark","proxyUser":"spark","state":"error","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["\tat java.security.AccessController.doPrivileged(Native Method)","\tat javax.security.auth.Subject.doAs(Subject.java:422)","\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)","\tat org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:172)","\tat org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:217)","\tat org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)","\tat org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)","2023-03-06 19:08:26,769 INFO util.ShutdownHookManager: Shutdown hook called","2023-03-06 19:08:26,770 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3dd23288-2bb3-47c9-8b28-ddf7918d10c2","\nYARN Diagnostics: "]}]}
Does the pyspark command work without Livy?
Yes, pyspark works fine, but I don't know if it can be related to this error: https://github.com/TOSIT-IO/tdp-getting-started/issues/255
According to the log, the owner is livy and the proxyUser is livy. Which user do you run the kinit as? How did you apply the proxyUser change in kms-site? If possible, please post all the commands step by step, including the kinit command. I don't think the error is related to #255. Did you pull the latest commit of tdp-collection-extras?
I added this in kms-site.xml,
and I tried the following on the Livy server:
sudo su - tdp_user
kinit -ki
curl -k -u : --negotiate -X POST https://livy_server:8998/sessions \
  -d '{"kind": "pyspark"}' -H 'Content-Type: application/json'
curl -k -u : --negotiate -X GET https://livy_server:8998/sessions
curl -k -u : --negotiate -X POST https://edge-01.tdp:8998/sessions/0/statements \
  -d '{"code": "1 + 1"}' -H 'Content-Type: application/json'
curl -k -u : --negotiate -X GET https://livy-server:8998/sessions/0/statements/0
I also tried with the livy user and with other users, but the result is the same.
I suppose that ranger-kms was restarted after the kms-site update. The log does not give much information; all I see is the permission error.
Could you run this command on a "dead" session initiated by tdp_user?
curl --negotiate -u : "https://livy_server:8998/sessions/session_id/log" | python -m json.tool
Please attach the kms-site.xml, livy.conf and livy-client.conf
I tried your command and saw an access problem for the /user directory in HDFS, so I added this to hdfs-site.xml,
and when I retested I got this message:
"Exception in thread \"main\" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig. It appears that the timeline client failed to initiate because an incompatible dependency in classpath. If timeline service is optional to this client, try to work around by setting yarn.timeline-service.enabled to false in client configuration.",
I tried adding spark.hadoop.yarn.timeline-service.enabled false to spark-defaults.conf, but that changed nothing.
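For what it's worth, the client-side override suggested by the error message would normally be written like this in spark-defaults.conf (a sketch; the spark.hadoop. prefix is Spark's standard way to pass an arbitrary Hadoop/YARN property down to the client configuration):

# spark-defaults.conf
# Disable the YARN timeline client for Spark submissions, as the error message suggests
spark.hadoop.yarn.timeline-service.enabled   false

Whether this takes effect also depends on which spark-defaults.conf the Livy server actually reads and on restarting the session after the change.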
I attached the files here: livy.zip
You must configure the correct Ranger policies for the user to have access to HDFS.
By the way, livy.conf is not updated with the latest modification, which helps to interact with Hive. Please update your git repo and redeploy.
I had already given my user all the rights in the HDFS policies, but it was not enough; I had to add dfs.permissions false to hdfs-site.xml.
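For reference, that change would usually look like the following in hdfs-site.xml (a sketch; note that dfs.permissions is the legacy key, recent Hadoop releases use dfs.permissions.enabled, and either way setting it to false disables HDFS permission checking cluster-wide):

<property>
  <!-- legacy name; current releases use dfs.permissions.enabled -->
  <name>dfs.permissions</name>
  <value>false</value>
</property>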
I tried to install the latest livy role from the project, but nothing changed; I still get this error: "Exception in thread \"main\" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig. It appears that the timeline client failed to initiate because an incompatible dependency in classpath. If timeline service is optional to this client, try to work around by setting yarn.timeline-service.enabled to false in client configuration."
As you know, I can't update the whole project right now, until you release the first version, so I tried to fix this error without modifying our current stable version too much. Do you have an idea about this error, please?
What did you do to get this error: "Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig. It appears that the timeline client failed to initiate because an incompatible dependency in classpath. If timeline service is optional to this client, try to work around by setting yarn.timeline-service.enabled to false in client configuration."?
Did you restart the services after changing the configuration?
Could you redo all the steps with the tdp_user?
It will be easier for me to follow if you post the command and the result/error at the same time.
It's always the same commands:
sudo su - tdp_user
kinit -ki
Create a session:
curl -k -u : --negotiate -X POST https://livy_server:8998/sessions -d '{"kind": "pyspark"}' -H 'Content-Type: application/json'
Get the session status (wait until it is "idle"):
curl -k -u : --negotiate -X GET https://livy_server:8998/sessions
Submit a snippet of code to the session:
curl -k -u : --negotiate -X POST https://edge-01.tdp:8998/sessions/0/statements -d '{"code": "1 + 1"}' -H 'Content-Type: application/json'
Get the statement result:
curl -k -u : --negotiate -X GET https://livy-server:8998/sessions/0/statements/0
I also tried with the livy user and with other users, but the result is the same,
and after each modification I restarted the services.
Hello Team,
On the Livy server, after creating a session, I tried to get the session status, but I get state = error and the message is:
Authentication failed, URL: https://152101lp38.csh-dijon.cnamts.fr:9393/kms/v1/?op=GETDELEGATIONTOKEN&doAs=tdp_user&renewer=rm%2F152101lp38.csh-dijon.cnamts.fr%40CNAMTS.DEV&user.name=livy, status: 403, message: Forbidden","\tat org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:401)","\tat org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:74)","\tat org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:147)","\tat org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:213)","\t... 49 more","2023-03-01 14:43:00,239 INFO util.ShutdownHookManager: Shutdown hook called","2023-03-01 14:43:00,240 INFO util.ShutdownHookManager:
The error mentions the Kerberos ticket; however, I do have a valid Kerberos ticket. Did I miss something, please?
Thank you in advance