TOSIT-IO / tdp-getting-started

Vagrant / Ansible environment to deploy a local TDP cluster
Apache License 2.0

Error livy #254

Open Consultante-yr opened 1 year ago

Consultante-yr commented 1 year ago

Hello Team,

On the livy_server host, after creating a session, I tried to get the session status, but I get state = error and the message is:

Authentication failed, URL: https://152101lp38.csh-dijon.cnamts.fr:9393/kms/v1/?op=GETDELEGATIONTOKEN&doAs=tdp_user&renewer=rm%2F152101lp38.csh-dijon.cnamts.fr%40CNAMTS.DEV&user.name=livy, status: 403, message: Forbidden","\tat org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:401)","\tat org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:74)","\tat org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:147)","\tat org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:213)","\t... 49 more","2023-03-01 14:43:00,239 INFO util.ShutdownHookManager: Shutdown hook called","2023-03-01 14:43:00,240 INFO util.ShutdownHookManager:

The error mentions the Kerberos ticket; however, I do have a valid Kerberos ticket. Did I miss something?

Thank you in advance

rpignolet commented 1 year ago

How do you perform the HTTP request? Do you use curl? If so, do you use the --negotiate option?

Consultante-yr commented 1 year ago

I use the commands from the README.

Create a session:

curl -k -u : --negotiate -X POST https://livy_server:8998/sessions -d '{"kind": "pyspark"}' -H 'Content-Type: application/json'

Get the session state:

curl -k -u : --negotiate -X GET https://livy_server:8998/sessions

nschung commented 1 year ago

Do you have the proxyuser for livy in kms-site.xml?

hadoop.kms.proxyuser.livy.groups: '*'

hadoop.kms.proxyuser.livy.hosts: '*'
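
For reference, kms-site.xml uses the usual Hadoop configuration layout, where each setting is a property element holding a name and a value. Below is a minimal Python sketch that renders the two settings above in that layout. The helper name render_properties is hypothetical, and where the file lives (or how it is templated) in a given deployment is not covered here.

```python
import xml.etree.ElementTree as ET


def render_properties(settings):
    """Render a dict of settings as Hadoop-style <property> elements."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# The two proxyuser settings quoted in the comment above.
kms_proxyuser = {
    "hadoop.kms.proxyuser.livy.groups": "*",
    "hadoop.kms.proxyuser.livy.hosts": "*",
}

xml_text = render_properties(kms_proxyuser)
print(xml_text)
```

The printed property elements are what would sit inside the configuration element of kms-site.xml; the service has to be restarted for the change to take effect.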

Consultante-yr commented 1 year ago

I added these two lines and now the session state is dead. This is the full message:

{"from":0,"total":2,"sessions":[{"id":0,"name":null,"appId":null,"owner":"livy","proxyUser":"livy","state":"dead","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["\tat java.security.AccessController.doPrivileged(Native Method)","\tat javax.security.auth.Subject.doAs(Subject.java:422)","\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)","\tat org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:172)","\tat org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:217)","\tat org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)","\tat org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)","2023-03-06 19:04:30,194 INFO util.ShutdownHookManager: Shutdown hook called","2023-03-06 19:04:30,194 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7dd347a1-18b9-419a-b15b-ec33d0f463ae","\nYARN Diagnostics: "]},{"id":1,"name":null,"appId":null,"owner":"spark","proxyUser":"spark","state":"error","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["\tat java.security.AccessController.doPrivileged(Native Method)","\tat javax.security.auth.Subject.doAs(Subject.java:422)","\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)","\tat org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:172)","\tat org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:217)","\tat org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)","\tat org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)","2023-03-06 19:08:26,769 INFO util.ShutdownHookManager: Shutdown hook called","2023-03-06 19:08:26,770 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3dd23288-2bb3-47c9-8b28-ddf7918d10c2","\nYARN Diagnostics: "]}]}
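
A response like this is easier to read once it is reduced to one line per session. Below is a minimal Python sketch that uses a trimmed copy of the response above (each log array is shortened to one entry, and some fields are omitted). The function name summarize_sessions is hypothetical.

```python
import json

# Trimmed copy of the response above; each "log" array is shortened for brevity.
response_text = """
{"from": 0, "total": 2, "sessions": [
  {"id": 0, "owner": "livy", "proxyUser": "livy", "state": "dead",
   "kind": "pyspark", "log": ["\\nYARN Diagnostics: "]},
  {"id": 1, "owner": "spark", "proxyUser": "spark", "state": "error",
   "kind": "pyspark", "log": ["\\nYARN Diagnostics: "]}
]}
"""


def summarize_sessions(payload):
    """Return one (id, owner, proxyUser, state) tuple per session."""
    return [
        (s["id"], s.get("owner"), s.get("proxyUser"), s.get("state"))
        for s in payload.get("sessions", [])
    ]


summary = summarize_sessions(json.loads(response_text))
for session_id, owner, proxy_user, state in summary:
    print(f"session {session_id}: owner={owner} proxyUser={proxy_user} state={state}")
```

In this response both sessions are owned by a service account (livy and spark) rather than by tdp_user.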

nschung commented 1 year ago

Does the pyspark command work without Livy?

Consultante-yr commented 1 year ago

Yes, pyspark works fine. But I don't know whether it could be related to this error: https://github.com/TOSIT-IO/tdp-getting-started/issues/255

nschung commented 1 year ago

According to the log, the owner is livy and the proxyUser is livy. Which user do you run kinit as? How did you apply the proxyuser change in kms-site.xml? If possible, please post all the commands step by step, including the kinit command. I don't think the error is related to #255. Did you pull the latest commit of tdp-collection-extras?

Consultante-yr commented 1 year ago

I added this to kms-site.xml:

hadoop.kms.proxyuser.livy.groups *

hadoop.kms.proxyuser.livy.hosts *

hadoop.kms.proxyuser.livy.users *

and I tried the following on the Livy server:

sudo su - tdp_user
kinit -ki

Create a session

curl -k -u : --negotiate -X POST https://livy_server:8998/sessions -d '{"kind": "pyspark"}' -H 'Content-Type: application/json'

Get the session status (wait until it is "idle")

curl -k -u : --negotiate -X GET https://livy_server:8998/sessions

Submit a snippet of code to the session

curl -k -u : --negotiate -X POST https://edge-01.tdp:8998/sessions/0/statements -d '{"code": "1 + 1"}' -H 'Content-Type: application/json'

Get the statement result

curl -k -u : --negotiate -X GET https://livy-server:8998/sessions/0/statements/0

I also tried with the livy user and with other users, but the result is the same.
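
The "wait until it is idle" step can be scripted instead of repeating the GET by hand. Below is a minimal Python sketch of the polling logic only. get_state is a hypothetical callable standing in for whatever performs the authenticated GET on the session and returns its state field. The set of failure states is an assumption: "dead" and "error" are the states seen in this thread, and "killed" is another terminal state in Livy's documented session states.

```python
import time

# Assumed terminal failure states; see the note above.
FAILED_STATES = {"dead", "error", "killed"}


def wait_until_idle(get_state, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll get_state() until it returns "idle".

    Raises RuntimeError if the session reaches a failed state, and
    TimeoutError if it is still not idle after the given number of polls.
    """
    state = None
    for _ in range(attempts):
        state = get_state()
        if state == "idle":
            return state
        if state in FAILED_STATES:
            raise RuntimeError(f"session ended in state {state!r}")
        sleep(delay)
    raise TimeoutError(f"session still in state {state!r} after {attempts} polls")


# Example with canned states in place of real HTTP calls.
states = iter(["starting", "starting", "idle"])
result = wait_until_idle(lambda: next(states), sleep=lambda _: None)
print(result)
```

Failing fast on "dead" or "error" avoids submitting a statement to a session that will never accept it.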

nschung commented 1 year ago

I assume that ranger-kms was restarted after the kms-site update. The log does not give much information; all I see is the permission error.

Could you run this command on a "dead" session that was initiated by tdp_user?

curl --negotiate -u : "https://livy_server:8998/sessions/session_id/log" | python -m json.tool
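
The log returned by that endpoint is mostly stack frames, so it can help to keep only the lines that carry a message. Below is a minimal Python sketch, assuming the response is a JSON object with a "log" array of strings, as in the session listing earlier in this thread. The sample lines are copied from that listing, and the helper name message_lines is hypothetical.

```python
import json


def message_lines(log):
    """Drop stack-frame lines (those starting with a tab) and blank entries."""
    return [
        line.strip()
        for line in log
        if line.strip() and not line.startswith("\t")
    ]


# Sample entries copied from the session listing earlier in this thread.
sample = json.loads(r'''
{"log": [
  "\tat javax.security.auth.Subject.doAs(Subject.java:422)",
  "\tat org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)",
  "2023-03-06 19:04:30,194 INFO util.ShutdownHookManager: Shutdown hook called",
  "\nYARN Diagnostics: "
]}
''')

messages = message_lines(sample["log"])
for line in messages:
    print(line)
```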

Please attach the kms-site.xml, livy.conf and livy-client.conf

Consultante-yr commented 1 year ago

I tried your command and saw a problem accessing the /user directory in HDFS, so I added this to hdfs-site.xml:

dfs.permissions false

and when I retested, I got this message:

"Exception in thread \"main\" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig. It appears that the timeline client failed to initiate because an incompatible dependency in classpath. If timeline service is optional to this client, try to work around by setting yarn.timeline-service.enabled to false in client configuration.",

I tried adding spark.hadoop.yarn.timeline-service.enabled false to spark-defaults.conf, but that changed nothing.
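
Another place to try the same setting is the session request itself: the body of Livy's POST /sessions accepts a conf map of Spark properties alongside kind. Below is a minimal Python sketch that only builds the request body and sends nothing. Whether this setting resolves the NoClassDefFoundError is not established in this thread, and the helper name session_body is hypothetical.

```python
import json


def session_body(kind, spark_conf=None):
    """Build the JSON body for a Livy session request."""
    body = {"kind": kind}
    if spark_conf:
        # Spark properties are sent as strings.
        body["conf"] = {key: str(value) for key, value in spark_conf.items()}
    return json.dumps(body)


body = session_body(
    "pyspark",
    {"spark.hadoop.yarn.timeline-service.enabled": "false"},
)
print(body)
```

The printed string can be passed to curl's -d option in place of '{"kind": "pyspark"}'.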

Consultante-yr commented 1 year ago

I attached the files here: livy.zip

nschung commented 1 year ago

You must configure the correct ranger policies for the user to have access to hdfs.

By the way, your livy.conf does not include the latest modification, which helps Livy interact with Hive. Please update your git repo and redeploy.

Consultante-yr commented 1 year ago

I had already given my user all the rights in the HDFS policies, but that was not enough; I had to add dfs.permissions false to hdfs-site.xml.

I tried to install the latest livy role from the project, but nothing changed. I still get this error: "Exception in thread \"main\" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig. It appears that the timeline client failed to initiate because an incompatible dependency in classpath. If timeline service is optional to this client, try to work around by setting yarn.timeline-service.enabled to false in client configuration.",

As you know, I can't update the whole project right now, not until you release the first version of the project, so I am trying to fix this error without modifying our current stable version too much. Do you have any idea about this error?

nschung commented 1 year ago

What did you do to get this error: "Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig. It appears that the timeline client failed to initiate because an incompatible dependency in classpath. If timeline service is optional to this client, try to work around by setting yarn.timeline-service.enabled to false in client configuration."?

Did you restart the services after changing the configuration?

Could you redo all the steps as tdp_user?

It will be easier for me to follow if you post the command and the result/error at the same time.

Consultante-yr commented 1 year ago

It's always the same commands:

sudo su - tdp_user
kinit -ki

Create a session:

curl -k -u : --negotiate -X POST https://livy_server:8998/sessions -d '{"kind": "pyspark"}' -H 'Content-Type: application/json'

Get the session status (wait until it is "idle"):

curl -k -u : --negotiate -X GET https://livy_server:8998/sessions

Submit a snippet of code to the session:

curl -k -u : --negotiate -X POST https://edge-01.tdp:8998/sessions/0/statements -d '{"code": "1 + 1"}' -H 'Content-Type: application/json'

Get the statement result:

curl -k -u : --negotiate -X GET https://livy-server:8998/sessions/0/statements/0

I also tried with the livy user and with other users, but the result is the same.

After each modification, I restart the services.