jupyterhub / yarnspawner

Spawn JupyterHub single user notebook servers in Hadoop/YARN containers.
https://jupyterhub-yarnspawner.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Issues after hdfs delegation token #16

Open f4nha opened 4 years ago

f4nha commented 4 years ago

Hi there, I wonder if anyone can help me with the issue below. I've been stuck on it for a while.

The spawner obtains the HDFS delegation token but fails right after that with the following error, and I cannot figure out where it comes from: "skein.exceptions.DriverError: Failed to submit application, exception: java.lang.reflect.UndeclaredThrowableException"

Trace

[I 2020-01-28 17:55:54.384 JupyterHub log:174] 302 GET /hub/spawn -> /hub/spawn-pending/uk45002324 (MYUSER@MYLOCALIP) 1013.13ms
[I 2020-01-28 17:55:54.435 JupyterHub pages:303] MYUSER is pending spawn
[I 2020-01-28 17:55:54.438 JupyterHub log:174] 200 GET /hub/spawn-pending/MYUSER (MYUSER@MYLOCALIP) 29.46ms
20/01/28 17:55:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/01/28 17:55:54 INFO security.UserGroupInformation: Login successful for user hdfs-development05@example.com using keytab file /etc/security/keytabs/hdfs.headless.keytab
20/01/28 17:55:55 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
20/01/28 17:55:55 INFO client.AHSProxy: Connecting to Application History server at d05tapmana.example.com/1.1.1.1:10200
20/01/28 17:55:55 INFO skein.Driver: Driver started, listening on 45461
20/01/28 17:55:56 INFO client.AHSProxy: Connecting to Application History server at d05tapmana.example.com/1.1.1.1:10200
20/01/28 17:55:56 INFO client.RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
20/01/28 17:55:56 INFO client.RequestHedgingRMFailoverProxyProvider: Found active RM [rm1]
20/01/28 17:55:56 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 247 for MYUSER on ha-hdfs:pphdp
[E 2020-01-28 17:55:56.439 JupyterHub spawner:216] Failed to submit application for user MYUSER. Original exception:
    Traceback (most recent call last):
      File "/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/yarnspawner/spawner.py", line 212, in start
        self.app_id = app_id = await loop.run_in_executor(None, client.submit, spec)
      File "/opt/anaconda3/envs/jhub/lib/python3.8/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/skein/core.py", line 509, in submit
        resp = self._call('submit', spec.to_protobuf())
      File "/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/skein/core.py", line 289, in _call
        raise self._server_error(exc.details())
    skein.exceptions.DriverError: Failed to submit application, exception:
    java.lang.reflect.UndeclaredThrowableException

[E 2020-01-28 17:55:56.440 JupyterHub user:624] Unhandled error starting MYUSER's server: Failed to submit application, exception:
    java.lang.reflect.UndeclaredThrowableException
[D 2020-01-28 17:55:56.440 JupyterHub user:724] Stopping MYUSER
[D 2020-01-28 17:55:56.453 JupyterHub user:752] Deleting oauth client jupyterhub-user-MYUSER
[D 2020-01-28 17:55:56.463 JupyterHub user:755] Finished stopping MYUSER

My jupyter_config.py


c.JupyterHub.bind_url = 'http://:8000'
c.ConfigurableHTTPProxy.api_url = 'http://0.0.0.0:8001'
c.JupyterHub.cookie_secret_file = '/opt/jupyterhub/cookie_secret'
#c.JupyterHub.db_url = 'postgresql://dbuser:dbpass@db:5432/jupyterhub'
c.JupyterHub.db_url = 'postgresql+psycopg2://dbuser:dbuser@db:5432/jhub'
c.JupyterHub.hub_ip = 'serverip'

#YARN spawner
c.JupyterHub.spawner_class = 'yarnspawner.YarnSpawner'

c.YarnSpawner.principal = 'hdfs-dev05@example.com'
c.YarnSpawner.keytab = '/etc/security/keytabs/hdfs.headless.keytab'

#The YARN queue to use
c.YarnSpawner.queue = 'default'
#add starter as jupyterlab
c.YarnSpawner.default_url = '/lab'

c.YarnSpawner.cmd = '/opt/anaconda3/bin/python3 -m yarnspawner.singleuser'

f4nha commented 4 years ago

@jcrist can you help with this one? I managed to confirm that skein works fine when I submit jobs to YARN manually, but when the job is submitted from yarnspawner it throws these errors:

[E 2020-01-31 11:45:15.747 JupyterHub spawner:216] Bad message (TypeError('not all arguments converted during string formatting')): {'name': 'JupyterHub', 'msg': 'Failed to submit application for user %s. Original exception:', 'args': (None, None, None, None, None, None, 'myuser'), 'levelname': 'ERROR', 'levelno': 40, 'pathname': '/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/yarnspawner/spawner.py', 'filename': 'spawner.py', 'module': 'spawner', 'exc_info': (<class 'skein.exceptions.DriverError'>, DriverError('Failed to submit application, exception:\njava.lang.reflect.UndeclaredThrowableException'), <traceback object at 0x7f62eb87d9c0>), 'exc_text': None, 'stack_info': None, 'lineno': 216, 'funcName': 'start', 'created': 1580471115.7478108, 'msecs': 747.8108406066895, 'relativeCreated': 22086.684942245483, 'thread': 140063029651264, 'threadName': 'MainThread', 'processName': 'MainProcess', 'process': 28742}
    Traceback (most recent call last):
      File "/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/yarnspawner/spawner.py", line 212, in start
        self.app_id = app_id = await loop.run_in_executor(None, client.submit, spec)
      File "/opt/anaconda3/envs/jhub/lib/python3.8/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/skein/core.py", line 511, in submit
        resp = self._call('submit', spec.to_protobuf())
      File "/opt/anaconda3/envs/jhub/lib/python3.8/site-packages/skein/core.py", line 291, in _call
        raise self._server_error(exc.details())
    skein.exceptions.DriverError: Failed to submit application, exception:
    java.lang.reflect.UndeclaredThrowableException
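
A side note on the "Bad message (TypeError('not all arguments converted during string formatting'))" wrapper in the log line above: it looks like a log-formatting failure separate from the DriverError itself. The record's 'msg' has a single %s placeholder while 'args' holds seven values, and %-style formatting rejects that mismatch. The sketch below reproduces only that step, using the 'msg' and 'args' values copied from the record; the render helper is hypothetical and merely imitates what a log formatter does. It says nothing about why the submission fails.

```python
# Values copied from the 'msg' and 'args' fields of the log record above.
msg = "Failed to submit application for user %s. Original exception:"
args = (None, None, None, None, None, None, "myuser")


def render(msg, args):
    """Hypothetical helper: apply %-style formatting as logging does for a record.

    Returns the formatted message, or a 'Bad message (...)' string when the
    number of arguments does not match the placeholders in msg.
    """
    try:
        return msg % args
    except TypeError as exc:
        return f"Bad message ({exc!r})"


# One placeholder, seven arguments: formatting fails.
print(render(msg, args))

# One placeholder, one argument: formatting succeeds.
print(render(msg, ("myuser",)))
```

If this reading is right, the extra None values in 'args' are worth tracking down in their own right, but the DriverError further down in the traceback is still the actual failure.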

jcrist commented 4 years ago

Apologies for the delayed response here. I'm not sure what's going on here, and that exception isn't terribly useful. You say things were working fine when using skein manually? When you tried that, were you also specifying the principal and keytab and running as a proxy-user as you're doing here? This would look something like:

import skein

spec = skein.ApplicationSpec.from_yaml("""
name: test
queue: default
user: MYUSER
master:
  script: |
    echo 'Hello World'
""")

with skein.Client(
    principal="hdfs-dev05@example.com",
    keytab="/etc/security/keytabs/hdfs.headless.keytab",
    security=skein.Security.new_credentials(),
    log_level="DEBUG",
) as client:
    app_id = client.submit(spec)
    print(app_id)

I'd try running that - if it succeeds then there's something weird going on with yarnspawner. If it fails, the debug log-level should hopefully provide us with more information to work with.