jupyter / help


Jupyterhub issue #378

Closed Lowerit closed 6 years ago

Lowerit commented 6 years ago

Jupyterhub Troubleshooting

I’ve been troubleshooting this for roughly 20+ hours in total: searching for answers, trying different things, reading the proxy logs, and so on. I started a new job and began updating the Python packages, only to be greeted by the JupyterHub server going down. I didn’t set any of it up, so I’ve just been trying to reverse engineer the whole setup and process. Any help would be greatly appreciated!

My problem:

I was updating all the python packages in an azure ubuntu server ... When jupyterhub updated from 0.7.~ to 0.9.0, the server stopped working.

So, when I ran the update, I ran it from my home/'username'/file, and jupyterhub happened to be located in the /etc/~ folder on the server.

When the update ran, it created the cookie secret and sqlite files in my home/’username’/~ folder.

When I now try to run the jupyterhub command, I get this:

1.) File "/usr/local/lib/python3.5/dist-packages/tornado/netutil.py", line 163, in bind_sockets
    sock.bind(sockaddr)
OSError: [Errno 98] Address already in use

So here is exactly when the first error occurred, and the errors that were thrown:

1.) [I 2018-06-21 13:08:29.404 JupyterHub log:100] 200 GET /jupyterhub/hub/api/authorizations/cookie/jupyter-hub-token-‘username’[secret]

2.) [E 2018-06-21 13:09:32.410 ‘username’ web:1590] Uncaught exception GET /jupyterhub/user/‘username’/tree/crawler/analysis/location (my_computers_ip)

3.) The errors produced:
jinja2.exceptions.TemplateSyntaxError: Encountered unknown tag 'trans'. Jinja was looking for the following tags: 'elif' or 'else' or 'endif'. The innermost block that needs to be closed is 'if'.
jinja2.exceptions.TemplateNotFound: 500.html

Fast forward to 2 hours ago: the [config proxy] 200 GET api/routes requests were running fine, and now they are not. I assume it is because I tried to edit the jupyterhub_config.py file to see if it would make a difference. Here is the main error thrown from this.

1.) jinja2.exceptions.TemplateNotFound: error.html, and now it's sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: proxies

So after this took place, the [proxy config] 200 GET stopped its fetchall requests.

My hypothesis is that the port used to relaunch the server is stuck, since the update occurred while the hub was still running and it continued to “run” after all was said and done. I’m sure I will be asked for the config file, and I don’t really have one, as I am not the person who set JupyterHub up to begin with, and the person who did is gone.

I am very happy to provide any sort of information possible, and take absolutely any advice! Thanks!

willingc commented 6 years ago

@Lowerit Check to see that you do not have a stale proxy still running?

What is the output of ps -ax | grep configurable?

Lowerit commented 6 years ago

2657 ? Ssl 2:03 node /usr/local/bin/configurable-http-proxy --ip --port 8000 --api-ip 127.0.0.1 --api-port 8001 --default-target http://127.0.0.1:8081 --error-target http://127.0.0.1:8081/jupyterhub/hub/error
38806 pts/12 S+ 0:00 grep --color=auto configurable

It seems like it?

Lowerit commented 6 years ago

Not really sure where to go from that :/

consideRatio commented 6 years ago

@Lowerit I think you should shut down the proxy now.

The ps command @willingc recommended, combined with | grep configurable-http-proxy, found a running process that was using the port that caused the OSError: [Errno 98] Address already in use.

So, shut down that process. By inspecting the ps -ax output I learned that the first number is the process ID. Now, ask that process to shut down with kill -SIGTERM 2657, or forcefully with the -SIGKILL flag instead.
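
For anyone who would rather script this step, here is a minimal Python sketch of the same polite-or-forceful shutdown request. The helper name `request_shutdown` is made up for illustration, and the demo only signals a throwaway child process that it starts itself, never a real PID such as 2657.

```python
import os
import signal
import subprocess
import sys


def request_shutdown(pid, force=False):
    """Send SIGTERM (polite) or SIGKILL (forceful) to the process with this PID."""
    os.kill(pid, signal.SIGKILL if force else signal.SIGTERM)


# Demo on a throwaway child process, so nothing real gets killed.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
request_shutdown(child.pid)        # comparable to: kill -SIGTERM <pid>
child.wait(timeout=10)             # reap the child so it does not linger as a zombie
exit_signal = -child.returncode    # Popen reports death-by-signal as a negative code
```

Trying SIGTERM first gives the process a chance to clean up; `force=True` is the fallback if it ignores the polite request.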

Lowerit commented 6 years ago

Ok awesome thanks I'll try it.

Lowerit commented 6 years ago

I'm still getting that error afterward OSError: [Errno 98] Address already in use

davidbath commented 6 years ago

Looks like you need to track down the process that is listening on that port.

Another way to debug this is to use netstat --listen and see if you can identify any jupyter and configurable-http-proxy instances that are still up and listening on ports. Manually kill them off as suggested above and, when you are sure nothing is listening on those ports, try starting up jupyterhub again.

Lowerit commented 6 years ago

Hi David, my primary concern is the other ports and users on the server. After killing that process it still won't load, and I have a hunch it's the :8081 port that is the issue, and I'm not sure how that will affect other users on the Ubuntu server.

davidbath commented 6 years ago

Ok - so what does netstat report? Which process is using port 8081?

Lowerit commented 6 years ago

No direct 8081 processes, but multiple like this:
75597 ? S 0:00 links http://127.0.0.1:8081/hub/
58263 ? S 0:00 links http://127.0.0.1:8081/hub/admin

... lots of those

davidbath commented 6 years ago

That doesn’t look like the output of netstat to me.

What does

sudo netstat -nlp | grep :8081

give you?

Lowerit commented 6 years ago

tcp 0 0 127.0.0.1:8081 0.0.0.0:* LISTEN 2280/python3

davidbath commented 6 years ago

Ok. So process 2280, which happens to be python, is listening on 8081 already.
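
To make the reading of that netstat line explicit, here is a small Python sketch that pulls the port, PID and program name out of it. The function name `parse_listen_line` is hypothetical, and it assumes the column layout shown above (local address in the fourth column, PID/program in the last one), so it would need adjusting for program names that contain spaces.

```python
def parse_listen_line(line):
    """Split one `netstat -nlp` TCP line into (port, pid, program)."""
    fields = line.split()
    local_address = fields[3]      # e.g. 127.0.0.1:8081
    pid_program = fields[-1]       # e.g. 2280/python3
    port = int(local_address.rsplit(":", 1)[1])
    pid_text, _, program = pid_program.partition("/")
    return port, int(pid_text), program


sample = "tcp 0 0 127.0.0.1:8081 0.0.0.0:* LISTEN 2280/python3"
port, pid, program = parse_listen_line(sample)
print(port, pid, program)   # prints: 8081 2280 python3
```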

If you don’t believe process 2280 to be an old copy of jupyterhub or something related to your work, use the ps command to figure out what process 2280 is, and if it is safe to kill or not.
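
As an alternative to ps, the full command line of a PID can also be read from /proc on Linux. The sketch below assumes a Linux host where /proc is readable for that process; `command_line` is a made-up helper name, and the demo inspects a short-lived child process rather than PID 2280.

```python
import subprocess
import sys
from pathlib import Path


def command_line(pid):
    """Return the argument list of a running process, read from /proc (Linux only)."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    # Arguments are separated (and terminated) by NUL bytes.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


# Demo: start a harmless child, look up how it was launched, then clean it up.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
try:
    args = command_line(child.pid)
finally:
    child.kill()
    child.wait()
print(args)
```

Seeing the full argument list (for example a jupyterhub launch with its flags) is usually enough to decide whether a process is a stale hub or something unrelated.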

Lowerit commented 6 years ago

2280 ? S 3:45 /usr/bin/python3 /usr/local/bin/jupyterhub --debug --JupyterHub.spawner_class=sudospawner.SudoSpawner

davidbath commented 6 years ago

Super! You are getting there!

So you have a copy of jupyterhub already started on port 8081, so when you try to start up a second one it fails to bind to the socket, as it’s already in use.
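
The failure mode can be reproduced in a few lines of Python. This is only an illustration of the general socket behaviour, not JupyterHub's own code: one socket listens on a port, and a second attempt to bind the same address fails with the "Address already in use" error until the first socket is closed. The port is picked by the operating system, so nothing real is disturbed.

```python
import errno
import socket


def try_bind(host, port):
    """Try to bind host:port; return None on success or the errno on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


# The "first hub": listen on a free port chosen by the OS.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(1)
port = first.getsockname()[1]

while_running = try_bind("127.0.0.1", port)   # the "second hub" fails to bind
first.close()
after_close = try_bind("127.0.0.1", port)     # succeeds once the first is gone
```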

I’m guessing it’s reasonable to kill that off, unless someone else is running another jupyterhub on the same server as you!

Lowerit commented 6 years ago

That's one of my only concerns. I believe some others may be using jupyter notebooks.
54696 ? S 0:01 /usr/bin/python3 /usr/local/bin/jupyter-notebook
57389 ? T 0:00 sqlite3 jupyterhub.sqlite
So I'm not sure if killing that will affect them.

davidbath commented 6 years ago

Ok - I’m a little confused :) if jupyterhub is up and running already and users are using it, spawning new notebooks etc then .... what’s broken?

Lowerit commented 6 years ago

Well, I guess it can only be used locally?

Lowerit commented 6 years ago

It all spawned from updating the jupyterhub, the errors are shown in the initial post up there.

Lowerit commented 6 years ago

So I guess maybe the proxy forwarding is messed up?

davidbath commented 6 years ago

What happens if you go to the hub address? Does it work?

Lowerit commented 6 years ago

Just from using links http://127.0.0.1:8000/hub/, it shows up blank.

davidbath commented 6 years ago

Cool. Well, if it’s broken now I would say your balance of risk to benefit means you should try killing it and starting it up again.

davidbath commented 6 years ago

Did you manage to determine if your configurable-http-proxy was still running? Same diagnostics as above, but a process that contains the name configurable-http-proxy?

Lowerit commented 6 years ago

So I killed the configurable-http-proxy. I will kill the other ones now, since it's 5PM and no one will be upset if I disrupt anything

davidbath commented 6 years ago

You can probably leave the notebooks running, as their routes should be re-added after jupyterhub restarts. I think jupyterhub and configurable-http-proxy have got in a tangle during the accidental upgrade!

Lowerit commented 6 years ago

Ok so I just did some fact finding to make sure no one was using it, and I'm guessing that the 4 or 5 processes still running are all stale. I'm going to just kill all of them and try restarting.

Lowerit commented 6 years ago

Okay still getting the errors after killing all the processes.

[E 2018-06-27 17:09:28.078 JupyterHub app:1842] Failed to bind hub to http://127.0.0.1:8081/hub/

OSError: [Errno 98] Address already in use

Lowerit commented 6 years ago

So it seems like the configurable http proxy restarts after I try to launch jupyter hub, but that error still persists.

davidbath commented 6 years ago

Ok, so can you run the netstat command again and find out what process is using port 8081? It seems that you still have something else that is already using that port (or your process killing didn’t kill off everything).

And yes, in the default mode jupyterhub takes care of starting up configurable-http-proxy for you.

Lowerit commented 6 years ago

There is nothing on 8081 :/

Lowerit commented 6 years ago

However, quite a few 8000 and 8001

Lowerit commented 6 years ago

yeah lots of local host 8000 and 8001

Lowerit commented 6 years ago

tcp 0 0 localhost:35362 localhost:8001 TIME_WAIT

lots of stuff like that

Lowerit commented 6 years ago

Ok, I'm noticing something... every time I kill the configurable-http-proxy it restarts itself.

davidbath commented 6 years ago

Sure that’s not an old error then? I can’t quite see how the port can already be in use if you cannot see any process using it!

Lowerit commented 6 years ago

I'm not sure if you read my last comment about the http config proxy automatically restarting when i kill it

davidbath commented 6 years ago

This is jupyterhub keeping it managed, I think. If you kill jupyterhub and the configurable-http-proxy it should stay killed.

Lowerit commented 6 years ago

They keep restarting.

23373 ? S 0:00 /usr/bin/python3 /usr/local/bin/jupyterhub --debug --JupyterHub.spawner_class=sudospawner.SudoSpawner
23383 ? Ssl 0:00 node /usr/local/bin/configurable-http-proxy --ip 127.0.0.1 --port 8001 --api-ip 127.0.0.1 --api-port 8081 --error-target http://our-website-to-proxy-to:8000/jupyterhub/hub/error

Lowerit commented 6 years ago

So when it starts, it is using:

[I 2018-06-27 17:41:15.693 JupyterHub app:1656] Using Authenticator: jupyterhub.auth.PAMAuthenticator-0.9.0
[I 2018-06-27 17:41:15.694 JupyterHub app:1656] Using Spawner: jupyterhub.spawner.LocalProcessSpawner-0.9.0

Should it not be using that authenticator?

davidbath commented 6 years ago

Ok - then I’m not quite sure why that is. I guess you must have something environment specific like a service manager that is attempting to maintain those services. Which one that is probably depends on which distro you are using and/or how the machine was set up initially.

However - as a quick check, when they have both auto-restarted, does it work?

Lowerit commented 6 years ago

So when I try to go to the website where you can usually log in, it's giving me an HTTP 500 error, unable to handle request

Lowerit commented 6 years ago

and the log file is telling me there isn't a template.

Lowerit commented 6 years ago

Yeah, it's saying no jinja template

Lowerit commented 6 years ago

jinja2.exceptions.TemplateNotFound: login.html

davidbath commented 6 years ago

Ok - so I think you are now into a config file problem. With two possible issues - (1) that a different config file is being loaded or (2) that you have some settings in your config file that are not compatible with the 0.9 version.

I’d try to track down which jupyterhub_config.py is being used, and see if you have multiple copies of it on your system. Given you didn’t set it up, you may be able to work out which is the ‘right’ one by the date, or if you can work out which service manager is spawning jupyterhub, this may specify which config file is being used.
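
A rough Python sketch of that search, listing every copy of the config file under a directory together with its modification date, newest first. The function name `find_config_copies` is made up, and the demo runs against a scratch directory with invented subfolders; on a real server you would point it at wherever you suspect copies might live.

```python
import os
import tempfile
import time


def find_config_copies(root, filename="jupyterhub_config.py"):
    """Return (path, mtime) for every copy of filename under root, newest first."""
    hits = []
    # os.walk skips directories it is not allowed to read instead of failing.
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            path = os.path.join(dirpath, filename)
            hits.append((path, os.path.getmtime(path)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Demo on a scratch tree with two invented locations.
with tempfile.TemporaryDirectory() as scratch:
    for sub in ("etc/jupyterhub", "home/someone"):
        os.makedirs(os.path.join(scratch, sub))
        with open(os.path.join(scratch, sub, "jupyterhub_config.py"), "w"):
            pass
    copies = find_config_copies(scratch)
    for path, mtime in copies:
        print(time.ctime(mtime), os.path.relpath(path, scratch))
```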

Lowerit commented 6 years ago

There is only 1 config file on the server, in /etc/.

Lowerit commented 6 years ago

Here is the error, it starts with tornado:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tornado/web.py", line 1543, in _execute
    result = yield result
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/handlers/login.py", line 72, in get
    self.finish(self._render(username=username))
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/handlers/login.py", line 41, in _render
    {'next': self.get_argument('next', '')},
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/handlers/base.py", line 777, in render_template
    template = self.get_template(name)
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/handlers/base.py", line 771, in get_template
    return self.settings['jinja2_env'].get_template(name)
  File "/usr/local/lib/python3.5/dist-packages/jinja2/environment.py", line 830, in get_template
    return self._load_template(name, self.make_globals(globals))
  File "/usr/local/lib/python3.5/dist-packages/jinja2/environment.py", line 804, in _load_template
    template = self.loader.load(self, name, globals)
  File "/usr/local/lib/python3.5/dist-packages/jinja2/loaders.py", line 408, in load
    raise TemplateNotFound(name)
jinja2.exceptions.TemplateNotFound: login.html

davidbath commented 6 years ago

Does your config file attempt to set any values for custom templates?

Lowerit commented 6 years ago

# Paths to search for jinja templates.

c.JupyterHub.template_paths = []
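
One way to narrow down a TemplateNotFound error is to check by hand whether the file exists in any of the directories that get searched. Here is a minimal Python sketch of that check, with a made-up helper `locate_template` and invented directory names. The real candidates would be whatever is listed in c.JupyterHub.template_paths plus, as far as I know, the templates directory that ships with the JupyterHub install, whose location depends on how it was installed.

```python
import os
import tempfile


def locate_template(name, search_paths):
    """Return the first directory in search_paths that contains name, or None."""
    for directory in search_paths:
        if os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


# Demo with two invented directories; only the second one holds login.html.
with tempfile.TemporaryDirectory() as scratch:
    custom = os.path.join(scratch, "custom_templates")                   # hypothetical
    bundled = os.path.join(scratch, "share", "jupyterhub", "templates")  # hypothetical
    os.makedirs(custom)
    os.makedirs(bundled)
    with open(os.path.join(bundled, "login.html"), "w") as handle:
        handle.write("<html></html>")
    found_in = locate_template("login.html", [custom, bundled])
    missing = locate_template("error.html", [custom, bundled])
```

If the lookup comes back empty for every directory, the templates were probably not installed where the upgraded hub expects them, which would point at the install rather than the config file.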