plasmabio / plasma

Plasma is an e-learning Jupyter-based platform for data analysis
https://docs.plasmabio.org
BSD 3-Clause "New" or "Revised" License

Consolidate the logs section in the documentation #200

Closed jtpio closed 2 years ago

jtpio commented 2 years ago

Fixes #194

This PR consolidates the docs about looking at logs:

image

pierrepo commented 2 years ago

For some reason, Docker container names are in the form jupyter-username- (with the minus sign at the end):

$ docker ps
CONTAINER ID   IMAGE                 COMMAND                  CREATED          STATUS          PORTS                       NAMES
9c6bb5efd392   m1meg_ghm_gwas:main   "/usr/local/bin/repo…"   29 minutes ago   Up 29 minutes   127.0.0.1:54320->8888/tcp   jupyter-stu-megm1-130-
74d5ad15b961   m1meg_ghm_gwas:main   "/usr/local/bin/repo…"   37 minutes ago   Up 37 minutes   127.0.0.1:54318->8888/tcp   jupyter-stu-megm1-148-
55e2514e05a9   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54315->8888/tcp   jupyter-stu-pir-113-
5e796fe57e5a   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54314->8888/tcp   jupyter-stu-pir-105-
4c3f012d1a30   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54313->8888/tcp   jupyter-stu-pir-108-
e1b982bb907b   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54311->8888/tcp   jupyter-stu-pir-112-
a8b48d7b542d   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54309->8888/tcp   jupyter-stu-pir-106-
f9a26b20bd1c   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54307->8888/tcp   jupyter-stu-pir-114-
ffdd40d13ab0   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54306->8888/tcp   jupyter-stu-pir-110-
58d780c2cdb6   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54305->8888/tcp   jupyter-stu-pir-115-
d098c14a2973   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54304->8888/tcp   jupyter-stu-pir-101-
e76ff89df670   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54303->8888/tcp   jupyter-stu-pir-104-
6507abb62f61   ueg27:master          "/usr/local/bin/repo…"   2 hours ago      Up 2 hours      127.0.0.1:54301->8888/tcp   jupyter-stu-pir-107-
6ea4b5a123a1   ueg27:master          "/usr/local/bin/repo…"   3 hours ago      Up 3 hours      127.0.0.1:54298->8888/tcp   jupyter-abridiernahmias-

Is this something expected?

pierrepo commented 2 years ago

Is it possible to look at the logs of a user container post-mortem (i.e. when the container is stopped)? It looks like docker ps -a does not show stopped user containers (only those which run repo2docker). For instance, on a server used 3 days ago by 30 students, I have:

$ docker ps -a
CONTAINER ID   IMAGE                                 COMMAND                  CREATED         STATUS                      PORTS                       NAMES
5e0a449d57d0   l3meh-gh-tp-2022:HEAD                 "/usr/local/bin/repo…"   4 minutes ago   Up 4 minutes                127.0.0.1:49230->8888/tcp   jupyter-stu-megl3-100-
51d83b87d35d   quay.io/jupyterhub/repo2docker:main   "/usr/local/bin/entr…"   5 days ago      Exited (0) 5 days ago                                   affectionate_swirles
e8a3133db2cf   quay.io/jupyterhub/repo2docker:main   "/usr/local/bin/entr…"   6 days ago      Exited (0) 6 days ago                                   silly_shannon
adc31e7af2f0   quay.io/jupyterhub/repo2docker:main   "/usr/local/bin/entr…"   4 weeks ago     Exited (0) 4 weeks ago                                  hopeful_nash
0f015a800b80   quay.io/jupyterhub/repo2docker:main   "/usr/local/bin/entr…"   2 months ago    Exited (0) 2 months ago                                 optimistic_stonebraker
a6d67ac5cfd4   quay.io/jupyterhub/repo2docker:main   "/usr/local/bin/entr…"   2 months ago    Exited (0) 2 months ago                                 laughing_wu
b2a8c6c073dd   quay.io/jupyterhub/repo2docker:main   "/usr/local/bin/entr…"   3 months ago    Exited (0) 3 months ago                                 crazy_wilbur
e353d4280beb   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (1) 5 months ago                                 hungry_burnell
811868c09c4f   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (0) 5 months ago                                 laughing_ramanujan
117fb6d982ff   af2c8103e73a                          "/usr/local/bin/entr…"   5 months ago    Exited (1) 5 months ago                                 practical_blackwell
bba674b90557   af2c8103e73a                          "/usr/local/bin/entr…"   5 months ago    Exited (1) 5 months ago                                 stoic_chebyshev
a5c77854c416   c077ca79884e                          "/usr/local/bin/entr…"   5 months ago    Exited (1) 5 months ago                                 sharp_fermi
2c32bfa378a7   8a62dee4fc9e                          "/bin/sh -c 'conda e…"   5 months ago    Dead                                                    eloquent_bhaskara
c49e949da6f4   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (137) 5 months ago                               distracted_matsumoto
877f7387281b   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (1) 5 months ago                                 modest_turing
6dbc787bd11b   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (1) 5 months ago                                 brave_brattain
d6824e27ccf8   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (1) 5 months ago                                 unruffled_ardinghelli
4ff6af4ae3c8   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (1) 5 months ago                                 gracious_brattain
8b05ca4b34a4   jupyter/repo2docker:master            "jupyter-repo2docker…"   5 months ago    Exited (0) 5 months ago                                 confident_newton
06f10d5703ec   jupyter/repo2docker:master            "jupyter-repo2docker…"   16 months ago   Exited (0) 16 months ago                                magical_hermann
548587921bdc   jupyter/repo2docker:master            "jupyter-repo2docker…"   16 months ago   Exited (0) 16 months ago                                admiring_clarke
2d6476ecaf25   jupyter/repo2docker:master            "jupyter-repo2docker…"   18 months ago   Exited (0) 18 months ago                                intelligent_faraday

We currently see one active user container, jupyter-stu-megl3-100-. All the other (stopped) containers are the ones used to build environment images.

jtpio commented 2 years ago

For some reason, Docker container names are in the form jupyter-username- (with the minus sign at the end):

Yes, this looks related to the name specified here:

https://github.com/plasmabio/plasma/blob/d0b13b9450aa3e060691af801a3c2fe0b3c4cb00/tljh-plasma/tljh_plasma/__init__.py#L116

This was to accommodate the use of named servers, in case a user would run several servers at the same time. The simplest right now might be to just leave it as it is, to allow for named servers if some other deployment wants to enable them.

It looks like docker ps -a does not show stopped user containers (only those which run repo2docker).

I think this is because the remove option is explicitly set to True here:

https://github.com/plasmabio/plasma/blob/d0b13b9450aa3e060691af801a3c2fe0b3c4cb00/tljh-plasma/tljh_plasma/__init__.py#L120

To avoid having lots of containers around when the servers are stopped.

pierrepo commented 2 years ago

Yes, this looks related to the name specified here:

plasma/tljh-plasma/tljh_plasma/__init__.py

This was to accommodate the use of named servers, in case a user would run several servers at the same time. The simplest right now might be to just leave it as it is, to allow for named servers if some other deployment wants to enable them.

This is very reasonable. Do you think we could have a default value for servername (for instance server) to avoid having this lonely - at the end of the container name?
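To make the trailing dash concrete, here is a minimal sketch of how such a name could be formatted. The template string is an assumption chosen to reproduce the names seen in the docker ps output above; it is not copied from tljh_plasma/__init__.py, and the default_servername parameter is hypothetical.

```python
# Assumed template, chosen to reproduce the names seen in `docker ps`;
# it is not copied from tljh_plasma/__init__.py.
NAME_TEMPLATE = "{prefix}-{username}-{servername}"


def container_name(username, servername="", prefix="jupyter", default_servername=""):
    """Format a container name, optionally substituting a default server name."""
    # Hypothetical fallback: use `default_servername` when no named server is set.
    effective = servername or default_servername
    return NAME_TEMPLATE.format(prefix=prefix, username=username, servername=effective)


# Default (unnamed) server: the empty servername leaves a trailing dash.
print(container_name("stu-pir-113"))  # jupyter-stu-pir-113-

# With the suggested fallback value, the name ends in "-server" instead.
print(container_name("stu-pir-113", default_servername="server"))  # jupyter-stu-pir-113-server
```

An explicitly named server would still take precedence over the fallback, so deployments that enable named servers would not be affected by such a default.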

pierrepo commented 2 years ago

I think this is because the remove option is explicitly set to True here:

plasma/tljh-plasma/tljh_plasma/__init__.py

To avoid having lots of containers around when the servers are stopped.

Thanks for the explanation. This makes sense. We should add a note in the documentation stating that user containers are automatically removed after they stop.

But then how could we inspect user containers that stopped for an unknown reason? Are we able to tell the spawner to remove only user containers that stopped with exit code 0?

jtpio commented 2 years ago

But then how could we inspect user containers that stopped for an unknown reason? Are we able to tell the spawner to remove only user containers that stopped with exit code 0?

After quickly browsing the code, I'm not sure dockerspawner is able to make that distinction:

https://github.com/jupyterhub/dockerspawner/blob/main/dockerspawner/dockerspawner.py

Maybe there should instead be some kind of log retention in place to keep all the logs for a certain amount of time.
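For illustration only, the exit-code distinction asked about above could be sketched outside of the spawner as follows. This is not a dockerspawner feature; the function name is hypothetical, and the container argument is assumed to behave like a container object from the Docker SDK for Python (reload(), an attrs dict with the inspect data, and remove()).

```python
def remove_if_clean_exit(container):
    """Remove `container` only when it exited with status code 0.

    `container` is expected to behave like a Docker SDK for Python container:
    `reload()`, an `attrs` dict holding the inspect data, and `remove()`.
    Returns True when the container was removed.
    """
    container.reload()  # refresh the cached inspect data
    state = container.attrs["State"]
    if state["Status"] == "exited" and state["ExitCode"] == 0:
        container.remove()
        return True
    # Crashed (or still running) containers are kept so `docker logs` still works.
    return False
```

Such a check could only help if the automatic removal were turned off, since otherwise the container is already gone by the time anything could inspect it.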

pierrepo commented 2 years ago

Maybe there should instead be some kind of log retention in place to keep all the logs for a certain amount of time.

Yes. We usually know very quickly when a server has crashed, so a retention of a couple of days should be sufficient.
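As a rough sketch of what a retention of a couple of days could mean in practice, assuming the user container logs were saved as *.log files in a directory on the host (which is not the case today; the directory layout and the function name are hypothetical):

```python
import time
from pathlib import Path


def prune_old_logs(log_dir, max_age_days=2, now=None):
    """Delete *.log files in `log_dir` older than `max_age_days`.

    Returns the names of the deleted files. `now` can be passed explicitly
    (as a Unix timestamp) to make the function easy to test.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for path in sorted(Path(log_dir).glob("*.log")):
        # Compare the last modification time against the retention cutoff.
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

A function like this could be run periodically (for example from a cron job or a systemd timer) so that only recent logs are kept.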

jtpio commented 2 years ago

PR preview available at: https://plasmabio--200.org.readthedocs.build/en/200/troubleshooting/index.html#looking-at-logs

Yes. We usually know very quickly when a server has crashed, so a retention of a couple of days should be sufficient.

Let's track that in a different issue? There are different ways of doing this; something simple would be to mount a volume in the user containers so that the logs are available on the host.

pierrepo commented 2 years ago

Let's track that in a different issue?

Yes, sure.

PR preview available at: https://plasmabio--200.org.readthedocs.build/en/200/troubleshooting/index.html#looking-at-logs

Oops, something weird with the markdown:

image

jtpio commented 2 years ago

Good catch, fixed in https://github.com/plasmabio/plasma/pull/200/commits/24f8389fcf083b067d5cbdc5ebed9a4a94235452

pierrepo commented 2 years ago

This is now OK. I'll let you merge.

jtpio commented 2 years ago

Thanks.

Let's track that in a different issue? There are different ways of doing this; something simple would be to mount a volume in the user containers so that the logs are available on the host.

I opened https://github.com/plasmabio/plasma/issues/204 to keep track of this.