nebari-dev / nebari

🪴 Nebari - your open source data science platform
https://nebari.dev
BSD 3-Clause "New" or "Revised" License

[BUG] - jhub-apps entering loop of failed requests #2861

Open viniciusdc opened 6 days ago

viniciusdc commented 6 days ago

Describe the bug

@marcelovilla found an issue with the latest release (2024.11.1). When users visit /hub, they get stuck in a loop of failed requests:

https://github.com/user-attachments/assets/7ab1c099-4598-44ba-aa40-fca059fdf196

Examining the browser’s console logs reveals that all requests to services/japps/user and services/japps/spawner-profiles are returning 401 Unauthenticated errors.

After some quick debugging, the error seems to originate from the current Docker images. Relabeling and pushing new Docker images from the previous release should resolve the issue for that version. It’s possible that the build-and-push action targeted the wrong branch during the build process.

We observed the same behavior on both GCP and AWS deployments using the 2024.11.1 images. I was also able to reproduce the issue on a local deployment from main, which means that the problem isn’t limited to this release and might be related to the update of jhub-apps to 2024.10.1.

Required actions:

Expected behavior

Users should be able to access the /hub pages successfully when using jhub-apps.

OS and architecture in which you are running Nebari

Linux

How to Reproduce the problem?

Deploy Nebari locally using 2024.11.1 or main. Since this is likely a problem with the images, using the current image in any recent Nebari release might also trigger the problem.

Command output

No response

Versions and dependencies used.

No response

Compute environment

None

Integrations

No response

Anything else?

After checking the git history for the Docker images and the Nebari source code, the only significant change was upgrading jupyterlab from version 4.2.5 to 4.2.6.

To see the differences between the image versions, I used diffoci with this command:

diffoci diff quay.io/nebari/nebari-jupyterhub:2024.9.1 quay.io/nebari/nebari-jupyterhub:2024.11.1 --platform=linux/amd64 > report.diff

The same diff reported the same version of jhub-apps in both images, so I am not entirely sure how the deployment ended up in such a state.

File  opt/conda/lib/python3.9/site-packages/jhub_apps-2024.8.1.dist-info/REQUESTED  2024-09-27 19:08:57 -0300 -03  2024-11-21 12:40:24 -0300 -03

Also, the actual build logs for both images corroborate the above:

2024.9.1

25 81.91 jhsingle-native-proxy     0.8.2                    pypi_0    pypi
+ 25 81.91 jhub-apps                 2024.8.1                 pypi_0    pypi
25 81.91 jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
25 81.91 json5                     0.9.25                   pypi_0    pypi
25 81.91 jsonpointer               3.0.0            py39hf3d152e_1    conda-forge
25 81.91 jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
25 81.91 jsonschema-specifications 2023.12.1          pyhd8ed1ab_0    conda-forge
25 81.91 jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
+ 25 81.91 jupyter                   1.1.1                    pypi_0    pypi
25 81.91 jupyter-client            8.6.3                    pypi_0    pypi
25 81.91 jupyter-console           6.6.3                    pypi_0    pypi
25 81.91 jupyter-core              5.7.2                    pypi_0    pypi
25 81.91 jupyter-lsp               2.2.5                    pypi_0    pypi
25 81.91 jupyter-server            2.14.2                   pypi_0    pypi
25 81.91 jupyter-server-terminals  0.5.3                    pypi_0    pypi
25 81.91 jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
+ 25 81.91 jupyterhub                5.1.0              pyh31011fe_0    conda-forge
25 81.91 jupyterhub-base           5.1.0              pyh31011fe_0    conda-forge
25 81.91 jupyterhub-idle-culler    1.2.1              pyhd8ed1ab_0    conda-forge
25 81.91 jupyterhub-kubespawner    4.2.0              pyhd8ed1ab_0    conda-forge
+ 25 81.91 jupyterlab                4.2.5                    pypi_0    pypi
25 81.91 jupyterlab-pygments       0.3.0                    pypi_0    pypi
25 81.91 jupyterlab-server         2.27.3                   pypi_0    pypi
25 81.91 jupyterlab-widgets        3.0.13                   pypi_0    pypi

2024.11.1

19 80.77 jhsingle-native-proxy     0.8.2                    pypi_0    pypi
+ 19 80.77 jhub-apps                 2024.8.1                 pypi_0    pypi
19 80.77 jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
19 80.77 json5                     0.9.28                   pypi_0    pypi
19 80.77 jsonpointer               3.0.0            py39hf3d152e_1    conda-forge
19 80.77 jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
19 80.77 jsonschema-specifications 2024.10.1          pyhd8ed1ab_0    conda-forge
19 80.77 jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
+ 19 80.77 jupyter                   1.1.1                    pypi_0    pypi
19 80.77 jupyter-client            8.6.3                    pypi_0    pypi
19 80.77 jupyter-console           6.6.3                    pypi_0    pypi
19 80.77 jupyter-core              5.7.2                    pypi_0    pypi
19 80.77 jupyter-lsp               2.2.5                    pypi_0    pypi
19 80.77 jupyter-server            2.14.2                   pypi_0    pypi
19 80.77 jupyter-server-terminals  0.5.3                    pypi_0    pypi
19 80.77 jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
+ 19 80.77 jupyterhub                5.1.0              pyh31011fe_0    conda-forge
19 80.77 jupyterhub-base           5.1.0              pyh31011fe_0    conda-forge
19 80.77 jupyterhub-idle-culler    1.2.1              pyhd8ed1ab_0    conda-forge
19 80.77 jupyterhub-kubespawner    4.2.0              pyhd8ed1ab_0    conda-forge
+ 19 80.77 jupyterlab                4.2.6                    pypi_0    pypi
19 80.77 jupyterlab-pygments       0.3.0                    pypi_0    pypi
19 80.77 jupyterlab-server         2.27.3                   pypi_0    pypi
19 80.77 jupyterlab-widgets        3.0.13                   pypi_0    pypi
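
Comparing the two package listings by eye is error-prone, so here is a minimal sketch of how they could be diffed programmatically. It assumes each row follows the layout seen in the logs above (optional "+ " marker, build-step number, timestamp, then name, version, build, channel). The names `parse_packages` and `diff_packages` are hypothetical helpers and not part of any Nebari tooling.

```python
import re

# One package row as it appears in the build logs above:
# optional "+ " marker, build-step number, timestamp, name, version, build, channel.
ROW = re.compile(
    r"^(?:\+\s+)?\d+\s+[\d.]+\s+(?P<name>\S+)\s+(?P<version>\S+)\s+\S+\s+\S+\s*$"
)


def parse_packages(log_text):
    """Return {package_name: version} for every row that looks like a package line."""
    packages = {}
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            packages[match["name"]] = match["version"]
    return packages


def diff_packages(left, right):
    """Return {name: (left_version, right_version)} for packages whose versions differ."""
    changed = {}
    for name in sorted(set(left) | set(right)):
        if left.get(name) != right.get(name):
            changed[name] = (left.get(name), right.get(name))
    return changed


# A few rows copied from the two logs above (build steps 25 and 19).
log_step_25 = """
+ 25 81.91 jhub-apps                 2024.8.1                 pypi_0    pypi
25 81.91 json5                     0.9.25                   pypi_0    pypi
+ 25 81.91 jupyterlab                4.2.5                    pypi_0    pypi
"""
log_step_19 = """
+ 19 80.77 jhub-apps                 2024.8.1                 pypi_0    pypi
19 80.77 json5                     0.9.28                   pypi_0    pypi
+ 19 80.77 jupyterlab                4.2.6                    pypi_0    pypi
"""

changes = diff_packages(parse_packages(log_step_25), parse_packages(log_step_19))
for name, (left_version, right_version) in changes.items():
    print(f"{name}: {left_version} -> {right_version}")
```

On these sample rows jhub-apps does not show up in the diff, which matches the observation that both images ship jhub-apps 2024.8.1.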

marcelovilla commented 4 days ago

@aktech do you have any idea about what might be going on here?

aktech commented 4 days ago

This is due to the latest release of PyJWT being incompatible with older versions, which broke authentication and caused the infinite login loop. See: https://github.com/nebari-dev/jhub-apps/pull/510#issuecomment-2482826642

It was fixed last week here: https://github.com/nebari-dev/jhub-apps/pull/532

I'll create a new release of jhub-apps now. Tackling this in Nebari will also require a new release of the Docker images.

viniciusdc commented 4 days ago

Thanks @marcelovilla @aktech!

aktech commented 3 days ago

The new release is out now (had to iron out some UI bugs): https://pypi.org/project/jhub-apps/2024.11.1/

PR for the Docker image update: https://github.com/nebari-dev/nebari-docker-images/pull/189

viniciusdc commented 3 days ago

The above PR is merged, and the fix can be used in any deployment by pointing the image tags to main. I verified that the above error was fixed, but I didn't review all the changes in jhub-apps while doing so. Overall, the new release resolves the issue on main.

As for the previous release, I think the best course of action is to re-tag the 2024.9.1 Docker images as 2024.11.1 as well. If we rebuild now and cherry-pick the fix, the hotfix will include the jhub-apps bump from 2024.8.1 to 2024.11.1, which in my opinion is not good.

viniciusdc commented 3 days ago

I just moved the nebari-jupyterhub:2024.11.1 tag from sha256:9a4d16eb8acbebaa320d09935b2380d56ae295f7965a92924981f66aae83a4ee to sha256:ad840d1d69bfceb8424f22fc1978d554d00daed0ffef3a73e9b927367da6d5dc (the original 2024.9.1 image). This should completely address the issue for 2024.11.1 deployments.
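
For anyone who needs to repeat this kind of tag move, here is a minimal sketch of one way to express it. The comment above does not say which tool was used, so the use of `skopeo copy` is an assumption, and `retag_command` is a hypothetical helper. The snippet only builds the argument list and does not contact the registry.

```python
import re

# A sha256 image digest is the prefix plus 64 lowercase hex characters.
DIGEST = re.compile(r"^sha256:[0-9a-f]{64}$")


def retag_command(repository, digest, new_tag):
    """Build a `skopeo copy` argument list that points `new_tag` at `digest`."""
    if not DIGEST.match(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    source = f"docker://{repository}@{digest}"
    destination = f"docker://{repository}:{new_tag}"
    # --all copies every architecture if the digest refers to a manifest list.
    return ["skopeo", "copy", "--all", source, destination]


# Repository, digest, and tag are taken from the comment above.
cmd = retag_command(
    "quay.io/nebari/nebari-jupyterhub",
    "sha256:ad840d1d69bfceb8424f22fc1978d554d00daed0ffef3a73e9b927367da6d5dc",
    "2024.11.1",
)
print(" ".join(cmd))
```

Referencing the source by digest rather than by the 2024.9.1 tag makes it explicit which image content the moved tag ends up pointing at.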