jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

Renaming a directory in a singleuser pod without dynamic storage can lead to data loss. #3501

Closed kkapper closed 2 months ago

kkapper commented 2 months ago

Bug description

Suppose you are mounting additional volumes to singleuser pods and not generating a PV on a per-user basis.

    storage:
      type: none
      extraVolumeMounts:
      - mountPath: /home/jovyan/private
        name: jupyterhub-private
        readOnly: false

(This is a common use case if you are using some kind of elastic file system or buckets to back your volume claims.)

A user can then attempt to rename one of these directories in the UI.

The rename seems to behave like an mv command (the traceback below shows shutil.move being called): it effectively empties the directory and moves its contents to whatever new folder name was specified.

Even if the UI claims the rename has failed, the source directory is still empty, and the contents have been moved to the new directory.

Because only the mounted path is backed by the external storage provisioner, the moved data no longer sits on persistent storage, so everything that was in the source directory would be lost.
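To make the sequence concrete, here is a minimal sketch of a rename that falls back to copy-then-delete, which is what the traceback in the logs below suggests happens (os.rename fails, then the tree is copied and rmtree is called on the source). This is a simplified model, not shutil's actual code: `move_with_fallback`, `busy`, and `reproduce_partial_move` are hypothetical names, and the EBUSY failure of a mount point is simulated by injecting callables that raise.

```python
import errno
import os
import shutil
import tempfile


def move_with_fallback(src, dst, rename=os.rename, remove_root=os.rmdir):
    """Simplified model of a rename that falls back to copy-then-delete.

    `rename` and `remove_root` are injectable only so that a busy mount
    point can be simulated without a real mount (an assumption of this sketch).
    """
    try:
        rename(src, dst)
        return
    except OSError:
        pass  # e.g. EBUSY when src is a mount point

    # Fallback: copy the whole tree to the new name ...
    shutil.copytree(src, dst, symlinks=True)

    # ... then delete everything inside src ...
    with os.scandir(src) as it:
        entries = list(it)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.unlink(entry.path)

    # ... and finally src itself. For a mount point this step fails,
    # but by now the contents are already gone from src.
    remove_root(src)


def busy(*args):
    """Stand-in for an operation on a mount point failing with EBUSY."""
    raise OSError(errno.EBUSY, "Device or resource busy")


def reproduce_partial_move():
    base = tempfile.mkdtemp()
    src = os.path.join(base, "private")
    dst = os.path.join(base, "private-copy")
    os.mkdir(src)
    with open(os.path.join(src, "notes.txt"), "w") as f:
        f.write("data")
    try:
        move_with_fallback(src, dst, rename=busy, remove_root=busy)
        failed = False
    except OSError:
        failed = True
    result = (failed, os.listdir(src), os.listdir(dst))
    shutil.rmtree(base)
    return result
```

Calling `reproduce_partial_move()` reports that the operation raised, that the source directory is empty, and that the file now lives under the new name, which matches the state described above.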

How to reproduce

  1. Mount any additional drive to a singleuser pod without creating a PV. Something like this:

    jupyterhub:
      singleuser:
        storage:
          type: none
          extraVolumeMounts:
          - mountPath: /home/jovyan/private
            name: jupyterhub-private
            existingClaim: private-test-volume
            readOnly: false

  2. Boot a singleuser pod and add some files to this directory.

  3. Rename the directory in the UI.

image

  4. Receive similar error:

image

  5. Notice files have been moved to the new directory

image

  6. Old directory (The only one with persistence) is now empty.

image

Expected behaviour

If the rename fails for any reason, the move process should probably just stop dead in its tracks.

What might be even better is allowing us to set permissions on directories from the JupyterHub side and simply notify users that renaming these persistence-backed directories is not allowed.
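As a rough illustration of the "stop dead" behaviour, the sketch below renames with os.rename only and never falls back to copying and deleting. It also refuses to touch mount points up front. `strict_rename` is a hypothetical helper for illustration, not an existing jupyter-server or JupyterHub function, and whether refusing mount points is the right policy is an assumption.

```python
import os


def strict_rename(src, dst):
    """Rename src to dst in one step, or raise without moving anything.

    Hypothetical helper: there is no copy-then-delete fallback, so a failed
    rename leaves the source directory exactly as it was.
    """
    if os.path.ismount(src):
        # A mounted volume cannot be renamed in place; refuse early.
        raise PermissionError(f"refusing to rename mount point: {src}")
    if os.path.lexists(dst):
        raise FileExistsError(f"destination already exists: {dst}")
    os.rename(src, dst)  # raises OSError on failure; nothing is copied
```

With this approach a rename either fully succeeds or leaves the files where they were, at the cost of not supporting moves across filesystems.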

Actual behaviour

The rename errors, but all the files have been moved to another directory which is not backed by persistence.

Your personal set up

We use the standard helm chart here: https://github.com/jupyterhub/zero-to-jupyterhub-k8s

current chart version: 3.0.3

    [W 2024-09-16 18:13:48.483 ServerApp] wrote error: "Unknown error renaming file: private [Errno 16] Device or resource busy: '/home/jovyan/private'"
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.10/shutil.py", line 816, in move
        os.rename(src, real_dst)
    OSError: [Errno 16] Device or resource busy: '/home/jovyan/private' -> '/home/jovyan/private-copy'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.10/site-packages/jupyter_server/services/contents/filemanager.py", line 1050, in rename_file
        await run_sync(shutil.move, old_os_path, new_os_path)
      File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
        return await get_asynclib().run_sync_in_worker_thread(
      File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
        return await future
      File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
        result = context.run(func, *args)
      File "/opt/conda/lib/python3.10/shutil.py", line 834, in move
        rmtree(src)
      File "/opt/conda/lib/python3.10/shutil.py", line 731, in rmtree
        onerror(os.rmdir, path, sys.exc_info())
      File "/opt/conda/lib/python3.10/shutil.py", line 729, in rmtree
        os.rmdir(path)
    OSError: [Errno 16] Device or resource busy: '/home/jovyan/private'

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.10/site-packages/tornado/web.py", line 1786, in _execute
        result = await result
      File "/opt/conda/lib/python3.10/site-packages/jupyter_server/services/contents/handlers.py", line 151, in patch
        model = await ensure_async(cm.update(model, path))
      File "/opt/conda/lib/python3.10/site-packages/jupyter_core/utils/__init__.py", line 182, in ensure_async
        result = await obj
      File "/opt/conda/lib/python3.10/site-packages/jupyter_server/services/contents/manager.py", line 901, in update
        await self.rename(path, new_path)
      File "/opt/conda/lib/python3.10/site-packages/jupyter_server/services/contents/manager.py", line 888, in rename
        await self.rename_file(old_path, new_path)
      File "/opt/conda/lib/python3.10/site-packages/jupyter_server/services/contents/filemanager.py", line 1054, in rename_file
        raise web.HTTPError(500, f"Unknown error renaming file: {old_path} {e}") from e
    tornado.web.HTTPError: HTTP 500: Internal Server Error (Unknown error renaming file: private [Errno 16] Device or resource busy: '/home/jovyan/private')

manics commented 2 months ago

This is outside the control of Z2JH or JupyterHub. JupyterHub starts the singleuser container, but everything that happens inside that container is controlled by jupyter-server, JupyterLab, or whatever you've installed in your container.

If you think you've found a bug in jupyter-server or JupyterLab can you open an issue on the relevant repository? Thanks!

github-actions[bot] commented 2 months ago

Hi there @kkapper :wave:!

I closed this issue because it was labelled as a support question.

Please help us organize discussion by posting this on the https://discourse.jupyter.org/ forum. If it's your first time posting please read https://discourse.jupyter.org/t/getting-good-answers-to-your-questions/1825. The more information you provide the more likely we can help you.

Our goal is to sustain a positive experience for both users and developers. We use GitHub issues for specific discussions related to changing a repository's content, and let the forum be where we can more generally help and inspire each other.

Thank you for being an active member of our community! :heart: