streamlit / streamlit

Streamlit — A faster way to build and share data apps.
https://streamlit.io
Apache License 2.0

`streamlit run` does not work when the python file is in root directory. #5239

Open ayushr2 opened 2 years ago

ayushr2 commented 2 years ago

Summary

Running something like streamlit run /foo.py crashes.

Steps to reproduce

$ pip install streamlit
$ sudo touch /foo.py
$ .local/bin/streamlit run /foo.py 
Traceback (most recent call last):
  File "/usr/local/google/home/ayushranjan/.local/bin/streamlit", line 8, in <module>
    sys.exit(main())
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/web/cli.py", line 205, in main_run
    _main_run(target, args, flag_options=kwargs)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/web/cli.py", line 240, in _main_run
    bootstrap.run(file, command_line, args, flag_options)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/web/bootstrap.py", line 360, in run
    _install_pages_watcher(main_script_path)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/web/bootstrap.py", line 336, in _install_pages_watcher
    watch_dir(
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/watcher/path_watcher.py", line 154, in watch_dir
    return _watch_path(
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/watcher/path_watcher.py", line 129, in _watch_path
    watcher_class(
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/watcher/event_based_path_watcher.py", line 92, in __init__
    path_watcher.watch_path(
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/streamlit/watcher/event_based_path_watcher.py", line 170, in watch_path
    folder_handler.watch = self._observer.schedule(
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/api.py", line 302, in schedule
    emitter.start()
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/utils/__init__.py", line 93, in start
    self.on_thread_start()
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/inotify.py", line 118, in on_thread_start
    self._inotify = InotifyBuffer(path, self.watch.is_recursive)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__
    self._inotify = Inotify(path, recursive)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 167, in __init__
    self._add_dir_watch(path, recursive, event_mask)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 372, in _add_dir_watch
    self._add_watch(full_path, mask)
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 386, in _add_watch
    Inotify._raise_error()
  File "/usr/local/google/home/ayushranjan/.local/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 406, in _raise_error
    raise OSError(err, os.strerror(err))
FileNotFoundError: [Errno 2] No such file or directory

Expected behavior:

This crash should not happen, since there are no documented constraints on where the Python file can be placed.

Actual behavior:

Crashes as shown in reproducer.

Is this a regression?

I am not sure. Have not bisected across versions.

Debug info

Technical details of crash

The crash happens because streamlit tries to recursively add inotify watches on everything inside the parent directory of the Python file. In this case, the parent directory is the root itself, so streamlit attempts to add inotify watches inside the /proc filesystem. This fails because the watchdog/observers package uses a directory file descriptor to read the contents of each directory. At some point, it reaches the /proc/<selfpid>/task/<selftid>/fd directory. The getdents64(2) results for this directory naturally include the very directory file descriptor that was used for the getdents64(2) call. The directory FD is then closed, but the watchdog package remembers the entry and tries to add an inotify watch on it. That entry no longer exists because the directory FD has been closed, so inotify_add_watch fails with ENOENT. Here are some strace logs (collected using gVisor) that show this happening:

...
strace.go:603] [   2:   2] streamlit E openat(AT_FDCWD /, 0x3eb19c6cf290 /proc/2/task/2/fd, O_RDONLY|O_CLOEXEC|O_DIRECTORY|O_NONBLOCK, 0o0)
strace.go:641] [   2:   2] streamlit X openat(AT_FDCWD /, 0x3eb19c6cf290 /proc/2/task/2/fd, O_RDONLY|O_CLOEXEC|O_DIRECTORY|O_NONBLOCK, 0o0) = 4 (0x4) (7.951µs)
strace.go:597] [   2:   2] streamlit E fstat(0x4 /proc/2/task/2/fd, 0x3e131b2c4c20)
strace.go:635] [   2:   2] streamlit X fstat(0x4 /proc/2/task/2/fd, 0x3e131b2c4c20 {dev=14, ino=135, mode=S_IFDIR|0o555, nlink=2, uid=0, gid=0, rdev=0, size=0, blksize=4096, blocks=0, atime=2022-08-22 21:56:35.52712683 -0700 PDT, mtime=2022-08-22 21:56:35.52712683 -0700 PDT, ctime=2022-08-22 21:56:35.52712683 -0700 PDT}) = 0 (0x0) (1.498µs)
strace.go:600] [   2:   2] streamlit E getdents64(0x4 /proc/2/task/2/fd, 0x29f049765960, 0x8000)
strace.go:638] [   2:   2] streamlit X getdents64(0x4 /proc/2/task/2/fd, 0x29f049765960, 0x8000) = 168 (0xa8) (5.062µs)
strace.go:597] [   2:   2] streamlit E stat(0x3eb19c6cfa50 /proc/2/task/2/fd/0, 0x3e131b2c4c00)
strace.go:635] [   2:   2] streamlit X stat(0x3eb19c6cfa50 /proc/2/task/2/fd/0, 0x3e131b2c4c00 {dev=26, ino=3, mode=S_IFCHR|0o666, nlink=1, uid=0, gid=0, rdev=0, size=0, blksize=4096, blocks=0, atime=2022-08-17 17:56:13.882378514 -0700 PDT, mtime=2022-08-17 17:56:13.882378514 -0700 PDT, ctime=2022-08-17 17:56:13.882378514 -0700 PDT}) = 0 (0x0) (14.659µs)
strace.go:597] [   2:   2] streamlit E stat(0x3eb19c6cfa90 /proc/2/task/2/fd/1, 0x3e131b2c4c00)
strace.go:635] [   2:   2] streamlit X stat(0x3eb19c6cfa90 /proc/2/task/2/fd/1, 0x3e131b2c4c00 {dev=26, ino=2, mode=S_IFIFO|0o666, nlink=1, uid=268582982, gid=5000, rdev=0, size=0, blksize=4096, blocks=0, atime=2022-08-22 21:56:22.962273084 -0700 PDT, mtime=2022-08-22 21:56:22.962273084 -0700 PDT, ctime=2022-08-22 21:56:22.962273084 -0700 PDT}) = 0 (0x0) (10.111µs)
strace.go:597] [   2:   2] streamlit E stat(0x3eb19c6cfa50 /proc/2/task/2/fd/2, 0x3e131b2c4c00)
strace.go:635] [   2:   2] streamlit X stat(0x3eb19c6cfa50 /proc/2/task/2/fd/2, 0x3e131b2c4c00 {dev=26, ino=1, mode=S_IFIFO|0o666, nlink=1, uid=268582982, gid=5000, rdev=0, size=0, blksize=4096, blocks=0, atime=2022-08-22 21:56:24.822271459 -0700 PDT, mtime=2022-08-22 21:56:24.822271459 -0700 PDT, ctime=2022-08-22 21:56:24.822271459 -0700 PDT}) = 0 (0x0) (9.985µs)
strace.go:597] [   2:   2] streamlit E stat(0x3eb19c6cfa90 /proc/2/task/2/fd/3, 0x3e131b2c4c00)
strace.go:635] [   2:   2] streamlit X stat(0x3eb19c6cfa90 /proc/2/task/2/fd/3, 0x3e131b2c4c00 {dev=1, ino=1, mode=0o600, nlink=1, uid=0, gid=0, rdev=0, size=0, blksize=4096, blocks=0, atime=1969-12-31 16:00:00 -0800 PST, mtime=1969-12-31 16:00:00 -0800 PST, ctime=1969-12-31 16:00:00 -0800 PST}) = 0 (0x0) (7.327µs)
strace.go:597] [   2:   2] streamlit E stat(0x3eb19c6cfa50 /proc/2/task/2/fd/4, 0x3e131b2c4c00)
strace.go:635] [   2:   2] streamlit X stat(0x3eb19c6cfa50 /proc/2/task/2/fd/4, 0x3e131b2c4c00 {dev=14, ino=135, mode=S_IFDIR|0o555, nlink=2, uid=0, gid=0, rdev=0, size=0, blksize=4096, blocks=0, atime=2022-08-22 21:56:35.52712683 -0700 PDT, mtime=2022-08-22 21:56:35.52712683 -0700 PDT, ctime=2022-08-22 21:56:35.52712683 -0700 PDT}) = 0 (0x0) (9.26µs)
strace.go:600] [   2:   2] streamlit E getdents64(0x4 /proc/2/task/2/fd, 0x29f049765960, 0x8000)
strace.go:638] [   2:   2] streamlit X getdents64(0x4 /proc/2/task/2/fd, 0x29f049765960, 0x8000) = 0 (0x0) (4.754µs)
strace.go:594] [   2:   2] streamlit E close(0x4 /proc/2/task/2/fd)
strace.go:632] [   2:   2] streamlit X close(0x4 /proc/2/task/2/fd) = 0 (0x0) (2.102µs)
strace.go:597] [   2:   2] streamlit E lstat(0x3eb19c6cf390 /proc/2/task/2/fd/4, 0x3e131b2c5330)
strace.go:635] [   2:   2] streamlit X lstat(0x3eb19c6cf390 /proc/2/task/2/fd/4, 0x3e131b2c5330) = 0 (0x0) errno=2 (no such file or directory) (5.88µs)
strace.go:600] [   2:   2] streamlit E inotify_add_watch(0x3, 0x3eb19c6cf390 /proc/2/task/2/fd/4, 0x20007ce)
strace.go:638] [   2:   2] streamlit X inotify_add_watch(0x3, 0x3eb19c6cf390 /proc/2/task/2/fd/4, 0x20007ce) = 0 (0x0) errno=2 (no such file or directory) (5.137µs)
...
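
The listing race described above can be shown in miniature. The sketch below is illustrative only and is not watchdog's code; `list_then_recheck` is a hypothetical helper. It lists a directory with `os.scandir`, lets the directory handle close, and then checks which of the listed names no longer exist. On Linux, pointing it at /proc/self/fd would be expected to report the descriptor that the listing itself held open, although that depends on the platform's procfs.

```python
import os


def list_then_recheck(directory):
    """Return (names, vanished) for a directory listing.

    Hypothetical helper for illustration; not part of watchdog or Streamlit.
    """
    # scandir holds an open handle on the directory while we iterate.
    with os.scandir(directory) as entries:
        names = sorted(entry.name for entry in entries)
    # The handle is closed here. Anything that existed only because of
    # that handle (such as its own entry under /proc/self/fd) is now gone.
    vanished = [
        name for name in names
        if not os.path.lexists(os.path.join(directory, name))
    ]
    return names, vanished


if __name__ == "__main__":
    # Fall back to the current directory where there is no Linux-style procfs.
    target = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "."
    names, vanished = list_then_recheck(target)
    print(f"{target}: listed {len(names)} entries, vanished: {vanished}")
```

For an ordinary directory the second list is empty. Any code that stores a listing and acts on it later has to tolerate entries like this disappearing.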

willhuang1997 commented 2 years ago

Hi @ayushr2, thanks for reporting this. I tried reproducing it on my Mac but couldn't. Is there a reason why you're putting the file in the root folder? That seems a little odd.

ayushr2 commented 2 years ago

@willhuang1997, this reproduces on Linux. If you don't have access to a Linux machine, you should be able to repro using Docker.

$ docker run --rm -it ubuntu bash
root@e5155b1e3e15:/# apt update
root@e5155b1e3e15:/# apt install -y python3-pip
root@e5155b1e3e15:/# pip install streamlit
root@e5155b1e3e15:/# touch /foo.py
root@e5155b1e3e15:/# streamlit run /foo.py
Traceback (most recent call last):
  File "/usr/local/bin/streamlit", line 8, in <module>
    sys.exit(main())
...
line 406, in _raise_error
    raise OSError(err, os.strerror(err))
FileNotFoundError: [Errno 2] No such file or directory

Is there a reason why you're putting in the root folder? That seems a little odd.

I work on gVisor. We have multiple customers who are running streamlit under this setting. We were investigating this failure and noted that this is not a gVisor bug, because it happens on Linux too. We also noted that the proc(5) filesystem cannot safely be watched recursively with inotify the way the watchdog package does it.

So on the streamlit side, we should probably avoid recursively watching into the /proc directory (or any other procfs mounts).

kajarenc commented 2 years ago

Hello, @ayushr2, and thanks for the detailed response!

I think this is the same bug we discussed here: https://github.com/streamlit/streamlit/issues/4842

Yes, we have not supported running streamlit from the OS root directory since 1.10.

We recommend specifying WORKDIR in Dockerfile https://docs.streamlit.io/knowledge-base/tutorials/deploy/docker

CC: @snehankekre it would be great to add a paragraph to the documentation saying that we don't support running streamlit from the root of the OS.

vdonato commented 2 years ago

Hm, so I think that even with a workaround, we'll want to fix this rather than add an item to our docs about it (although the docs item is necessary until a fix is merged).

I do think doing so is relatively low priority given that we do have an easy workaround for it, though.

ayushr2 commented 2 years ago

Can the doc update be made in the interim, so people don't try to run streamlit in this way?

As for the fix, I guess it has to do with avoiding procfs mounts while adding watchers recursively.
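
One way to picture that kind of fix is a directory walk that never crosses onto another filesystem, similar to `find -xdev`. This is only a sketch under that assumption, not how streamlit or watchdog collects directories today, and `watchable_dirs` is a hypothetical name. Since procfs is a separate filesystem, /proc has a different device ID from / and would be pruned.

```python
import os


def watchable_dirs(root):
    """Yield directories under root that live on root's own filesystem.

    Hypothetical helper for illustration; not a Streamlit or watchdog API.
    """
    root_dev = os.stat(root).st_dev
    for dirpath, dirnames, _filenames in os.walk(root):
        kept = []
        for name in dirnames:
            try:
                # Compare device IDs without following symlinks.
                same_fs = os.lstat(os.path.join(dirpath, name)).st_dev == root_dev
            except OSError:
                # The entry disappeared between listing and lstat; skip it.
                continue
            if same_fs:
                kept.append(name)
        # Pruning in place stops os.walk from descending into other mounts.
        dirnames[:] = kept
        yield dirpath
```

The tradeoff is that this also skips legitimate mounts under the script's directory, such as Docker volumes, so a real fix might prefer to filter on filesystem type instead.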

snehankekre commented 2 years ago

CC: @snehankekre it would be great to add a paragraph in documentation, that says that we don't support running streamlit from the root of OS

Yes, we do not support running streamlit from the OS root directory since 1.10.

Does this issue affect Linux, macOS, and Windows, or just Linux distros?

snehankekre commented 2 years ago

In docs PR #482, I propose to add the following paragraph to Main concepts:

As of Streamlit version 1.10.0 and higher, Streamlit apps cannot be run from the root directory of Unix-like operating systems, including Linux and macOS. If you try to run a Streamlit app from the root directory, Streamlit will throw a FileNotFoundError: [Errno 2] No such file or directory error. For more information, see GitHub issue #5239.

If you are using Streamlit version 1.10.0 or higher, your main script should live in a directory other than the root directory. When using Docker, you can use the WORKDIR command to specify the directory where your main script lives. For an example of how to do this, read Create a Dockerfile.

I've also added a callout to our Docker and Kubernetes guides.

Please let me know if these changes are accurate and sufficient or if they need improving. cc @kajarenc @ayushr2

kajarenc commented 2 years ago

@snehankekre thank you very much, I think it looks good!

Another option to mitigate this issue could be to wrap this error and raise a more descriptive error message in the core library, in case fixing this completely turns out to be problematic (I think that is the most probable case in our situation).
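
A minimal sketch of what such a wrapper might look like, assuming the watcher setup is reachable through a single callable. `RootDirectoryWatchError` and `install_watcher_with_hint` are hypothetical names, not existing Streamlit APIs, and the message text is made up for illustration.

```python
import os


class RootDirectoryWatchError(RuntimeError):
    """Hypothetical error type; not an existing Streamlit exception."""


def install_watcher_with_hint(install_watcher, main_script_path):
    """Call install_watcher and re-raise root-directory failures with a hint.

    install_watcher stands in for whatever sets up the file watcher.
    """
    try:
        return install_watcher(main_script_path)
    except FileNotFoundError as exc:
        folder = os.path.dirname(os.path.abspath(main_script_path))
        # A directory that is its own parent is the filesystem root.
        if os.path.dirname(folder) == folder:
            raise RootDirectoryWatchError(
                f"Could not watch {folder!r}: running an app whose main "
                "script is in the root directory is not supported. "
                "Move the script into a subdirectory."
            ) from exc
        # Any other missing-file error is passed through unchanged.
        raise
```

Chaining with `from exc` keeps the original inotify traceback available for debugging while the user sees the actionable message first.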

CC: @vdonato

ayushr2 commented 2 years ago

As of Streamlit version 1.10.0 and higher, Streamlit apps cannot be run from the root directory of Unix-like operating systems, including Linux and macOS.

@snehankekre I think @willhuang1997 confirmed above (in the second comment) that he can't reproduce this on his MacOS system. I am only aware of this issue occurring on Linux.

snehankekre commented 2 years ago

I think @willhuang1997 confirmed above (in the second comment) that he can't reproduce this on his MacOS system. I am only aware of this issue occurring on Linux.

@willhuang1997 can you confirm you have write-access to the root directory on your macOS machine? Were you not able to repro because you have read-only access to root (and thus can't write a .py there), or is it the case you have write-access but are still not able to repro?

On my work macOS machine, I have read-only access to root. That's why I ask to confirm. Want to confirm whether this bug affects Unix-like operating systems or just Linux.

willhuang1997 commented 2 years ago

Hey @snehankekre, I only had read access to root on my Mac, which is why I couldn't reproduce it there. It's only reproducible on Linux.

snehankekre commented 2 years ago

@ayushr2 @willhuang1997 Thanks for confirming. I was able to get a macOS VM running on my personal Linux box. Found out that since the release of Catalina, the root volume on macOS hasn't been writeable. As such, it doesn't make sense to include it in this note, as users will not be able to write .py files to root to begin with.

On #487, I've removed mentions of macOS from the note and say that running Streamlit apps from root isn't possible in Linux distributions.

mlanett commented 1 year ago

This sounds like a watchdog problem, not a streamlit problem.

I wouldn't expect anyone deploying via Docker (e.g. a read-only production-like situation) to be using watchdog.