microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License
162.77k stars 28.73k forks source link

Linux: implement a different watcher that does not require file handles per file in the workspace folder #45295

Closed GregoryLundberg closed 2 years ago

GregoryLundberg commented 6 years ago

Steps to Reproduce:

  1. Run 'code' from command line or click menu item to run it.

Does this issue occur when all extensions are disabled?: Yes

Dialog message indicates the web page has help. It does not.

Bumped (soft) open file limit from 1024 to 4096: no help.

natlibfi-arlehiko commented 5 years ago

Also experiencing this on Ubuntu 18.04.

jhanschoo commented 5 years ago

To help with diagnosing this issue, I came up with this script that prints files being watched in a workspace or folder (remember to change the workspace variable in this zsh script)

#!/usr/bin/env zsh
# this script prints a series of paragraphs:
# - each paragraph corresponds to an inotify instance
# - the first line prints the process owning that inotify instance
# - each line in the rest of the paragraph corresponds to a file being
#   watched by the inotify instance

workspace=${HOME}/src/writerite/frontend-react

# create an associative array of inodes to their paths in the workspace
typeset -A inode_map
# list all the files in the workspace, then pass all the paths to ls to obtain
# their corresponding inodes, then sort these lines to uniquify the inodes in case
# duplicate hard links are present; then for each line
find "$workspace" -exec ls -di {} + | sort -nu -k1,1 | while read line
do
  # split it into words by the word-separator, then initialize an array with these words
  line_arr=(${=line})
  # the first word is the inode, so assign it as the key, whereas the other words form the path
  inode_map[$line_arr[1]]="$line_arr[2,-1]"
done

# for each inotify instance (found through iterating through each
# file descriptor opened by each process and seeing if it is an
# inotify file descriptor)
find /proc/*/fd/* -type l -lname 'anon_inode:inotify' 2> /dev/null |
while read line
do
  # print the process name for later printing
  process=$(sed -e 's@^/proc/\([[:digit:]]*\)/fd/[[:digit:]]*@\1@' <<< $line)
  cat "/proc/${process}/cmdline"
  echo

  # for each path to the file descriptor
  echo $line |
  # modify it to a path to the file descriptor info file
  sed -e 's@/fd/@/fdinfo/@' |
  # and print its contents
  xargs cat |
  # filter for the lines containing information about the files
  # watched in this instance (one line corresponds to one watch)
  grep inotify |
  # extract the hex-encoded inode of the file being watched
  sed -e 's/^.*ino:\([[:xdigit:]]*\) .*$/0x\1/' |
  while read line
  do
    # if the inode corresponds to a file in the workspace
    if [[ $inode_map[$((line))] ]]
    then
      # echo the file path
      echo "$inode_map[$((line))]"
    fi
  done
  echo
done
jhanschoo commented 5 years ago

Related to this issue, my script also catches that VSCode's TypeScript support watches node_modules files as well, even if tsconfig.json only includes a subdirectory without the node_modules folder.

kid1412621 commented 5 years ago

@GregoryLundberg two things we can try. First, we have a setting "files.useExperimentalFileWatcher", can you try setting that to true, restart Code and then see if it happens still?

work for me! lifesaver

GitMensch commented 3 years ago

Was the experimental watcher activated by files.useExperimentalFileWatcher dropped or made the non-changeable default?

cha0s commented 3 years ago

It doesn't work. I tried files.useExperimentalFileWatcher, but the issue persists. See: https://github.com/microsoft/vscode/issues/40898

bengtj commented 3 years ago

As this still cause lots of issues for us (large repo) after setting kernel limits to max I wonder if the Linux possibilities has been investigated properly. I'm no expert at inotify() and I agree that there is probably no recursive flag like bWatchSubtree like for Windows. But to me it looks like there could be a workaround by watching directories instead.

From "man inotify": When a directory is monitored, inotify will return events for the directory itself, and for files inside the directory.

Using this would probably mean some framework where when you want to add a file watch you would instead add an entry into the parent directory watcher that will handle forwarding for file events. But to me it sounds doable. (Not knowing vscode internals at all.)

Seems like there are existing wrappers for making inotify recursive, for example: https://github.com/dzchoi/Inotify-wrapper-for-recursive-directories

On a related note; is there maybe already some method to get vscode to refresh directory contents whenever you expand a directory? This would be very useful to not have to remember to hit the refresh button for those folders that we are excluding from being watched. When a file is renamed or similar.

awilkins commented 3 years ago

From inotify(7) man page :

Inotify monitoring of directories is not recursive: to monitor subdirectories under a directory, additional watches must be created. This can take a significant amount time for large directory trees.

If monitoring an entire directory subtree, and a new subdirectory is created in that tree or an existing directory is renamed into that tree, be aware that by the time you create a watch for the new subdirectory, new files (and subdirectories) may already exist inside the subdirectory. Therefore, you may want to scan the contents of the subdirectory immediately after adding the watch (and, if desired, recursively add watches for any subdirectories that it contains).

VSCode has 2 bits of code that mention inotify

The unix service (which uses chokidar) is marked deprecated (as is the WindowsWatcherService), so I presume the direction of travel (and maybe previously the "experimental" one) is nsfw.

Watcher creation is handled in diskFileSytemProvider.ts ; this confirms that the "legacy" watchers (Windows and Unix) are only used ...

The setup code passes a list of folders to the watcher.

nsfw as a package (and a very cursory flick over the code) seems to be aware that the right way to watch a directory tree in Linux is to put inotify watches on the folders and not the files, and similarly is aware that you use ReadDirectoryChangesW on Windows.

What nsfw doesn't seem to be aware of that Watchman does, is that the inotify events queue will overflow if there are a lot of events*. Users with high file update rates may wish to consider deepening /proc/sys/fs/inotify/max_queued_events. Watchman catches overflows and assumes every file in that folder changed if they occur.

What also doesn't seem to be the case is that (correct me if I'm wrong here devs... ) nsfw doesn't have a facility for ignoring subdirectories. Any paths you set in files.watcherExclude[] are handled in VSCode after events return from the watcher service - this means that globs that exclude folders that would never match like **/.git/objects/** or **/node_modules/** still return events from nsfw which are then subsequently ignore, whereas you could actually just never register a inotify user watch on these folders. Whether this improves performance a great deal I don't have numbers to support, but I'd expect that marshalling big wodges of events from a C++ program to be handled by a JS program is probably less efficient than just never doing that.

My suggestion for an improvement would be to put a callback in nsfw that offers up every folder that it's intending to watch to the client code, and allow the client to cancel that watch before establishing it, or to just add an extra event type to the existing callback that does this.


* This assumption is based entirely on it lacking the string "overflow" in it's code tree. If

bpasero commented 2 years ago

We will be shipping a new file watcher (https://github.com/microsoft/vscode/issues/132483) starting in insiders tomorrow that allows to exclude folders natively from watching via the files.watcherExclude setting. This has the advantage that you will not be running into file handle issues on Linux when you exclude large folders that need to be recursively watched.

I do not see a solution on Linux to watch a folder recursively without requiring a file handle per folder in the tree, so I will go ahead and close this issue. Recursive watching is a requirement of VS Code that we cannot lift because extensions rely on file events from any file within the opened workspace.