sublimehq / sublime_text

Issue tracker for Sublime Text
https://www.sublimetext.com

Sublime UI is slow when working with files on network file systems on Linux #2089

Open nh2 opened 6 years ago

nh2 commented 6 years ago

Dear Sublime team,

Sublime's performance is great when all opened files are on local file systems.

However, when having files open that reside on file systems that do not provide low-latency access (thus especially on networked file systems, such as NFS, SMB shares, GlusterFS, sshfs and so on), some operations that Sublime does on these files can freeze the entire UI for many seconds. The freeze includes all other tabs that are not backed by a slow file system, the menu bar, all other Sublime windows, and so on.

Similar symptoms occur in other parts of Sublime. On a normal buffer, the Command Palette (Shift+Ctrl+P) opens instantaneously. If a file backed by a network file system is open in the buffer, it takes 3 seconds for the palette to appear.

This is unfortunate, because in many other cases, Sublime already seems to take great care that it never freezes the UI, including when opening large files, which blocks only the corresponding tabs and does not freeze the rest of the UI.


Having worked quite a bit with networked file systems, I hope I can provide some useful suspicion on why this may happen:

The design of POSIX system calls predates network file systems. While open(), read(), write() and similar syscalls have non-blocking alternatives, POSIX and Linux assume that some operations like stat() always return instantly (e.g. sub-microsecond on my local ext4 file system). As a result, they provide no way to do e.g. asynchronous/non-blocking stat() syscalls. On network file systems, however, the assumption that stat() is fast does not hold, so a stat() over e.g. sshfs can easily block the executing thread for seconds.

If you have just roamed between two WiFi networks while using sshfs, it can block (and thus freeze Sublime entirely) forever (or, practically, until you kill sshfs or the TCP timeout kicks in after the default of 3600 seconds). I've experienced this hundreds of times, and it is very painful. I would much prefer it if only the affected tab buffer blocked, and if I could simply close it as needed.

Because there are no non-blocking equivalents of syscalls like stat(), the only solution I'm aware of is to stat() in a separate thread.

I suspect that doing the stats off the main UI thread would allow Sublime to not freeze in these cases.
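To illustrate the idea of moving stat() off the UI thread, here is a minimal Python sketch. It is not Sublime's code; the helper name stat_with_timeout, the pool size and the 0.5 second timeout are all assumptions made for the example. The calling thread (standing in for a UI thread) waits only up to the timeout, while the worker thread may remain blocked on the slow file system.

```python
import os
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical shared pool. A worker stuck in stat() on a dead mount stays
# stuck and keeps occupying a pool slot, but the caller gets control back.
_pool = ThreadPoolExecutor(max_workers=4)


def stat_with_timeout(path, timeout=0.5):
    """Return os.stat(path), or None if it fails or exceeds `timeout` seconds."""
    future = _pool.submit(os.stat, path)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # The stat() is still running in the background; the caller can
        # show the file as "pending" and check again later.
        return None
    except OSError:
        # For example, the file does not exist.
        return None
```

A caller could treat None as "unknown for now" and keep the rest of the interface responsive. Note that the blocked stat() itself cannot be cancelled this way; only the waiting is bounded.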

Thank you!

nh2 commented 6 years ago

For further info, I have attached the output of strace -fp $(pidof sublime_text_3) -ttt -T, captured during the 3 seconds in which the UI freezes while opening the Command Palette:

sublime-ui-hang-command-palette-strace-output.txt

The first line,

1512657675.585284 poll(...

is the last syscall from the time I held down Ctrl+Shift (to wait a bit before pressing P to open the palette). The next lines, 4 seconds later, are from just when the P key press arrives:

1512657679.687823 recvmsg(3, 0x7ffd2d8d6250, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000003>
1512657679.687940 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000049>
1512657679.688005 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000033>
1512657679.688049 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000028>
1512657679.688099 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000030>
1512657679.688140 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000027>
1512657679.688177 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000107>
1512657679.688318 futex(0x7fc54b7fb000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff) = 0 <0.134062>
1512657679.822412 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000014>
1512657679.822467 futex(0x7fc54b7fb000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff) = 0 <0.000053>
1512657679.822549 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000003>
1512657679.822578 futex(0x7fc54b7fb000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff) = 0 <0.000031>
1512657679.822620 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000003>
1512657679.822647 futex(0x7fc54b7fb000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff) = 0 <0.000028>
1512657679.822690 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000003>
1512657679.822717 futex(0x7fc54b7fb000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff) = 0 <0.000030>
1512657679.822758 futex(0x7fc54b80d000, FUTEX_WAKE, 1) = 1 <0.000003>
...

As I can see no stat() or similar syscalls in there, only futex() for ~3 seconds (until 1512657681.615927), I suspect that Sublime is using mmap() for files and that it is the memory accesses, proxied over the remote file system, that block the UI.
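As a small illustration of why memory-mapped access would not show up as read() calls in strace, here is a Python sketch. It only demonstrates the general mechanism; whether Sublime actually uses mmap() is the suspicion stated above, not something confirmed in this thread, and the function name is made up for the example.

```python
import mmap


def first_bytes_via_mmap(path, count=16):
    """Map `path` read-only and copy out its first `count` bytes."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file (an empty file raises ValueError).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # This slice is a plain memory read: no read() syscall is issued.
            # If the page is not cached, the kernel has to fetch it from the
            # (possibly remote) file system while this thread is stalled.
            return mapped[:count]
```

On a local disk the stall is usually negligible; on a network mount the same memory access can take as long as a network round trip.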

keith-hall commented 6 years ago

Something similar was also reported here: https://github.com/SublimeTextIssues/Core/issues/594 and on the forums: https://forum.sublimetext.com/t/long-lockups-when-using-remote-file-systems/30796

ibanks commented 5 years ago

Hello,

I have been experiencing a similar issue for a while now (several months), but today it is as bad as it has ever been. I have a Win10 machine that accesses files using Samba over our network. Now when I access a directory it takes almost 3 minutes to load the full directory. Is there any fix for this?

wbond commented 5 years ago

Just to be sure there aren't plugins clouding the issue, try testing with https://www.sublimetext.com/docs/3/revert.html. Unfortunately it is possible for plugins to block the main thread by doing IO without starting a background thread.

ibanks commented 5 years ago

@wbond Thank you for replying. Yesterday I tested it out by removing the "Sublime Text 3" folder, which was supposed to remove the plugins, and it still didn't work correctly. It's still loading directories through Samba very slowly.

ibanks commented 5 years ago

@wbond This has been a huge irritant. I also do not have many packages installed either.

SudoMike commented 5 years ago

Same here. It's ridiculously slow. I have a Samba share on a LAN (so almost no latency), ST3, Ubuntu 18.04, and if I try to open a file and it starts in one of the shared directories with maybe 20 files in it, I have to wait about 30 seconds before it actually shows the files.

This is with a fresh state (i.e. I shutdown ST3, moved ~/.config/sublime-text-3 elsewhere, and ran ST3).

For comparison, "ls -l " takes 1.6s, so the info about the files in that directory can be gotten quickly. This is definitely something ST3 is doing wrong.

wbond commented 5 years ago

> Same here. It's ridiculously slow. I have a Samba share on a LAN (so almost no latency), ST3, Ubuntu 18.04, and if I try to open a file and it starts in one of the shared directories with maybe 20 files in it, I have to wait about 30 seconds before it actually shows the files.

Yes, something else must be going on here. I just added a folder structure with 305 files and folders from a Samba share and everything was immediately available.

Do you see any errors in your Console when you open this troublesome directory?

> For comparison, "ls -l " takes 1.6s, so the info about the files in that directory can be gotten quickly.

Honestly, 1.6s to list 20 files does not sound even mediocre, but quite slow. I presume that some sort of request is sent to the server, and then a response is generated (rather quickly) and sent back. That makes me believe that the transmission time is well over 1s (probably about 1.5s) per request. When we scan files and folders we get various information about them, such as whether they are symlinks, and some other info. I wonder if the Samba FS layer on Ubuntu is sending one request per file. If it takes about 1.5s for a round-trip request and you have 20 files, then that would equal around 30s. 🤔

SudoMike commented 5 years ago

Yeah, I looked into this a little more and I don't think this is specifically an ST3 problem. Sorry to imply that incorrectly.

It happens with any application that uses the file open dialog, so I suspect the issue is with that. I don't have any incriminating messages in my ST3 console.

For reference, on the server, the share is configured as:

[sharename]
path = /home/me/thefolder
valid users = me
read only = no

On the client, I do:

smbclient -L //themachine/sharename -U me
sudo mount -t cifs -o username=me //themachine/sharename ~/where_to_mount_locally

nh2 commented 5 years ago

You should use strace the way I mentioned earlier, it will immediately reveal what file system calls (and thus network calls) both ls and Sublime are doing, and how long they take.
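Where strace is not available, a rough complement is to time the individual file system calls directly. The following Python sketch is an illustration only: the function name is hypothetical, and it measures one listdir() plus one lstat() per entry, which may or may not match what ls or Sublime actually do.

```python
import os
import time


def time_directory_scan(directory):
    """Return (listing_seconds, [(name, lstat_seconds), ...]) for one directory."""
    start = time.perf_counter()
    names = os.listdir(directory)
    listing_seconds = time.perf_counter() - start

    timings = []
    for name in names:
        start = time.perf_counter()
        try:
            os.lstat(os.path.join(directory, name))
        except OSError:
            pass  # still record how long the failed call took
        timings.append((name, time.perf_counter() - start))
    return listing_seconds, timings
```

Calling time_directory_scan on the mounted share and on a local directory, then comparing the per-entry times, gives a first indication of whether each entry costs a full network round trip.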

> I wonder if the Samba FS layer on Ubuntu is sending one request per file

If these requests are sent serially in Sublime's code, then the file system layer has no other way to handle them. Only if you launch multiple threads to start multiple FS operations in parallel will they not run one after another.
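As a sketch of what issuing the per-file operations in parallel could look like, here is a small Python example using a thread pool. It is an assumption-laden illustration, not Sublime's implementation: the function name lstat_all and the worker count of 16 are invented for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def lstat_all(directory, max_workers=16):
    """lstat every entry in `directory` concurrently.

    Returns {name: os.stat_result, or None if the call failed}.
    """
    names = os.listdir(directory)

    def safe_lstat(name):
        try:
            return os.lstat(os.path.join(directory, name))
        except OSError:
            return None  # entry vanished or is unreadable

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with `names`.
        return dict(zip(names, pool.map(safe_lstat, names)))
```

With the figures discussed above (about 1.5s per request and 20 files), serial calls add up to roughly 30s, whereas 16 concurrent workers would need about two rounds of requests, assuming the server and the mount handle concurrent requests independently.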

I'd also be happy to answer any questions about strace, system calls and their blocking behaviour, or distributed FS behaviour in a video call, if that helps push Sublime's handling of this further.

qgates commented 5 years ago

In the last month or so I've begun seeing this issue running ST on Win10 connecting to a local Linux server via cabled lan. When I have a Laravel project folder open over samba in ST and hit ctrl+shift+p (command palette), there is a short lag before the palette pops up (half second). Now and then that lag becomes huge (30s+) and I get 'Not Responding' on the ST window for the duration. Eventually the palette appears and everything works again, but if I try to load the palette straight afterwards the problem usually repeats.

I've noticed this more in recent times because I'm using the palette more frequently, chiefly as I've switched to Typescript and lots of helpers are on the palette. So I suspect the problem's been around for longer.

I've done all the usual (portable ST, no plugins) but it's a gamble using ST over SMB for development at present. I've also disabled telemetry, git, indexing and set 'ignore_inodes' to true/false, none of which helps. I've also noodled around with samba settings on the server side (running Ubuntu 18.04), again no dice.

Be great if this could be pushed up the priority pile, happy to assist with debugging.

skerit commented 5 years ago

The sidebar of a project of mine sometimes takes 5+ minutes to load. Performing a find . in the same directory from the terminal is orders of magnitude faster. Are the files indexed while the tree is being loaded? Could the tree be loaded first before any other stuff happens?

Trystanr commented 4 years ago

Unfortunately, this performance issue is still quite prevalent.

jondkelley commented 4 years ago

Very... people have this problem regularly. :( Wish NFS worked a lot better.

lurchpop commented 4 years ago

Same thing happens when working over SFTP. Wish it would respect the folder_exclude_patterns. I have vendor folders with 10k files. Wish it would just pretend that dir doesn't exist.

wbond commented 4 years ago

folder_exclude_patterns does generally work. If you post your folder paths and the exclude patterns you are using we might be able to help.

lungdart commented 3 years ago

Still having the problem over here.

I have sublime->project in NFS->OpenVPN->Work machine. Even when I'm in the office and not using the VPN, I still experience slowdowns (but not as badly as with OpenVPN in the mix).

Rockburner commented 1 year ago

Just another regular user of Sublime reporting this issue.

I'm using ST4 on multiple projects which are hosted on a remote server. I connect to the network via VPN (I have access via an OpenVPN and a 'PriTunl' connection). The remote hosts run Samba, and I map the folders onto my Ubuntu 22 system to edit the files.

I have 'index_files' set to false in my preferences.

It generally takes at least 5 minutes (if not longer) to get a folder to open.

At least a few times every day ST4 will completely hang, disabling every ST4 window, and occasionally preventing any mouse activity (depends on where the cursor was when ST4 hung).

Is there any other option for accessing the files I can try? (I'd prefer not to have the projects hosted locally)

lungdart commented 1 year ago

Update: I switched to VSCode. Its remote development workflow is much smoother. It took some time to adjust, but I'm happy with the change.

TomHarrop commented 1 year ago

I've been having the same issue with files on a cifs share on Ubuntu 22.04. Folders from the share are symlinked into a subdirectory of my project directory.

I added the remote folder names to folder_exclude_patterns and the predominant file types to binary_file_patterns and it seems to have gone away. No more lockups since I made the change.

Is there a way to globally ignore all remote files, or maybe not follow symlinks?