microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License
163.74k stars 29.1k forks source link

Frequently unable to save file or file data gets erased with error EBUSY #231542

Closed brycehutchings closed 1 day ago

brycehutchings commented 1 week ago

Type: Bug

I often switch between VS and VSCode and for the past month or two, VSCode has started to sometimes fail when I CTRL+S to save. It will show a message "Failed to save : Unable to write file ... (Unknown (FileSystemError): Error: EBUSY: resource busy or locked ...". In the editor the file shows as unsaved and intact, but actually the file has been cleared at this point and is empty.

If I click "Retry" it will still fail, but then I am given an option to "overwrite", and that will work and it will finally store what I had in my editor.

I have frequently been losing code or small changes due to this. If it matters, I am using Dev Drive for the files in question.

VS Code version: Code 1.94.2 (384ff7382de624fb94dbaf6da11977bba1ecd427, 2024-10-09T16:08:44.566Z) OS version: Windows_NT x64 10.0.26100 Modes:

System Info |Item|Value| |---|---| |CPUs|13th Gen Intel(R) Core(TM) i9-13900K (32 x 2995)| |GPU Status|2d_canvas: enabled
canvas_oop_rasterization: enabled_on
direct_rendering_display_compositor: disabled_off_ok
gpu_compositing: enabled
multiple_raster_threads: enabled_on
opengl: enabled_on
rasterization: enabled
raw_draw: disabled_off_ok
skia_graphite: disabled_off
video_decode: enabled
video_encode: enabled
vulkan: disabled_off
webgl: enabled
webgl2: enabled
webgpu: enabled
webnn: disabled_off| |Load (avg)|undefined| |Memory (System)|127.81GB (83.21GB free)| |Process Argv|. --crash-reporter-id 11c0d54b-5322-4b67-bf1a-6b3385489ba7| |Screen Reader|no| |VM|0%|
Extensions: none
A/B Experiments ``` vsliv368cf:30146710 vspor879:30202332 vspor708:30202333 vspor363:30204092 vscod805cf:30301675 binariesv615:30325510 vsaa593cf:30376535 py29gd2263:31024239 c4g48928:30535728 azure-dev_surveyone:30548225 2i9eh265:30646982 962ge761:30959799 pythongtdpath:30769146 pythonnoceb:30805159 asynctok:30898717 pythonmypyd1:30879173 h48ei257:31000450 pythontbext0:30879054 accentitlementst:30995554 cppperfnew:31000557 dsvsc020:30976470 pythonait:31006305 dsvsc021:30996838 9c06g630:31013171 dvdeprecation:31068756 dwnewjupytercf:31046870 impr_priority:31102340 nativerepl2:31139839 refactort:31108082 pythonrstrctxt:31112756 wkspc-onlycs-t:31132770 wkspc-ranged-t:31151552 cf971741:31144450 autoexpandse:31146404 iacca2:31156134 notype1:31157159 controlgs:31153265 5fd0e150:31155592 icondisabled:31158250 ```
brycehutchings commented 1 week ago

This looks the similar to #230169 except I tried uninstalling all my extensions a few days ago to try to rule out an extension as the cause of the problem, but it is still happening.

bpasero commented 1 week ago

To me this sounds to be an issue with Dev Drive and/or its file system implementation. VS Code is relying on node.js for file operations which basically delegates to standard OS functions.

If you can still a reliable way to reproduce that would be a good start, but I doubt that a change in VS Code could be done to address this, I rather think it would end up being upstream, either in node.js or even for the team that runs Dev Drive as feedback.

pabermod commented 1 week ago

Hi, This problem seems to be happening also to me. I have disabled all extensions and yet i'm facing the same issue.

In mi case I have a .net 8 program with a FileSystemWatcher that reads the file when it is changed. But the problem occurs before the my program tries to read it. Randomly when I edit the file I have the error and the file gets emptied. This only happen with vs code. With Notepad++ this problem does not occur

brycehutchings commented 1 week ago

If you can still a reliable way to reproduce that would be a good start

I can't find a 100% reliable way to repro, but it happens very often so if there is data I can collect I should be able to capture it if I try enough times. I am beginning to suspect VS's file system watcher is somehow interfering with VSCode saving the file.

brycehutchings commented 1 week ago

[Edit: I think this analysis is wrong. See my next comment]

I turned on support for the ProcMon filter driver so I could observe what was going on with this issue on my Dev Drive. It looks like ServiceHub.Host.dotnet.x64.exe is opening the file with only "Read" for the ShareMode which blocks VSCode from saving the file. However, for unknown reasons (perhaps because of a Dev Drive bug?) the file is cleared out rather than left untouched. Because the file is cleared, VSCode thinks another process has changed the file and warns about losing changes.

Image At step (1) ServiceHub.Host.dotnet.x64.exe is opening the file for Read ShareMode only. At step (2) VSCode is failing to write the file At step (3) VSCode sees the file is now 0 bytes in size.

pabermod commented 1 week ago

In my case I have a regular Windows 10. Im not using Dev Drive

brycehutchings commented 1 week ago

Actually I missed the in that screenshot above, before step 1, VSCode is clearing the file contents (see "SetEndOfFileInformationFile" call with "EndOfFile: 0". Given @pabermod seeing this too, maybe Dev Drive is unrelated.

Here's an updated annotated screenshot of procmon: Image It shows VSCode observes the file is intact and then clears it. I do see multiple threads involved and wonder if there could be a race condition?

I did just try to repro this manually by writing a small program to open the cpp file with only "Read" share mode, but VSCode won't clear the file when I try to save. It gives the same EBUSY error message, but it repeats over and over again since the file isn't being cleared. So there's some "extra" to reproduce this problem.

bpasero commented 1 week ago

I am beginning to suspect VS's file system watcher is somehow interfering with VSCode saving the file.

If you configure a pattern of ** in files.watcherExclude, all file watching will be disabled, so thats a way to rule that out.

From my knowledge of how file writes work in node.js, every write is a truncate first followed by a write. Having a file be entirely empty is a sign to me that the truncation succeeds but the write then fails for some reason.

If you run code with --verbose from the command line, we do log a lot of stuff. That might reveal more as to where the error happens:

Please see if you can find any logged output that could help us find the source of the issue:

Thanks!

brycehutchings commented 1 week ago

I am beginning to suspect VS's file system watcher is somehow interfering with VSCode saving the file.

If you configure a pattern of ** in files.watcherExclude, all file watching will be disabled, so thats a way to rule that out.

From my knowledge of how file writes work in node.js, every write is a truncate first followed by a write. Having a file be entirely empty is a sign to me that the truncation succeeds but the write then fails for some reason.

If you run code with --verbose from the command line, we do log a lot of stuff. That might reveal more as to where the error happens:

Please see if you can find any logged output that could help us find the source of the issue:

  • run code --verbose from the command line
  • reproduce the issue
  • attach the log files you find from running the command "Open Logs Folder"

Thanks!

I'll try to collect data like you requested and will report back what I can find.

However I can see that VSCode is doing two separate file operations which opens up this problem:

  1. Open file for read/write
  2. Truncate file
  3. Close file
  4. Open file for read/write
  5. Fail due to sharing violation
  6. Data is lost

Doesn't doing this as two separate operations open VSCode to this kind of data loss problem? Shouldn't it do both operations with the same opened file handle?

brycehutchings commented 1 week ago

In fact, this two-step truncate-then-write explains the problem I think. The truncation results in a file system watcher in another process seeing the file change, opens it, and locks out VSCode from writing the file. So a file system watcher isn't the problem, it just allows this kind of problem to happen.

bpasero commented 1 week ago

@brycehutchings truncating the file before writing it is the pragmatic approach to satisfy most edit operations that are often not just "append at the end of the file" or "replace specifically this part of the file" but rather "insider something here". We have not implemented any other logic for file writing other than this 2-step approach because I believe anything else is not easily doable, but I am happy to learn otherwise.

To mitigate the issue of data loss though, we do immediately mark a document as dirty once it fails to write. So, while the file on disk now is empty, VS Code should preserve the original contents of the file until you save again. If that does not happen, then there is a bug in that part of the logic.

Do you know what file watcher results in the file getting locked?

brycehutchings commented 1 week ago

@brycehutchings truncating the file before writing it is the pragmatic approach to satisfy most edit operations that are often not just "append at the end of the file" or "replace specifically this part of the file" but rather "insider something here". We have not implemented any other logic for file writing other than this 2-step approach because I believe anything else is not easily doable, but I am happy to learn otherwise.

Truncate and then write seems good, but it should be done with one file handle, e.g.:

  1. Open file for read/write
  2. Truncate file to 0 bytes
  3. Write file
  4. Close file

This would be a correct and safe way to apply this pattern. But separating the truncate and the write behind separate Open/Close operations looks like a bug to me.

To mitigate the issue of data loss though, we do immediately mark a document as dirty once it fails to write. So, while the file on disk now is empty, VS Code should preserve the original contents of the file until you save again. If that does not happen, then there is a bug in that part of the logic.

Yes, for the most part. It is very confusing experience though because VSCode offers for me to retry which immediately fails because the file contents have changed. VSCode doesn't realize it was the one that changed the file with its truncate! So you have to do the following the save the file:

  1. Ctrl+S. EBUSY pops up
  2. Click Retry. It fails because the contents changed
  3. Click "Overwrite". Now it is safely saved.

The irony is VSCode thinks it's trying to protect me from overwriting changes from somewhere else. It's scaring people into NOT saving when in fact they need to save to not lose data.

Do you know what file watcher results in the file getting locked?

Yes, it is ServiceHub.Host.dotnet.x64.exe which I believe is part of Visual Studio.

pabermod commented 1 week ago

I think it can be easily reproduced with a .net program that has a file watcher on that file and whenthe edition fires It reads all bytes of the given file.

Also this causes problem un my application because the file is a csharp file that i recompile on the fly

bpasero commented 1 week ago

We have specific code for Windows to separate the truncate from the write operation and it indicates the issues this addresses:

https://github.com/microsoft/vscode/blob/6e42cd0eff31bf1c78db25da035dd1964d55cb99/src/vs/platform/files/node/diskFileSystemProvider.ts#L369-L385

Specifically:

Re-using the same file-descriptor for truncate and writing is a great suggestion that makes a lot of sense to me, considering that this would prevent 3rd party tools from messing with the file while we attempt to write. I just have to see how that would work in our case with the node.js APIs we have and the issues we had before we did this explicit truncate. It is also possible that by now node.js has addressed these issues and the extra truncate is not needed.

bpasero commented 1 week ago

I have https://github.com/microsoft/vscode/pull/231653 now, happy for some feedback.

Essentially instead of:

We now do:

One caveat of doing fs.ftruncate(fd, 0) seems to be that the fd file offset is not reset. But I believe this does not impact us because every fs.write(fd) includes the position to write to:

https://github.com/microsoft/vscode/blob/00b7998db378c227a6365c0784bcb2f0ec615dbd/src/vs/platform/files/node/diskFileSystemProvider.ts#L588

bpasero commented 1 week ago

@brycehutchings @pabermod I have an insider build ready with my change and would like to hear back from you if it changes your experience. You can just run it from the Zip: Download Here

brycehutchings commented 1 week ago

@brycehutchings @pabermod I have an insider build ready with my change and would like to hear back from you if it changes your experience. You can just run it from the Zip: Download Here

Thank you! I will run this as my primary VSCode and see how things go and report back in a day.

brycehutchings commented 1 week ago

Just looking at it in ProcMon, and this insider build looks good there. I can see the truncate and write happen together and in fact now the file watcher from VS is the one that gets the sharing violation because it tries to open the file so quickly, but it looks like it has retry logic (as any file watcher should) so it's all good:

Image

bpasero commented 1 week ago

Fantastic, I really hope this change does not cause any regressions. As we are late in the cycle for the October release I am pushing this out to November as it is a somewhat risky change if it goes bad.

brycehutchings commented 1 week ago

Sounds reasonable. I used it throughout the day and didn't hit the problem anymore, so it looks like it fixed my problem. Thank you!

pabermod commented 1 week ago

@bpasero thanks! I Will try It on monday at work

pabermod commented 3 days ago

Hi @bpasero I have been testing it and it does not cause any more conflicts with external file watchers.

Thanks!

sstativa commented 3 days ago

🤬 just lost a file because of this bug.

I'm using macOS, I've noticed this problem happens to me quite often if I open and save files located on remote volume (mounted via Samba).