isaacs / node-graceful-fs

fs with incremental backoff on EMFILE
ISC License
1.27k stars 148 forks source link

Error: EPERM: operation not permitted, rename '...' -> '...' #104

Open ma-zal opened 7 years ago

ma-zal commented 7 years ago

Last graceful-fs 4.1.11 causes Error: EPERM: operation not permitted, rename '...' -> '...' on Windows platform.

My investigation:

It happens with session-file-store, that has dependency fs-extra, that has dependency graceful-fs. Problem occurs when user make more parallel HTTP calls to server. With the previous version graceful-fs 4.1.10 it works fine.

All start works without errors when I rollback the code in graceful-fs 4.1.11 -> polyfills.js -> line 101 from:

setTimeout(function() {
  fs.stat(to, function (stater, st) {
    if (stater && stater.code === "ENOENT")
      fs$rename(from, to, CB);
    else
      cb(er)
    })
  }, backoff)

to previous

setTimeout(function() {
  fs$rename(from, to, CB);
}, backoff)
ma-zal commented 7 years ago

There is discussion about similar issue https://github.com/npm/npm/issues/12059#issuecomment-269826951 . Based on it, may be antivirus McAfee can cause this issue.

I have McAfee too. But I have no permission to disable this antivirus on my corporate machine. :( So I'm not able to test it.

AzGoalie commented 7 years ago

I do not have McAfee and still have this issue. Though we are running npm install in parallel against many different projects.

bchr02 commented 7 years ago

I am getting this same error on Windows Server 2008 R2 even without an Anti Virus installed.

bchr02 commented 7 years ago

@ma-zal for your information, I just started using connect-loki and it's working great. Almost a drop in replacement for session-file-store. It persists data to a single json file.

about-code commented 7 years ago

I recently ran into the rename EPERM-issues when installing packages of mine on a windows machine. I could reproduce it for all package versions which I deployed together with an npm-shrinkwrap.json. Versions without that file installed without the permission issues. Can anyone confirm this behavior?

ferm10n commented 7 years ago

@about-code Yes, I can confirm. I was encountering it while using express-session and session-file-store. After doing a lot of research, I now believe that it's because the rename operation is requested while the file is still open and in use.

Pretty sure this is relevant

about-code commented 7 years ago

@ferm10n Thank you replying to my question. Neither the master branch of expess-session nor the one of session-file-store seem to contain an npm-shrinkwrap.json file, though. I guess you confirm the issue in general but not the relation to having a npm-shrinkwrap.json, do you? Nevertheless, your link seems to point to the real cause. Having a package with a shrinkwrap.json and many third party dependencies may indeed just increased the likelihood to run into the issue in my case.

ferm10n commented 7 years ago

Actually yes, I should have added that having a npm-shrinkwrap removed the error for me.

However I'm hesitant to point that out because I feel that is only removing the error, not the problem. The link from my previous comment addresses a real issue. Previous versions just ignore it.

I attempted getting a real solution, but this issue is very difficult thing to work with, hard to debug, and challenging to recreate. The dependencies inbetween session-file-store and graceful-fs are rather messy and outdated. I've made a PR for one of them here! https://github.com/npm/slide-flow-control/pull/10 But I am pretty confident that the root issue lies in write-file-atomic, which I'll be opening a PR later today that removes it's dependency on the ugly slide-flow-control and substitutes it with promises. Later I'll try to make another PR that should fix this issue

ferm10n commented 7 years ago

@about-code I believe I have confirmed the root of the problem. You can cause a EPERM issue when multiple write operations are issued concurrently on a Windows system. The reason why this isn't an issue on Unix like systems is due to the nature of the POSIX rename, which simply replaces the file. On windows I guess it's a little more complicated.

I believe this is not an issue with graceful-fs, but write-file-atomic, which should probably serialize write operations to the same file. I'll make a PR for that as well. Do you think a separate issue should be opened on write-file-atomic? I'd appreciate your input! Write-file-atomic has a LOT of dependents and I'd imagine this issue could negatively affect all Windows users of these packages.

about-code commented 7 years ago

@ferm10n I am not involved in graceful-fs or node/npm development and can't speak on behalf of its developers but if you think there's an issue in write-file-atomic I think it's sane to open up an issue there and link it to this one. It should be okay to provide a PR to write-file-atomic and if you have doubts about it affecting many Windows users just state that clearly in your PR so the maintainer has a chance to get aware of this.

From reading issue npm/npm/issues/12059 in the official npm repo write-file-atomic seems to be somehow involved, so I think you could be on the right track.

roberthartung commented 7 years ago

This also happens in my use case, where I am using express and socket.io, where the session is shared to the sockets.

HaykoKoryun commented 7 years ago

@bchr02, I always wondered why my app was slow to render pages then noticed lots of sessions files being created and deleted for every request. Looked at the logs and surely enough it's this issue.

Just switched to Loki and everything is way faster.

bchr02 commented 7 years ago

@HaykoKoryun I'm glad switching worked for you too.

ferm10n commented 7 years ago

I found it really helpful to diagnose the cause by using Process Monitor from microsoft to inspect the write operations in flight. I discovered that on top of session-file-store attempting to write several times to the same file simultaneously (uhg) node.exe was competing with the windows search indexer, SearchProtocolHost.exe...

But even after serializing the writes and telling the indexer to bugger off, I still occasionally get the ACCESS DENIED and EPERM result.

node.exe is the only process accessing the session files... Does anyone have any ideas how to determine the reason access is being denied?

mekwall commented 7 years ago

Please don't blame anti-viruses or other third-party tools for this. This is a recurrent and elusive issue with node on Windows, and it has to do with race-conditions due to node not handling Windows file operations properly (hence the need for this package to begin with). Active anti-virus scanners or other tools that might lock files or add a delay to IO operations may cause this otherwise hidden race-condition to trigger.

Yuuoniy commented 6 years ago

@bchr02 that works for me too, thanks a lot!

timsar2 commented 6 years ago

try to add full permission on npm folder C:\Users\YourAccount_Aame\AppData\Roaming\ just right click on npm and from security tab add full permison to your user account or add everyone(if security is not important) then set full permision to everyone.

rjgotten commented 6 years ago

@mekwall

node not handling Windows file operations properly

I guess you could say that that is the actual elephant in the room. which no-one has been able or willing to fully address. (None too surprising given the amount of pushback you'd get slapped with by the core Node team.)

mifi commented 4 years ago

Please don't blame anti-viruses or other third-party tools for this. This is a recurrent and elusive issue with node on Windows, and it has to do with race-conditions due to node not handling Windows file operations properly (hence the need for this package to begin with). Active anti-virus scanners or other tools that might lock files or add a delay to IO operations may cause this otherwise hidden race-condition to trigger.

Isn't nodejs handling it properly? Node.js is just returning an equivalent of the error code that the Windows/Linux API is giving it. The problem here as I see it is that the Windows rename function works differently than Linux/MacOS. On the latter platforms the rename operation is almost guaranteed to be a simple atomic operation that is very likely to succeed, but on Windows this may or may not fail depending on anti-virus, any open file handles, the moon phase or some other factor. Node.js just presents a simple wrapper around the native API, so I'm not sure how much more Node.js should do, because there is not one single answer to what is right to do. If you REALLY need to have the file renamed (and can wait) then retry for a few seconds (or use graceful-fs). If it's more important that you have responsiveness and flexibility, then use the fs function directly.

What needs to be done, as I see it, is to improve the documentation on the Node fs rename function (and graceful-fs), to make developers aware of what kind of cases they need to handle when using the function.

And in the case of this issue, maybe improve this logic https://github.com/isaacs/node-graceful-fs/commit/90a96bc2104729e51d5d7b4aebc8c584ccf7d2ce so that it only aborts retrying if the target path is a directory, not if it's a file?

rjgotten commented 4 years ago

so I'm not sure how much more Node.js should do,

Node.js could offer updated I/O abstractions that allow waiting until a file becomes available for access. If these involve OS-dependent mechanisms, then it's better for that to be part of the core runtime than for it to have to come from packages as optional dependencies; applicable to filtered platforms; with bindings to native code involved.

More importantly; NPM should sit down and have a long and hard think on how their IO is handled. In particular in the context of false assumptions like rename being atomic.

mifi commented 4 years ago

It would be nice with a core abstraction, but I think the problem is that Windows does not seem to guarantee atomic rename/replace, so advertising a rename API with the guarantee of POSIX atomicity is wrong. Better to be honest in the documentation that there are situations that need to be handled on Windows for the user of the Node.js API.

rjgotten commented 4 years ago

I think the problem is that Windows does not seem to guarantee atomic rename/replace,

It apparently is supported on Windows 10: https://stackoverflow.com/a/51737582

And before that with Windows 7 you had the ability to use ReplaceFile (which does atomically move atleast the file data itself, but not necessarily all the metadata like time modified) and before that with Vista there were NTFS transactions.

The problem isn't the atomicity guarantee. The problem is that POSIX rename guarantees success even if something is still using the file, whereas Windows does not.

That's why it is necessary that there are APIs which can wait until a file becomes 'unlocked'.

mifi commented 4 years ago

And before that with Windows 7 you had the ability to use ReplaceFile (which does atomically move atleast the file data itself, but not necessarily all the metadata like time modified) and before that with Vista there were NTFS transactions.

Yea I saw that. I agree that it would be nice with a unified core API for rename. So basically someone needs to write a windows abstraction that uses a combination of ReplaceFile and MoveFileEx depending on which Windows version, which filesystem (FAT does not support atomic renames afaik) is running and correctly handles the different error cases in those different methods.

The problem isn't the atomicity guarantee. The problem is that POSIX rename guarantees success even if something is still using the file, whereas Windows does not.

Right, the problem in this specific EPERM issue is not the atomicity, but the atomicity is a different yet important problem because many developers and packages including the extremely popular write-file-atomic assumes that the fs.rename function is atomic.

AlanImaginx commented 3 years ago

The problem is that fs.rename wrongly calls MoveFileEx when it ought to be calling NtSetInformationFile with the argument FILE_RENAME_INFORMATION with the flags FILE_RENAME_REPLACE_IF_EXISTS | FILE_RENAME_POSIX_SEMANTICS.

I suspect that the failure to request FILE_RENAME_POSIX_SEMANTICS is the root of the problem. fs.rename assumes that rename() is atomic per the POSIX (Linux) standard but is failing to ask Windows to provide said semantics.

jangobot commented 1 year ago

any updates here?

JeremyTellier commented 1 year ago

Does anyone have a solution for this that works? It is filling my logs and nothing seems to work,

rjgotten commented 1 year ago

@JeremyTellier Does anyone have a solution for this that works?

Yes. Either get Node.js to patch fs.rename to have the proper POSIX semantics on Windows, as per what AlanImaginx posted. Or get node-graceful-fs to use a completely custom rename operation based on its own native module, that does so.

Both are impractical. But asking the OS for the proper semantics is the only way to truly solve this. Everything else is a half-bake of a hacked workaround and won't get rid of errors altogether. It'll just make them less likely to crop up, or will mitigate them by offering a transparent (and horrendously inefficient) retry-until-success.

xmedeko commented 7 months ago

@AlanImaginx Can you open issue at https://github.com/nodejs/node ?