Open networkimprov opened 5 years ago
Is this problem new in 1.13?
I'm unable to reproduce this on the latest version of Windows 10 with Go 1.13.1 (I should probably update...).
@networkimprov, do you have third-party antivirus software installed?
Hi John, how long did you let the program run? It could take several minutes to show up
The machine seeing these failures is not mine (I only have Win7) but I've forwarded your Q.
Note Go 1.13.3 has a significant change to the Windows runtime, tho it's prob not a factor here.
I let it run overnight with no repro.
@networkimprov
You only get SHARING_VIOLATION
from CreateFile
if someone has the same file already open in a such access mode that is not listed in your sharing modes.
For example, I hit SHARING_VIOLATION
in git-lfs/git-lfs#3880:
one process calls os.Rename
while another tries to os.Open
destination file.
os.Stat
doesn't do CreateFile
, so I think your problem is caused by something different. You could try using Process Monitor to record kernel calls related to your file to find out who is opening in a mode that interferes with your app.
Thanks for suggestion re Process Monitor.
Assuming this is due to a legitimate system config (ie not malware) any suggestions for fixes? Patching syscall.Open() to retry N times on sharing-violation in a loop with runtime.Gosched()?
one process calls os.Rename while another tries to os.Open destination file.
I tried really hard to get this fixed... #32088 :-(
From my current knowledge, retrying is the only working option. You may want to look at robustio.go
file from Go, it has related stuff, but not exposed for public usage.
But the above program still fails (maybe less often?) on Win10 if I patch stat() to only call CreateFile(),
Did you mean patch open? What business would stat have calling CreateFile?
stat calls CreateFile in a fallback path since apparently there are cases where GetFileAttributesEx fails.
@rsc WinAPI CreateFile() doesn't create files, it opens them :-)
The box showing this reportedly runs Norton Lifelock. From systeminfo command: OS Version: 10.0.18362 N/A Build 18362
This is a better reproducer. The problem is triggered by ioutil.WriteFile().
package main
import ( "io/ioutil"; "time" )
func main() {
aFile := "z35219.txt"
for {
err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
if err != nil { panic(err) }
aTmr := time.NewTimer(10*time.Second)
for aRead := true; aRead; {
select {
case <-aTmr.C:
aRead = false
default:
_, err := ioutil.ReadFile(aFile)
if err != nil { panic(err) }
}
}
}
}
Issue doesn't reproduce on my laptop. The latest submitted code ran for a couple of hours without issue.
>ver
Microsoft Windows [Version 10.0.18362.418]
In the past I have seen antivirus software keep files open for analysis after the file has been closed, blocking subsequent opens. I do not know whether Norton Lifelock has this problem. It would be interesting to temporarily disable all such security software on the machine (this can be difficult!) and try the repro again. If it stops reproing, then the best course of action would be to report the issue to Symantec.
If it continues to repro, I would suggest looking for any other atypical driver software in the file system stack. Try running fltmc.exe to see if there are any interesting file system filters installed, for example--compare the result to a stock Windows installation.
In any case, I don't think there's any evidence of a Go bug here, and it seems inappropriate for Go to try to work around reliability issues with basic file system APIs. These bugs need to be fixed in the components that are at fault.
it seems inappropriate for Go to try to work around reliability issues with basic file system APIs
Go does that for its internal needs though.
Thanks to the folks trying to reproduce this. It's apparently due to OneDrive! After patching syscall.Open() to retry on sharing-violation, the reproducer eventually fails in Read() with "cloud operation was not completed before the time-out period". (Wow, crappy caching :-)
Here's my syscall.Open() patch:
diff --git a/src/syscall/syscall_windows.go b/src/syscall/syscall_windows.go
index af562a8..aa537e2 100644
--- a/src/syscall/syscall_windows.go
+++ b/src/syscall/syscall_windows.go
@@ -265,6 +265,8 @@ func makeInheritSa() *SecurityAttributes {
}
+// enable retry on ERROR_SHARING_VIOLATION
+var Open_RetryErrorShare = false
func Open(path string, mode int, perm uint32) (fd Handle, err error) {
if len(path) == 0 {
@@ -311,8 +313,14 @@ func Open(path string, mode int, perm uint32) (fd Handle, err error) {
default:
createmode = OPEN_EXISTING
}
- h, e := CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL, 0)
- return h, e
+ for i := 0; true; i++ {
+ fd, err = CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL, 0)
+ if err != Errno(32) || !Open_RetryErrorShare || i == 999 {
+ break
+ }
+ runtime.Gosched()
+ }
+ return
}
func Read(fd Handle, p []byte) (n int, err error) {
Interesting. Are you trying this on a path that is backed up by OneDrive?
Yes, the reproducers were all running within a OneDrive tree. We're trying them again now in a local-only path...
I'll try to get a repro on OneDrive and report the issue internally.
We haven't reproduced either failure since moving the directory out of the OneDrive tree.
How likely this is to appear might depend on the quality of your connection to OneDrive, traffic on the LAN, etc.
This might be a more effective reproducer:
package main
import ( "io/ioutil"; "os"; "time" )
func main() {
aFile := "z35219.txt"
err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
if err != nil { panic(err) }
go func() {
aTmr := time.NewTicker(5*time.Second)
for {
<-aTmr.C
err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
if err != nil { panic(err) }
}
}()
for {
aFd, err := os.Open(aFile)
if err != nil { panic(err) }
err = aFd.Close()
if err != nil { panic(err) }
}
}
I was able to repro this on OneDrive. We're looking into it internally.
@jstarks, any developments? Is Go implicated?
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?Windows 10, amd64 (go env not available on that machine)
What did you do?
EDIT: updated reproducer
What did you expect to see?
No error; this is the case on Win7.
What did you see instead?
On Win10:
Discussion
I see this intermittently in a real app.
EDIT: the following is related to the initial reproducer I wondered whether GetFileAttributesEx() changed in Windows 10 (or 8) to disallow concurrent read access. But the above program still fails (maybe less often?) on Win10 if I patch stat() to only call CreateFile(), like so:
Those arguments are suggested here: https://stackoverflow.com/questions/52813820/createfile-fails-because-of-sharin-violation-because-some-other-os-process-is-us
Stat(): https://golang.org/src/os/stat_windows.go#L76 Open(): https://golang.org/src/syscall/syscall_windows.go#L266
@jstarks any ideas?
cc @alexbrainman @zx2c4 @bcmills @gopherbot add release-blocker OS-Windows