golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.11k stars 17.68k forks source link

os: file write blocks subsequent Open() with "sharing violation" in OneDrive #35219

Open networkimprov opened 5 years ago

networkimprov commented 5 years ago

What version of Go are you using (go version)?

$ go version
go version go1.13.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Windows 10, amd64 (go env not available on that machine)

What did you do?

EDIT: updated reproducer

package main

import ( "io/ioutil"; "time" )

func main() {
   aFile := "z35219.txt"
   for {
      err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
      if err != nil { panic(err) }
      aTmr := time.NewTimer(10*time.Second)
      for aRead := true; aRead; {
         select {
         case <-aTmr.C:
            aRead = false
         default:
            _, err := ioutil.ReadFile(aFile)
            if err != nil { panic(err) }
         }
      }
   }
}

What did you expect to see?

No error; this is the case on Win7.

What did you see instead?

On Win10:

panic: open z35219.txt: The process cannot access the file because
 it is being used by another process.

goroutine 1 [running]: main.main() /.../z35219.go:17 +0x10a

Discussion

I see this intermittently in a real app.

EDIT: the following is related to the initial reproducer I wondered whether GetFileAttributesEx() changed in Windows 10 (or 8) to disallow concurrent read access. But the above program still fails (maybe less often?) on Win10 if I patch stat() to only call CreateFile(), like so:

const READ_CONTROL, FILE_READ_ATTRIBUTES uint32 = 0x20000, 0x80 // from WinAPI docs
h, err = CreateFile(namep, READ_CONTROL|FILE_READ_ATTRIBUTES, FILE_SHARE_READ, nil,
                    OPEN_EXISTING, createFileAttrs, 0)

Those arguments are suggested here: https://stackoverflow.com/questions/52813820/createfile-fails-because-of-sharin-violation-because-some-other-os-process-is-us

Stat(): https://golang.org/src/os/stat_windows.go#L76 Open(): https://golang.org/src/syscall/syscall_windows.go#L266

@jstarks any ideas?

cc @alexbrainman @zx2c4 @bcmills @gopherbot add release-blocker OS-Windows

ianlancetaylor commented 5 years ago

Is this problem new in 1.13?

jstarks commented 5 years ago

I'm unable to reproduce this on the latest version of Windows 10 with Go 1.13.1 (I should probably update...).

@networkimprov, do you have third-party antivirus software installed?

networkimprov commented 5 years ago

Hi John, how long did you let the program run? It could take several minutes to show up

The machine seeing these failures is not mine (I only have Win7) but I've forwarded your Q.

Note Go 1.13.3 has a significant change to the Windows runtime, tho it's prob not a factor here.

jstarks commented 5 years ago

I let it run overnight with no repro.

slonopotamus commented 5 years ago

@networkimprov

You only get SHARING_VIOLATION from CreateFile if someone has the same file already open in a such access mode that is not listed in your sharing modes.

For example, I hit SHARING_VIOLATION in git-lfs/git-lfs#3880: one process calls os.Rename while another tries to os.Open destination file.

os.Stat doesn't do CreateFile, so I think your problem is caused by something different. You could try using Process Monitor to record kernel calls related to your file to find out who is opening in a mode that interferes with your app.

networkimprov commented 5 years ago

Thanks for suggestion re Process Monitor.

Assuming this is due to a legitimate system config (ie not malware) any suggestions for fixes? Patching syscall.Open() to retry N times on sharing-violation in a loop with runtime.Gosched()?

one process calls os.Rename while another tries to os.Open destination file.

I tried really hard to get this fixed... #32088 :-(

slonopotamus commented 5 years ago

From my current knowledge, retrying is the only working option. You may want to look at robustio.go file from Go, it has related stuff, but not exposed for public usage.

rsc commented 5 years ago

But the above program still fails (maybe less often?) on Win10 if I patch stat() to only call CreateFile(),

Did you mean patch open? What business would stat have calling CreateFile?

jstarks commented 5 years ago

stat calls CreateFile in a fallback path since apparently there are cases where GetFileAttributesEx fails.

networkimprov commented 5 years ago

@rsc WinAPI CreateFile() doesn't create files, it opens them :-)

networkimprov commented 5 years ago

The box showing this reportedly runs Norton Lifelock. From systeminfo command: OS Version: 10.0.18362 N/A Build 18362

This is a better reproducer. The problem is triggered by ioutil.WriteFile().

package main

import ( "io/ioutil"; "time" )

func main() {
   aFile := "z35219.txt"
   for {
      err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
      if err != nil { panic(err) }
      aTmr := time.NewTimer(10*time.Second)
      for aRead := true; aRead; {
         select {
         case <-aTmr.C:
            aRead = false
         default:
            _, err := ioutil.ReadFile(aFile)
            if err != nil { panic(err) }
         }
      }
   }
}
iwdgo commented 5 years ago

Issue doesn't reproduce on my laptop. The latest submitted code ran for a couple of hours without issue.

>ver

Microsoft Windows [Version 10.0.18362.418]
jstarks commented 5 years ago

In the past I have seen antivirus software keep files open for analysis after the file has been closed, blocking subsequent opens. I do not know whether Norton Lifelock has this problem. It would be interesting to temporarily disable all such security software on the machine (this can be difficult!) and try the repro again. If it stops reproing, then the best course of action would be to report the issue to Symantec.

If it continues to repro, I would suggest looking for any other atypical driver software in the file system stack. Try running fltmc.exe to see if there are any interesting file system filters installed, for example--compare the result to a stock Windows installation.

In any case, I don't think there's any evidence of a Go bug here, and it seems inappropriate for Go to try to work around reliability issues with basic file system APIs. These bugs need to be fixed in the components that are at fault.

slonopotamus commented 5 years ago

it seems inappropriate for Go to try to work around reliability issues with basic file system APIs

Go does that for its internal needs though.

networkimprov commented 5 years ago

Thanks to the folks trying to reproduce this. It's apparently due to OneDrive! After patching syscall.Open() to retry on sharing-violation, the reproducer eventually fails in Read() with "cloud operation was not completed before the time-out period". (Wow, crappy caching :-)

Here's my syscall.Open() patch:

diff --git a/src/syscall/syscall_windows.go b/src/syscall/syscall_windows.go
index af562a8..aa537e2 100644
--- a/src/syscall/syscall_windows.go
+++ b/src/syscall/syscall_windows.go
@@ -265,6 +265,8 @@ func makeInheritSa() *SecurityAttributes {
 }

+// enable retry on ERROR_SHARING_VIOLATION
+var Open_RetryErrorShare = false

 func Open(path string, mode int, perm uint32) (fd Handle, err error) {
        if len(path) == 0 {
@@ -311,8 +313,14 @@ func Open(path string, mode int, perm uint32) (fd Handle, err error) {
        default:
                createmode = OPEN_EXISTING
        }
-       h, e := CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL, 0)
-       return h, e
+       for i := 0; true; i++ {
+               fd, err = CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL, 0)
+               if err != Errno(32) || !Open_RetryErrorShare || i == 999 {
+                       break
+               }
+               runtime.Gosched()
+       }
+       return
 }

 func Read(fd Handle, p []byte) (n int, err error) {
jstarks commented 5 years ago

Interesting. Are you trying this on a path that is backed up by OneDrive?

networkimprov commented 5 years ago

Yes, the reproducers were all running within a OneDrive tree. We're trying them again now in a local-only path...

jstarks commented 5 years ago

I'll try to get a repro on OneDrive and report the issue internally.

networkimprov commented 5 years ago

We haven't reproduced either failure since moving the directory out of the OneDrive tree.

How likely this is to appear might depend on the quality of your connection to OneDrive, traffic on the LAN, etc.

networkimprov commented 5 years ago

This might be a more effective reproducer:

package main

import ( "io/ioutil"; "os"; "time" )

func main() {
   aFile := "z35219.txt"
   err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
   if err != nil { panic(err) }
   go func() {
      aTmr := time.NewTicker(5*time.Second)
      for {
         <-aTmr.C
         err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
         if err != nil { panic(err) }
      }
   }()
   for {
      aFd, err := os.Open(aFile)
      if err != nil { panic(err) }
      err = aFd.Close()
      if err != nil { panic(err) }
   }
}
jstarks commented 5 years ago

I was able to repro this on OneDrive. We're looking into it internally.

networkimprov commented 5 years ago

@jstarks, any developments? Is Go implicated?