spf13 / afero

A FileSystem Abstraction System for Go
Apache License 2.0

Out of memory when reading large file inside zip file via zipfs #414


goodplayer commented 10 months ago

Hi. I am trying to read a large file (original size ~41 GB) inside a zip file, but I get the following error:

runtime: VirtualAlloc of 5471715328 bytes failed with errno=1455
fatal error: out of memory

runtime stack:
runtime.throw({0xfe6aa6?, 0xc6fabd7000?})
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/panic.go:1077 +0x65 fp=0x7b22bffc58 sp=0x7b22bffc28 pc=0xbb7345
runtime.sysUsedOS(0xc5f8328000, 0x14623c000)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/mem_windows.go:83 +0x1bb fp=0x7b22bffcb8 sp=0x7b22bffc58 pc=0xb96f1b
runtime.sysUsed(...)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/mem.go:77
runtime.(*mheap).allocSpan(0x1397fe0, 0xa311e, 0x0, 0xfd?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/mheap.go:1351 +0x487 fp=0x7b22bffd58 sp=0x7b22bffcb8 pc=0xba8487
runtime.(*mheap).alloc.func1()
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/mheap.go:968 +0x5c fp=0x7b22bffda0 sp=0x7b22bffd58 pc=0xba7c3c
traceback: unexpected SPWRITE function runtime.systemstack
runtime.systemstack()
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/asm_amd64.s:509 +0x49 fp=0x7b22bffdb0 sp=0x7b22bffda0 pc=0xbe4ca9

goroutine 1 [running]:
runtime.systemstack_switch()
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/asm_amd64.s:474 +0x8 fp=0xc1783078e8 sp=0xc1783078d8 pc=0xbe4c48
runtime.(*mheap).alloc(0x14623c000?, 0xa311e?, 0x0?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/mheap.go:962 +0x5b fp=0xc178307930 sp=0xc1783078e8 pc=0xba7b9b
runtime.(*mcache).allocLarge(0x0?, 0x14623c000, 0x80?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/mcache.go:234 +0x85 fp=0xc178307978 sp=0xc178307930 pc=0xb95de5
runtime.mallocgc(0x14623c000, 0x0, 0x0)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/malloc.go:1127 +0x4f6 fp=0xc1783079e0 sp=0xc178307978 pc=0xb8d276
runtime.growslice(0xc4f3492000, 0xc0017300a0?, 0xc3dbf0f000?, 0xfde?, 0xfde?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/runtime/slice.go:266 +0x4cf fp=0xc178307a50 sp=0xc1783079e0 pc=0xbcbdef
github.com/spf13/afero/zipfs.(*File).fillBuffer(0xc001730000, 0xc18bc5?)
        D:/gopath/pkg/mod/github.com/spf13/afero@v1.11.0/zipfs/file.go:37 +0x165 fp=0xc178307ad8 sp=0xc178307a50 pc=0xf115a5
github.com/spf13/afero/zipfs.(*File).Read(0xc001730000, {0xc000ff7022, 0xfde, 0x8000000000000000?})
        D:/gopath/pkg/mod/github.com/spf13/afero@v1.11.0/zipfs/file.go:62 +0x4c fp=0xc178307b18 sp=0xc178307ad8 pc=0xf117cc
bufio.(*Reader).fill(0xc0000b6600)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/bufio/bufio.go:113 +0x103 fp=0xc178307b50 sp=0xc178307b18 pc=0xed3c83
bufio.(*Reader).ReadSlice(0xc0000b6600, 0xe0?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/bufio/bufio.go:379 +0x29 fp=0xc178307ba0 sp=0xc178307b50 pc=0xed4469
bufio.(*Reader).collectFragments(0xb8d705?, 0x8?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/bufio/bufio.go:454 +0x6d fp=0xc178307c60 sp=0xc178307ba0 pc=0xed470d
bufio.(*Reader).ReadString(0xf331e0?, 0x11?)
        D:/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.5.windows-amd64/src/bufio/bufio.go:501 +0x1f fp=0xc178307d28 sp=0xc178307c60 pc=0xed493f

It looks like the `File` implementation in the zipfs package calls `fillBuffer` to buffer the file's contents from offset 0 up to the current offset before each `Read` returns. That seems necessary to support random access, but when reading a large file sequentially (my use case) it ends up holding the entire decompressed file in memory, which causes the out-of-memory error above.