golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

archive/zip: improve Zip64 compatibility with 7z #69415

Open Sangmin-Simon-Lee opened 1 month ago

Sangmin-Simon-Lee commented 1 month ago

Go version

go1.23.1

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/sangmin5.lee/.cache/go-build'
GOENV='/home/sangmin5.lee/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/sangmin5.lee/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/sangmin5.lee/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/sangmin5.lee/dev/go/goroot'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/sangmin5.lee/dev/go/goroot/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/sangmin5.lee/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4058537502=/tmp/go-build -gno-record-gcc-switches'

What did you do?

** Prepare two files with sizes of 5G and 1M

$ touch test_5G
$ shred -n 1 -s 5G test_5G
$ touch test_1M
$ shred -n 1 -s 1M test_1M

** Create a zip file containing the 5G file

zipfile, err := os.Create("s5G.zip")
zipWriter := zip.NewWriter(zipfile)
newfile, err := os.Open("test_5G")

fileInfo, err := newfile.Stat()
header, err := zip.FileInfoHeader(fileInfo)

header.Name = "test_5G"
header.Method = zip.Deflate

writer, err := zipWriter.CreateHeader(header)
_, err = io.Copy(writer, newfile)

** Get 7-Zip 23.01 or later from https://sourceforge.net/projects/sevenzip/files/7-Zip/23.01/ and try to add the 1M file to the created zip

$ 7zz a s5G.zip test_1M

What did you see happen?

7-Zip (z) 23.01 (x86) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
 32-bit ILP32 locale=en_US.utf8 Threads:96 OPEN_MAX:131072, ASM

Open archive: s5G.zip

WARNINGS: Headers Error

--
Path = s5G.zip
Type = zip
WARNINGS: Headers Error
Physical Size = 5370519708
64-bit = +
Characteristics = Zip64

Scanning the drive: 1 file, 1048576 bytes (1024 KiB)

Updating archive: s5G.zip

Keep old data in archive: 1 file, 5368709120 bytes (5120 MiB)
Add new data to archive: 1 file, 1048576 bytes (1024 KiB)

System ERROR: E_NOTIMPL : Not implemented

What did you expect to see?

7z should finish without errors ("Everything is Ok"), and listing the archive should show both files:

$ unzip -l 5G.zip
Archive:  5G.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
  1048576  2024-09-11 07:47   test_1M
5368709120  2024-09-11 07:50   test_5G
---------                     -------
5369757696                     2 files

Sangmin-Simon-Lee commented 1 month ago

I'd like to propose a potential fix for this issue.

https://go-review.googlesource.com/c/go/+/612595

timothy-king commented 1 month ago

CC @dsnet, @bradfitz, @ianlancetaylor.

ianlancetaylor commented 1 month ago

Quoting https://go.dev/cl/612595:

Symptoms: An error occurs when 7z adds or updates files in a zip archive that includes files over 4GB.

Reasons:

  1. Header Inconsistency: The main header writes a 32-bit value even though a 64-bit value is written to the Zip64 header. The offset value should be set to uint32max (0xFFFFFFFF) to indicate that a 64-bit value is used.

  2. Zip64 Detection: 7z primarily uses the Extra Field in the Local File Header to detect the Zip64 format. If this field is missing, 7z assumes a 32-bit Data Descriptor, which leads to errors. The Extra Field in the Local File Header should include the Zip64 information even when a Data Descriptor is used.
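
Reason 2 depends on whether the extra-field area of a Local File Header contains a Zip64 record. A minimal sketch of such a check follows, assuming the record layout from the ZIP application note: little-endian `id`, then `size`, then `size` bytes of data, with ID 0x0001 for Zip64. The helper `findExtra` and the sample bytes are hypothetical. This is not code from 7z or from archive/zip.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// zip64ExtraID is the header ID of the Zip64 extended information
// extra field in the ZIP application note.
const zip64ExtraID = 0x0001

// findExtra (hypothetical helper) scans an extra-field area, a sequence
// of (uint16 id, uint16 size, data) records in little-endian order, and
// returns the data of the first record whose id matches.
func findExtra(extra []byte, id uint16) ([]byte, bool) {
	for len(extra) >= 4 {
		gotID := binary.LittleEndian.Uint16(extra[0:2])
		size := int(binary.LittleEndian.Uint16(extra[2:4]))
		extra = extra[4:]
		if size > len(extra) {
			return nil, false // truncated record
		}
		if gotID == id {
			return extra[:size], true
		}
		extra = extra[size:]
	}
	return nil, false
}

func main() {
	// Sample area holding only an extended-timestamp record (ID 0x5455).
	noZip64 := []byte{0x55, 0x54, 0x05, 0x00, 0x01, 0, 0, 0, 0}

	// The same area followed by a Zip64 record carrying two uint64 sizes.
	withZip64 := append(append([]byte{}, noZip64...), 0x01, 0x00, 0x10, 0x00)
	withZip64 = append(withZip64, make([]byte, 16)...)

	_, ok1 := findExtra(noZip64, zip64ExtraID)
	data, ok2 := findExtra(withZip64, zip64ExtraID)
	fmt.Println(ok1, ok2, len(data))
}
```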

Solution:

  1. Ensure that the offset value in the main header is set to 0xFFFFFFFF when writing 64-bit values in the Zip64 header.
  2. Include the Zip64 Extra Field in the Local File Header to align with 7z handling of Zip64 archives.

ianlancetaylor commented 1 month ago

CC @dsnet

nightlyone commented 1 month ago

Note that changing this might invalidate checksums based on the full content of zip files for files created before this change.

Go modules containing files greater than 4 GB could be affected.

That behavior can be made opt-in or at least needs a Go experiment flag to opt out.