Closed tomjebo closed 1 year ago
@carlossanlop @ericstj
Tagging subscribers to this area: @dotnet/area-system-io-compression See info in area-owners.md if you want to be subscribed.
| Field | Value |
|---|---|
| Author | tomjebo |
| Assignees | - |
| Labels | `area-System.IO.Compression`, `untriaged` |
| Milestone | - |
Quick clarification on the expected behavior: turning off bit 11 alone would not suffice. Bits 1 and 2 also have to be set according to the CompressionOption setting, which is currently not the case. Both changes are needed to be compliant with ISO 29500-2.
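To make the bit 1 and bit 2 requirement concrete, here is a minimal Python sketch of the layout the PKWARE APPNOTE defines for Deflate entries. The names `LEVEL_BITS`, `apply_level_bits`, and `level_from_flags` are hypothetical; the dictionary keys mirror the CompressionOption member names, and treating NotCompressed as "no level bits" is an assumption. It illustrates the flag layout only and is not the .NET implementation.

```python
# Bits 1 and 2 of the general purpose bit flag, as the PKWARE APPNOTE
# defines them for Deflate (method 8): they record the compression level.
LEVEL_BITS = {
    "NotCompressed": 0x0,  # assumption: stored entries carry no level bits
    "Normal": 0x0,         # bit 2 = 0, bit 1 = 0
    "Maximum": 0x2,        # bit 2 = 0, bit 1 = 1
    "Fast": 0x4,           # bit 2 = 1, bit 1 = 0
    "SuperFast": 0x6,      # bit 2 = 1, bit 1 = 1
}

LEVEL_MASK = 0x6  # bits 1 and 2 together


def apply_level_bits(flags: int, option: str) -> int:
    """Return flags with bits 1 and 2 replaced to match the given option."""
    return (flags & ~LEVEL_MASK) | LEVEL_BITS[option]


def level_from_flags(flags: int) -> str:
    """Decode bits 1 and 2 back into a level name (Normal when both are 0)."""
    names = {0x0: "Normal", 0x2: "Maximum", 0x4: "Fast", 0x6: "SuperFast"}
    return names[flags & LEVEL_MASK]
```

For example, `apply_level_bits(0x0, "Maximum")` yields `0x2`, and the other flag bits (such as bit 3, the data descriptor flag) pass through unchanged.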
As far as I can see, the only code that sets those flags uses this enumeration:
`internal enum BitFlagValues : ushort { IsEncrypted = 0x1, DataDescriptor = 0x8, UnicodeFileNameAndComment = 0x800 }`
so it's not clear how bit 11 is getting set if you're using a ZipArchive.
The general purpose flags are described here: http://www.pkware.com/documents/APPNOTE/APPNOTE_6.2.0.txt
That is out of date: 6.2.0 is from 2004. A better reference is https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.9.TXT. Bit 11 was defined in 6.3.2 (2007) to support file names and comment fields encoded as UTF-8.
The problem here is that ZipPackage is hard-coding UTF-8 when creating the archive:
https://github.com/dotnet/runtime/blob/19a088e3a12317bdd1d24b33b70c8e92de6330d4/src/libraries/System.IO.Packaging/src/System/IO/Packaging/ZipPackage.cs#L261
https://github.com/dotnet/runtime/blob/19a088e3a12317bdd1d24b33b70c8e92de6330d4/src/libraries/System.IO.Packaging/src/System/IO/Packaging/ZipPackage.cs#L328
This is what forces the zip to always set this bit, regardless of the content.
I believe the fix here is to just stop specifying encoding in ZipPackage and instead let ZipArchive(Entry) determine which encoding to use based on the filenames/comments.
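The idea of choosing the encoding from the name itself can be sketched in a few lines of Python. This is only an illustration of the decision, under the assumption that "needs UTF-8" means "cannot be encoded as ASCII"; the exact rule ZipArchive applies lives in the .NET source and may differ. `encode_entry_name` and `UTF8_FLAG` are hypothetical names.

```python
from typing import Tuple

UTF8_FLAG = 0x800  # bit 11: file name and comment are UTF-8 encoded


def encode_entry_name(name: str) -> Tuple[bytes, int]:
    """Return the encoded name and the flag bits that encoding requires.

    Assumption: a name that fits in ASCII needs no flag; anything else is
    written as UTF-8 and must set bit 11.
    """
    try:
        return name.encode("ascii"), 0
    except UnicodeEncodeError:
        return name.encode("utf-8"), UTF8_FLAG
```

With this rule, a plain ASCII name such as `word/document.xml` leaves bit 11 clear, and the bit is set only for names that actually contain non-ASCII characters.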
Description
This is from https://github.com/dotnet/Open-XML-SDK/issues/1443
The problem is that when .NET Core is used to create a new archive, bit 11 of the general purpose flags is set, and this bit is not supported by the ISO 29500-2 OPC standard used by Office Open XML (i.e. Office documents). As a result, Office documents created by the Open XML SDK, or by any code that uses System.IO.Compression or System.IO.Packaging, are non-compliant with ISO 29500-2.
The general purpose flags are described here: http://www.pkware.com/documents/APPNOTE/APPNOTE_6.2.0.txt
Reproduction Steps
Using the Open XML SDK to reproduce this can be simple:
Target .NET 7.0, 6.0, or 5.0.
Expected behavior
Bit 11 should not be set. Annex C (normative) of the ISO 29500-2 OPC standard, "ZIP Appnote.txt Clarifications", contains Table C–5, "Support for modes/structures defined by general purpose bit flags", which details the supported and non-supported bit flags. For the purpose of fixing this, turning off bit 11 would suffice.
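One way to check a produced package against this expectation, independent of .NET, is to read the flags back with Python's standard `zipfile` module. The sketch below is a hedged example: `entries_with_bit_11` is a hypothetical helper name, and it inspects the flags as `zipfile` reports them from the central directory, not the local file headers.

```python
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag


def entries_with_bit_11(package) -> list:
    """Return the names of entries whose general purpose flags have bit 11 set.

    `package` is a path or a binary file-like object holding a ZIP archive,
    for example a .docx file.
    """
    with zipfile.ZipFile(package) as archive:
        return [info.filename for info in archive.infolist()
                if info.flag_bits & UTF8_FLAG]
```

An empty list means no entry in the package has bit 11 set; any names returned are the entries that would need attention.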
Actual behavior
Bit 11 is set to 1.
Regression?
I verified that .NET Framework does not turn on bit 11 in the same scenario with the test code above.
Known Workarounds
None
Configuration
.NET 7.0, Visual Studio 2022, Open XML SDK (using System.IO.Packaging), C#, Windows 11
Other information
No response