dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Issue with zip headers general purpose flags bit 11 for OPC packages. #87658

Closed: tomjebo closed this issue 1 year ago

tomjebo commented 1 year ago

Description

This is from https://github.com/dotnet/Open-XML-SDK/issues/1443

The problem is that when .NET Core is used to create a new archive, bit 11 of the general purpose flags is set, which is not supported by the ISO 29500-2 OPC standard used by Office Open XML (i.e., Office documents). This makes Office documents created by the Open XML SDK, or by any code that uses System.IO.Compression or System.IO.Packaging, non-compliant with ISO 29500-2.

The general purpose flags are described here: http://www.pkware.com/documents/APPNOTE/APPNOTE_6.2.0.txt

Reproduction Steps

Using the Open XML SDK to reproduce this can be simple:

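using DocumentFormat.OpenXml;            // WordprocessingDocumentType
using DocumentFormat.OpenXml.Packaging;  // WordprocessingDocument (built on System.IO.Packaging)

// Create a document by supplying the file path.
using (WordprocessingDocument wordDocument = WordprocessingDocument.Create(args[0], WordprocessingDocumentType.Document))
{
    // ...

    wordDocument.Save();
}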

Target .NET 7.0, 6.0, or 5.0.

Expected behavior

Bit 11 should not be set. Annex C (normative, "ZIP Appnote.txt Clarifications") of the ISO 29500-2 OPC standard contains Table C–5, "Support for modes/structures defined by general purpose bit flags", which details the supported and unsupported bit flags. For the purpose of fixing this, turning off bit 11 would suffice.

Actual behavior

Bit 11 is set to 1.

Regression?

I verified that .NET Framework does not turn on bit 11 in the same scenario with the test code above.

Known Workarounds

None

Configuration

.NET 7.0, Visual Studio 2022, Open XML SDK (using System.IO.Packaging), C#, Windows 11

Other information

No response

tomjebo commented 1 year ago

@carlossanlop @ericstj

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-io-compression See info in area-owners.md if you want to be subscribed.

Author: tomjebo
Assignees: -
Labels: `area-System.IO.Compression`, `untriaged`
Milestone: -

maedula commented 1 year ago

Quick clarification on the expected behavior: turning off bit 11 alone would not suffice. Bits 1 and 2 also have to be set according to the CompressionOption setting, which is currently not the case. Both changes are needed to be compliant with ISO 29500-2.
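To make the bits 1 and 2 point concrete, here is a minimal sketch of one possible mapping. It assumes the Deflate meaning that APPNOTE gives those two bits (normal, maximum, fast, super fast) lines up one-to-one with the CompressionOption members of the same names. The helper name DeflateOptionBits is hypothetical and is not part of System.IO.Packaging.

```csharp
using System;
using System.IO.Packaging; // on .NET (Core) this ships in the System.IO.Packaging NuGet package

// Hypothetical helper: returns the value of bits 1 and 2 of the general purpose
// flags for a given CompressionOption, following APPNOTE's Deflate table.
static int DeflateOptionBits(CompressionOption option) => option switch
{
    CompressionOption.Normal => 0x0,    // bit 2 = 0, bit 1 = 0
    CompressionOption.Maximum => 0x2,   // bit 2 = 0, bit 1 = 1
    CompressionOption.Fast => 0x4,      // bit 2 = 1, bit 1 = 0
    CompressionOption.SuperFast => 0x6, // bit 2 = 1, bit 1 = 1
    _ => 0x0,                           // NotCompressed: entry is stored, assumed to leave both bits clear
};

foreach (CompressionOption option in Enum.GetValues(typeof(CompressionOption)))
{
    Console.WriteLine($"{option} -> 0x{DeflateOptionBits(option):X}");
}
```

Whether ZipArchive exposes enough control to write these bits is a separate question; the sketch only shows the values a compliant writer would presumably need to emit.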

Wraith2 commented 1 year ago

As far as I can see, the only code that sets those flags uses this enumeration:

internal enum BitFlagValues : ushort
{
    IsEncrypted = 0x1,
    DataDescriptor = 0x8,
    UnicodeFileNameAndComment = 0x800
}

so it's not clear how bit 11 is getting set if you're using a ZipArchive.
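For reference, bit n of the 16-bit flag field corresponds to the mask 1 << n, so 0x800 is bit 11. The sketch below only redoes that arithmetic for the three masks quoted above. The constants are copies of the enum values and BitIndex is a hypothetical helper, not runtime code.

```csharp
using System;

// Masks copied from the BitFlagValues enumeration quoted above.
const ushort IsEncrypted = 0x1;
const ushort DataDescriptor = 0x8;
const ushort UnicodeFileNameAndComment = 0x800;

// Returns the index of the single set bit in mask, or -1 if mask is not a single-bit value.
static int BitIndex(ushort mask)
{
    for (int bit = 0; bit < 16; bit++)
    {
        if (mask == 1 << bit) return bit;
    }
    return -1;
}

Console.WriteLine($"IsEncrypted = bit {BitIndex(IsEncrypted)}");                             // bit 0
Console.WriteLine($"DataDescriptor = bit {BitIndex(DataDescriptor)}");                       // bit 3
Console.WriteLine($"UnicodeFileNameAndComment = bit {BitIndex(UnicodeFileNameAndComment)}"); // bit 11
```

So the UnicodeFileNameAndComment member is the one that corresponds to bit 11.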

ericstj commented 1 year ago

The general purpose flags are described here: http://www.pkware.com/documents/APPNOTE/APPNOTE_6.2.0.txt

That is out of date: 6.2.0 is from 2004. A better reference is https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.9.TXT. Bit 11 was defined in 6.3.2 (2007) to support file names and comment fields encoded as UTF-8.

The problem here is that ZipPackage is hard-coding UTF-8 when creating the archive:

https://github.com/dotnet/runtime/blob/19a088e3a12317bdd1d24b33b70c8e92de6330d4/src/libraries/System.IO.Packaging/src/System/IO/Packaging/ZipPackage.cs#L261
https://github.com/dotnet/runtime/blob/19a088e3a12317bdd1d24b33b70c8e92de6330d4/src/libraries/System.IO.Packaging/src/System/IO/Packaging/ZipPackage.cs#L328

This is what's forcing the zip to always set this bit, regardless of the content.

I believe the fix here is to just stop specifying encoding in ZipPackage and instead let ZipArchive(Entry) determine which encoding to use based on the filenames/comments.
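A small way to observe the difference, as a sketch rather than the actual fix: write a one-entry archive with and without an explicit entry-name encoding, then read the general purpose flags from the first local file header (a 4-byte signature, a 2-byte version, then the 2-byte little-endian flags). The helper name FirstEntryFlags and the entry name are illustrative assumptions.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: writes a single entry into an in-memory archive and returns
// the general purpose flags stored in the first local file header.
static ushort FirstEntryFlags(string entryName, Encoding? entryNameEncoding)
{
    using var buffer = new MemoryStream();
    using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create,
                                        leaveOpen: true, entryNameEncoding: entryNameEncoding))
    {
        ZipArchiveEntry entry = archive.CreateEntry(entryName);
        using Stream content = entry.Open();
        content.WriteByte((byte)'x');
    }

    byte[] bytes = buffer.ToArray();
    // Offsets 6 and 7 hold the flags, least significant byte first.
    return (ushort)(bytes[6] | (bytes[7] << 8));
}

const int Bit11 = 1 << 11; // 0x800, UnicodeFileNameAndComment

ushort explicitUtf8 = FirstEntryFlags("word/document.xml", Encoding.UTF8);
ushort defaultEncoding = FirstEntryFlags("word/document.xml", null);

Console.WriteLine($"Explicit UTF-8:   flags=0x{explicitUtf8:X4}, bit 11 set: {(explicitUtf8 & Bit11) != 0}");
Console.WriteLine($"Default encoding: flags=0x{defaultEncoding:X4}, bit 11 set: {(defaultEncoding & Bit11) != 0}");
```

If the analysis above is right, the explicit UTF-8 case should report bit 11 as set even though the entry name is plain ASCII, while the default case should leave it clear.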