dotnet / Open-XML-SDK

Open XML SDK by Microsoft
https://www.nuget.org/packages/DocumentFormat.OpenXml/
MIT License

Bug with opening corrupted Open XML documents #1681

Open BenjaminLopVic opened 5 months ago

BenjaminLopVic commented 5 months ago

Describe the bug Opening a corrupted document locks the file.

To Reproduce Steps to reproduce the behavior:

  1. Simulate a corrupted file (simply create a txt file with content in it and change the extension to .docx)
  2. Try opening the file with WordprocessingDocument.Open(...)

Observed behavior The method WordprocessingDocument.Open(...) throws an exception stating that the file is corrupted, but it also keeps the file locked. This can be verified with tools such as "File Locksmith."

Expected behavior The exception should still be thrown, but the file should be released first.

Thanks.

Numpsy commented 5 months ago

I'm seeing this as well. Opening the file myself and passing the stream into WordprocessingDocument.Open instead of the path might be a workaround.
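A minimal sketch of that workaround, assuming the stream overload WordprocessingDocument.Open(Stream, bool) and that a corrupted file surfaces as System.IO.FileFormatException, as reported in this thread. The SafeOpen class and TryRead method are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class SafeOpen
{
    // Hypothetical helper: returns false when the file is not a valid package.
    public static bool TryRead(string path, Action<WordprocessingDocument> action)
    {
        // The caller owns the stream, so it is disposed even if Open throws.
        using var stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read);
        try
        {
            using var doc = WordprocessingDocument.Open(stream, false);
            action(doc);
            return true;
        }
        catch (FileFormatException)
        {
            // Corrupted or non-package file; the using above still releases the handle.
            return false;
        }
    }
}
```

Because the FileStream is created and disposed by the calling code, the file handle should be released whether or not the SDK throws.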

twsouthwick commented 2 months ago

I believe this has been fixed in 3.0.2. Please reopen if you're still seeing this.

Numpsy commented 2 months ago

If I do something like this:

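using System.IO;
using DocumentFormat.OpenXml.Packaging;

class Program
{
    static void Main(string[] args)
    {
        try
        {
            // Throws because the file is not a valid Open XML package.
            using var doc = WordprocessingDocument.Open(@"S:\test\WIDE.doc", false);
        }
        catch
        {
            // Fails as well: the handle from the Open call above was never released.
            using var file = File.Open(@"S:\test\WIDE.doc", FileMode.Open, FileAccess.Read);
        }
    }
}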

On .NET 8.0, the call to Open throws System.IO.FileFormatException with the message "File contains corrupted data.", and then the call to File.Open fails because the file is still locked. That doesn't seem right?

Debugging it, it looks like the call to Package.Open at https://github.com/dotnet/Open-XML-SDK/blob/d9ea8cdf395de8cf61d92b1a7e77173efa14706f/src/DocumentFormat.OpenXml.Framework/Features/StreamPackageFeature.cs#L116 throws, and then that exception escapes into the caller without disposing the FileStream created on the supplied path.
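For illustration, a minimal sketch of the dispose-on-failure pattern that would avoid this kind of leak. This is not the SDK's actual code: OpenValidated and its validate callback are hypothetical, and validate stands in for whatever step (such as Package.Open) may throw on bad data.

```csharp
using System;
using System.IO;

static class StreamGuard
{
    // Hypothetical helper: opens a file and hands the stream to the caller only
    // if the validation step succeeds; otherwise the stream is disposed.
    public static FileStream OpenValidated(string path, Action<Stream> validate)
    {
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
        try
        {
            validate(stream);
            return stream; // Success: ownership passes to the caller.
        }
        catch
        {
            stream.Dispose(); // Failure: release the file handle before rethrowing.
            throw;
        }
    }
}
```

With this shape the original exception still reaches the caller, but the file is no longer locked by the time it does.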

(I don't have an option to reopen the issue myself)

BenjaminLopVic commented 2 months ago

Hello,

I am still experiencing this issue on version 3.0.2 and I am unable to reopen the issue.

Here is some sample code:

using DocumentFormat.OpenXml.Packaging;

Console.WriteLine("Hello, World!");
WordprocessingDocument wDoc = null;
try
{
    wDoc = WordprocessingDocument.Open(@"...\corruptedFile.docx", true);
}
catch (Exception ex)
{
    Console.WriteLine(ex);
}
finally
{
    wDoc?.Dispose();
}
Console.ReadKey();

With this code, the file remains locked by the console application until the process exits.

Thanks.

twsouthwick commented 2 months ago

Thanks for the response - I'll take a look.