dotnet / Open-XML-SDK

Open XML SDK by Microsoft
https://www.nuget.org/packages/DocumentFormat.OpenXml/
MIT License

Performance degradation for WordprocessingDocument (v3.1.0 vs v2.20.0 vs v2.5.0) #1770

Open bhargavgaglani07 opened 1 month ago

bhargavgaglani07 commented 1 month ago

Describe the bug: Updating a Word document has become significantly slower in recent versions. In v2.5.0, updating a Word document (SampleDocument.docx) takes ~310 ms on average, but the same operation takes ~1610 ms (about 5x) in v2.20.0 and ~1370 ms (about 4x) in v3.1.0.

To Reproduce: Use the sample code below with the attached document (SampleDocument.docx).


using System;
using System.Diagnostics;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

string filePath = @"SampleDocument.docx";
string outPath = @"UpdatedSampleDocument.docx";
Stopwatch stopwatch = new Stopwatch();

// Copy the file into an expandable in-memory stream, then rewind it
var bytes = File.ReadAllBytes(filePath);
MemoryStream stream = new MemoryStream();
stream.Write(bytes, 0, bytes.Length);
stream.Position = 0;

stopwatch.Start();

// Open the document for editing
using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(stream, true))
{
    // Prepend five timestamp paragraphs to the body
    for (int i = 0; i < 5; i++)
    {
        wordDocument.MainDocumentPart.Document.Body.PrependChild(new Paragraph(new ParagraphProperties(), new Run(new Text(DateTime.Now.ToString()))));
    }

    //// Save the document

    //// for v2.20.0
    wordDocument.MainDocumentPart.Document.Save();
    wordDocument.Package.Flush();

    //// for v3.1.0 & v2.20.0
    wordDocument.Save();
}

stopwatch.Stop();

// Write the updated document to disk (outside the timed region)
File.WriteAllBytes(outPath, stream.ToArray());
stream.Dispose();
Console.WriteLine($"Time taken to save the document: {stopwatch.ElapsedMilliseconds} ms");

Observed behavior: Upgrading to the latest version significantly degrades performance for the operation described above.

Expected behavior: Upgrading to the latest version should not degrade performance.

Desktop (please complete the following information):

ChristianSfr commented 1 month ago

The regression doesn't seem specific to the type of generated document.

I see the same thing when generating a spreadsheet of 3600 rows by 14 columns: the same .NET 8 code takes 45 s with 3.1.0 instead of 13 s with 3.0.2.