dotnet / Open-XML-SDK

Open XML SDK by Microsoft
https://www.nuget.org/packages/DocumentFormat.OpenXml/
MIT License
4.05k stars 547 forks source link

Cloning sheet doesn't clone Drawing part #1787

Open SzymonSel opened 2 months ago

SzymonSel commented 2 months ago

Describe the bug I have a template xlsx file, with template sheet, which in turn are cloned and their content is replaced ie. Cell values and images. After opening the file, the cell values are unique for each cloned Sheet except for the images which are the same, namely the last replaced image.

Screenshots

image

To Reproduce I have this code which clones a given Sheet:

    public static async void CloneSheet(SpreadsheetDocument spreadSheet, string sheetName, string newSheetName) {
          // Get the source sheet
          var sheets = spreadSheet.WorkbookPart.Workbook.Sheets;
          var sourceSheet = sheets.Elements<Sheet>().FirstOrDefault(s => s.Name == sheetName);
          if (sourceSheet == null) {
              throw new ArgumentException($"Sheet with name {sheetName} does not exist.");
          }

          // Get the source worksheet part
          var sourceSheetPart = (WorksheetPart)spreadSheet.WorkbookPart.GetPartById(sourceSheet.Id);

          // Create a new worksheet part
          var newSheetPart = spreadSheet.WorkbookPart.AddNewPart<WorksheetPart>();
          newSheetPart.Worksheet = (Worksheet)sourceSheetPart.Worksheet.Clone();

          // Clone the relationships
          foreach (var rel in sourceSheetPart.Parts) {
              newSheetPart.AddPart(rel.OpenXmlPart, rel.RelationshipId);
          }

          // Clone DrawingsPart and its related ImageParts
          if (sourceSheetPart.DrawingsPart != null) {
              var sourceDrawingsPart = sourceSheetPart.DrawingsPart;
              DrawingsPart newDrawingsPart;

              if (newSheetPart.DrawingsPart == null) {
                  newDrawingsPart = newSheetPart.AddNewPart<DrawingsPart>();
              } else {
                  newDrawingsPart = newSheetPart.DrawingsPart;
              }

              newDrawingsPart.WorksheetDrawing = (WorksheetDrawing)sourceDrawingsPart.WorksheetDrawing.Clone();

              var imagePartsToClone = sourceDrawingsPart.ImageParts.ToList();

              foreach (var imagePart in imagePartsToClone) {
                  var newImagePart = newDrawingsPart.AddImagePart(imagePart.ContentType);

                  using (var stream = imagePart.GetStream()) {
                          newImagePart.FeedData(stream);
                  }

                  // Update the BlipFill.Blip.Embed.Value to reference the new image part
                  foreach (var blip in newDrawingsPart.WorksheetDrawing.Descendants<DocumentFormat.OpenXml.Drawing.Blip>()) {
                      if (blip.Embed.Value == sourceDrawingsPart.GetIdOfPart(imagePart)) {
                          var newId = newDrawingsPart.GetIdOfPart(newImagePart);
                          blip.Embed.Value = newId;
                      }
                  }
              }
          }

          // Create a new sheet and add it to the workbook
          var newSheetId = spreadSheet.WorkbookPart.GetIdOfPart(newSheetPart);
          var newSheet = new Sheet {
              Id = newSheetId,
              SheetId = sheets.Elements<Sheet>().Max(s => s.SheetId.Value) + 1,
              Name = newSheetName // todo: Ensure the new sheet name is unique
          };

          sheets.Append(newSheet);

          // Save the workbook
          spreadSheet.WorkbookPart.Workbook.Save();
      }

Observed behavior What I discovered is that the Clone() method on the sourceDrawingsPart.WorksheetDrawing doesn't work and in effect, make all the cloned sheets share the DrawingsPart.

Expected behavior How can achieve DrawingsPart cloning to work?

Desktop (please complete the following information):

SzymonSel commented 1 month ago

@twsouthwick is there an ETA on this issue?

twsouthwick commented 1 week ago

@SzymonSel sorry for the delay. I've found some issues in this area, but can't repro your setup as I need to know how you created the initial spreadsheet. Can you provide a fully self-contained repro?

SzymonSel commented 1 week ago

I've created a sample workbook in Excel, containing a single sheet with a sample image and some cell content. Please find below some sample code execution. Template file: Book1.xlsx

Sample code:

private async Task<FileViewModel> TemplateStripped() {
    var template_path = _mediaFileStore.Combine("Templates", "Book1.xlsx");
    var templateSheet = "Sheet1";

    using (var ms = new MemoryStream()) {
        using (var oldMemoryStream = await _mediaFileStore.GetFileStreamAsync(template_path)) {
            await oldMemoryStream.CopyToAsync(ms);
        }

        ms.Position = 0;

        using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(ms, true)) {
            for (int i = 0; i < 4; i++) {
                var newParkSheetName = $"New sheet {i}";
                SpreadsheetService.CloneSheet(spreadSheet, templateSheet, newParkSheetName);
            }

            SpreadsheetService.RemoveSheet(spreadSheet, templateSheet);

            spreadSheet.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
            spreadSheet.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
            spreadSheet.WorkbookPart.Workbook.Save();

            SpreadsheetService.ResetView(spreadSheet);
        }

        ms.Position = 0;

        var fileViewModel = new FileViewModel {
            FileName = $"Book1_generated.xlsx",
            ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            FileStream = ms.ToArray()
        };

        return fileViewModel;
    }
}

The resulting, generated file structure is as such: Image

And the generated file itself: Book1_generated.xlsx

As you can surely notice, the Drawing1.xml isn't cloned, but rather shared between the sheets - now replace an image, replaces it in all the sheets - They're the same picture! ("The Office" reference intended ;))

twsouthwick commented 1 week ago

Can you share the methods such as SpreadsheetService.CloneSheet? These are not in the sdk

twsouthwick commented 1 week ago

nevermind - I see it

twsouthwick commented 1 week ago

Ok - still missing some methods:

In the future, something like this would be ideal:

using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using DocumentFormat.OpenXml.Drawing.Spreadsheet;
using System;
using System.IO;
using System.Linq;

var templateSheet = "Sheet1";

using (var ms = File.Open("result.xlsx", FileMode.Create,FileAccess.ReadWrite))
{
    using (var oldMemoryStream = File.OpenRead("Book1.xlsx"))
    {
        await oldMemoryStream.CopyToAsync(ms);
    }

    ms.Position = 0;

    using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(ms, true))
    {
        for (int i = 0; i < 4; i++)
        {
            var newParkSheetName = $"New sheet {i}";
            CloneSheet(spreadSheet, templateSheet, newParkSheetName);
        }

        RemoveSheet(spreadSheet, templateSheet);

        spreadSheet.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
        spreadSheet.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
        spreadSheet.WorkbookPart.Workbook.Save();

        ResetView(spreadSheet);
    }
}

static async void CloneSheet(SpreadsheetDocument spreadSheet, string sheetName, string newSheetName)
{
    // Get the source sheet
    var sheets = spreadSheet.WorkbookPart.Workbook.Sheets;
    var sourceSheet = sheets.Elements<Sheet>().FirstOrDefault(s => s.Name == sheetName);
    if (sourceSheet == null)
    {
        throw new ArgumentException($"Sheet with name {sheetName} does not exist.");
    }

    // Get the source worksheet part
    var sourceSheetPart = (WorksheetPart)spreadSheet.WorkbookPart.GetPartById(sourceSheet.Id);

    // Create a new worksheet part
    var newSheetPart = spreadSheet.WorkbookPart.AddNewPart<WorksheetPart>();
    newSheetPart.Worksheet = (Worksheet)sourceSheetPart.Worksheet.Clone();

    // Clone the relationships
    foreach (var rel in sourceSheetPart.Parts)
    {
        newSheetPart.AddPart(rel.OpenXmlPart, rel.RelationshipId);
    }

    // Clone DrawingsPart and its related ImageParts
    if (sourceSheetPart.DrawingsPart != null)
    {
        var sourceDrawingsPart = sourceSheetPart.DrawingsPart;
        DrawingsPart newDrawingsPart;

        if (newSheetPart.DrawingsPart == null)
        {
            newDrawingsPart = newSheetPart.AddNewPart<DrawingsPart>();
        }
        else
        {
            newDrawingsPart = newSheetPart.DrawingsPart;
        }

        newDrawingsPart.WorksheetDrawing = (WorksheetDrawing)sourceDrawingsPart.WorksheetDrawing.Clone();

        var imagePartsToClone = sourceDrawingsPart.ImageParts.ToList();

        foreach (var imagePart in imagePartsToClone)
        {
            var newImagePart = newDrawingsPart.AddImagePart(imagePart.ContentType);

            using (var stream = imagePart.GetStream())
            {
                newImagePart.FeedData(stream);
            }

            // Update the BlipFill.Blip.Embed.Value to reference the new image part
            foreach (var blip in newDrawingsPart.WorksheetDrawing.Descendants<DocumentFormat.OpenXml.Drawing.Blip>())
            {
                if (blip.Embed.Value == sourceDrawingsPart.GetIdOfPart(imagePart))
                {
                    var newId = newDrawingsPart.GetIdOfPart(newImagePart);
                    blip.Embed.Value = newId;
                }
            }
        }
    }

    // Create a new sheet and add it to the workbook
    var newSheetId = spreadSheet.WorkbookPart.GetIdOfPart(newSheetPart);
    var newSheet = new Sheet
    {
        Id = newSheetId,
        SheetId = sheets.Elements<Sheet>().Max(s => s.SheetId.Value) + 1,
        Name = newSheetName // todo: Ensure the new sheet name is unique
    };

    sheets.Append(newSheet);

    // Save the workbook
    spreadSheet.WorkbookPart.Workbook.Save();
}

Even better would be unit tests like #1814 that linked a repo that has an easily cloneable set of tests.

SzymonSel commented 1 week ago

Ok, got it! I will do better next time. Thanks for the guidndance and sorry for the inconvenience. Below are the missing methods.

public static void ResetView(SpreadsheetDocument spreadSheet, string sheetName = "") {
    WorkbookPart workbookPart = spreadSheet.WorkbookPart;
    Sheet firstSheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault();
    if (firstSheet != null) {
        sheetName = firstSheet.Name;
    }

    Sheet sheet = spreadSheet.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().First(s => s.Name == sheetName);
    string relationshipId = sheet.Id.Value;
    WorksheetPart worksheetPart = (WorksheetPart)spreadSheet.WorkbookPart.GetPartById(relationshipId);

    Row firstRow = worksheetPart.Worksheet.Descendants<Row>().FirstOrDefault();
    Cell firstCell = firstRow?.Descendants<Cell>().FirstOrDefault();
    string firstCellReference = firstCell?.CellReference?.Value;

    SheetViews sheetViews = worksheetPart.Worksheet.GetFirstChild<SheetViews>();
    if (sheetViews == null) {
        sheetViews = new SheetViews();
        worksheetPart.Worksheet.InsertAfter(sheetViews, worksheetPart.Worksheet.Elements<SheetData>().FirstOrDefault());
    }

    SheetView sheetView = sheetViews.Elements<SheetView>().FirstOrDefault();
    if (sheetView == null) {
        sheetView = new SheetView() { WorkbookViewId = 0 };
        sheetViews.Append(sheetView);
    } else {
        sheetView.RemoveAllChildren<Selection>();
    }

    sheetView.RemoveAllChildren<Selection>();

    Selection selection = new Selection() { ActiveCell = "A1", SequenceOfReferences = new ListValue<StringValue>() { InnerText = "A1" } };
    sheetView.Append(selection);

    worksheetPart.Worksheet.Save();
}

public static void RemoveSheet(SpreadsheetDocument spreadSheet, string sheetName) {
    var theSheet = spreadSheet.WorkbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
    if (theSheet == null) {
        return;
    }

    var worksheetPart = (WorksheetPart)spreadSheet.WorkbookPart.GetPartById(theSheet.Id);
    spreadSheet.WorkbookPart.DeletePart(worksheetPart);
    theSheet.Remove();

    if (spreadSheet.WorkbookPart.CalculationChainPart != null) {
        spreadSheet.WorkbookPart.DeletePart(spreadSheet.WorkbookPart.CalculationChainPart);
    }

    spreadSheet.WorkbookPart.Workbook.Save();
    spreadSheet.Save();
}
SzymonSel commented 1 week ago

Even better would be unit tests like #1814 that linked a repo that has an easily cloneable set of tests.

The unit test are a pretty neat idea! I'll have I go at it next time though, since I'm not very fluent with the intricacies of the SDK and I personally struggeld to programaticaly test/debug the root of the problem. It was rather a game of Minesweeper by a 3 year old :)

twsouthwick commented 1 week ago

haha no problem. whatever you can give us is a help - at least something that can be run outside of your environment is a huge help (I can easily copy/paste and run it).

Taking a look now....

twsouthwick commented 1 week ago

I think this is user error and not an issue in the SDK. If I remove the following part:

-    // Clone the relationships
-    foreach (var rel in sourceSheetPart.Parts)
-    {
-        newSheetPart.AddPart(rel.OpenXmlPart, rel.RelationshipId);
-    }

then it does appear to clone it. Am I missing something?

SzymonSel commented 1 week ago

Not quite, this produces a malformed document: Image

Although, as I can see, the drawing parts have indeed been cloned: Image

But nevertheless this time, the ImageParts are missing completly.

The foreach loop is an attempt to mimic a deep clone, since it became clear, that a deep clone is not having place.

twsouthwick commented 1 week ago

This is sounding more like a user error rather than an SDK issue.

@mikeebowen @tomjebo can you take a look at this?

SzymonSel commented 1 week ago

Shouldn't Clone() work recursively? I would imagine not of the code between the lines newSheetPart.Worksheet = (Worksheet)sourceSheetPart.Worksheet.Clone(); and // Create a new sheet and add it to the workbook shouldn't really be needed.