EvotecIT / OfficeIMO

Fast and easy to use cross-platform .NET library that creates or modifies Microsoft Word (DocX) and later also Excel (XLSX) files without installing any software. Library is based on Open XML SDK
MIT License

Issue with Text Replacement in Multi-Line Text Boxes #224

Closed HamzaBeg closed 6 months ago

HamzaBeg commented 6 months ago

I am experiencing a problem with text replacement in multi-line text boxes using OfficeIMO.Word. Specifically, only placeholders in the first line of a text box are recognized and replaced; placeholders in subsequent lines are ignored. This prevents us from generating dynamic documents in which placeholders span multiple lines within a text box.

Detailed Issue Description: In scenarios where I attempt to replace placeholders across multiple lines within a single text box, only the placeholders in the first line are detected and replaced. Any subsequent placeholders that match the replacement criteria remain unaffected. This behavior suggests that the text replacement function may not be fully iterating through all text segments within a text box. During my debugging sessions, I found that the TextBoxes array, which should contain all lines of text within a text box, only includes text from the first line. This observation indicates that the function responsible for populating this array might not be capturing the entire content of text boxes, thereby affecting the completeness of the text replacement process.

Code Snippet Demonstrating the Issue:

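using System.Collections.Generic;
using OfficeIMO.Word;

// Load the document that contains the text box (path points to the .docx file).
using WordDocument word1Doc = WordDocument.Load(path);

// Placeholder name -> replacement value.
var replacePlaceholderDictionary = new Dictionary<string, string>
{
    { "PLACEHOLDER_ONE", "Placeholder1" },
    { "PLACEHOLDER_TWO", "Placeholder2" },
};

foreach (var term in replacePlaceholderDictionary)
{
    // Escaped braces: produces e.g. "{PLACEHOLDER_ONE}".
    string pattern = $"{{{term.Key}}}";

    // Replace the placeholder in every text box of the document.
    word1Doc.TextBoxes.ForEach(
        textBox =>
            textBox.Text = textBox.Text.Replace(pattern, term.Value)
    );
}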

Here is a picture of the text box in Word: image

And here is the result of my code: image

Steps to Reproduce:

  1. Create a Word document and insert a text box.
  2. Within the text box, input text with placeholders spanning multiple lines, for example:
     Line 1: {PLACEHOLDER_ONE}
     Line 2: {PLACEHOLDER_TWO}
  3. Use the text replacement function to replace {PLACEHOLDER_ONE} and {PLACEHOLDER_TWO} with specific values.
  4. Observe that only the placeholder in Line 1 is replaced, while Line 2 remains unchanged.

Could you please investigate this issue? I would appreciate guidance on whether this is a known limitation, and if there are any potential workarounds or upcoming fixes. Additionally, information on when a fix might be expected would be highly beneficial.

Technical Environment:

OfficeIMO Version: OfficeIMO.Word Version="0.13.0"
Operating System: Windows 11

PrzemyslawKlys commented 6 months ago

The issue comes from an incorrect implementation of the WordTextBox Text and WordParagraph properties. Those should actually be lists.

image

image

This should be fixed in:

https://github.com/EvotecIT/OfficeIMO/blob/e081e2a1781faff54e0079b04fc32ea3e42e1c0b/OfficeIMO.Word/WordTextBox.cs#L67-L105

Instead of taking only the first child, we should get all runs and create a List for each of the two properties. After that everything should be fine.
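
To illustrate the difference between reading only the first child and reading all of them, here is a minimal, self-contained sketch. It does not use OfficeIMO or the Open XML SDK: the text box is modeled as a plain List<string> with one entry per line, and the helper names (ReplaceAll, ReplaceFirstOnly, ReplaceEvery) are hypothetical, chosen only for this illustration. It is not the actual WordTextBox code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: a text box is modeled as one string per paragraph (line).
var textBoxParagraphs = new List<string>
{
    "Line 1: {PLACEHOLDER_ONE}",
    "Line 2: {PLACEHOLDER_TWO}",
};

var replacements = new Dictionary<string, string>
{
    { "PLACEHOLDER_ONE", "Placeholder1" },
    { "PLACEHOLDER_TWO", "Placeholder2" },
};

// Applies every {KEY} -> value replacement to a single string.
string ReplaceAll(string text)
{
    foreach (var term in replacements)
    {
        text = text.Replace($"{{{term.Key}}}", term.Value);
    }
    return text;
}

// Models the reported behavior: only the first paragraph is visible,
// so only the first paragraph gets rewritten.
List<string> ReplaceFirstOnly(List<string> paragraphs)
{
    var result = new List<string>(paragraphs);
    if (result.Count > 0)
    {
        result[0] = ReplaceAll(result[0]);
    }
    return result;
}

// Models the proposed direction: every paragraph is visible and rewritten.
List<string> ReplaceEvery(List<string> paragraphs)
{
    return paragraphs.Select(p => ReplaceAll(p)).ToList();
}

var firstOnly = ReplaceFirstOnly(textBoxParagraphs);
var every = ReplaceEvery(textBoxParagraphs);

Console.WriteLine(string.Join(" | ", firstOnly)); // Line 1: Placeholder1 | Line 2: {PLACEHOLDER_TWO}
Console.WriteLine(string.Join(" | ", every));     // Line 1: Placeholder1 | Line 2: Placeholder2
```

The first output line matches what the issue reports (Line 2 untouched); the second is what exposing all paragraphs as a list would allow.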