ruby-docx / docx

a ruby library/gem for interacting with .docx files
MIT License

Replace placeholders in a paragraph #114

Closed HannesMatthias closed 3 years ago

HannesMatthias commented 3 years ago

Hello, is there a way to reliably replace a placeholder within a paragraph?

Sample paragraph: "Hello [FIRST_NAME] [LAST_NAME]. The meetings starts at [DATE]"

The code below doesn't work correctly, because the placeholders get cut off in the middle of a word sometimes.

For instance I get for each "text run" iteration:

Hello [FIRST_NAME] [LAST_   # first iteration
NAME]. The meetings start at [DATE]   # second iteration

Therefore I cannot replace "[LAST_NAME]".

doc.paragraphs.each do |p|
  p.each_text_run do |tr|
    tr.substitute('_placeholder_', 'replacement value')
  end
end

Thanks in advance

WaKeMaTTa commented 3 years ago

doc.paragraphs.each do |p|
  p.each_text_run do |tr|
    tr.substitute('[LAST_NAME]', 'Matthias')
  end
end

HannesMatthias commented 3 years ago

Hi @WaKeMaTTa, this is what I am trying to achieve, but when I output each "tr" I get something like this:

First iteration: [LAST_
Second iteration: NAME] ...

The "each_text_run" splits my placeholders sometimes, which is weird.

WaKeMaTTa commented 3 years ago

mmm I didn't know that.

HannesMatthias commented 3 years ago

I'm just surprised that nobody has noticed it. Do you think it will get fixed? Here is a Word document that reproduces the error.

reproduce.docx

satoryu commented 3 years ago

@HannesMatthias I investigated the docx file you attached. As you may know, a docx file is a zip archive of XML files. One of them is word/document.xml, which holds the main content of the document.

In this file, [EMPLOYEE_LAST_NAME] does not exist as a single string, because it is split into two text runs, [EMPLOYEE_LAST_ and NAME]:

      <w:r w:rsidR="00B53417">
        <w:rPr>
          <w:rFonts w:ascii="Univers" w:hAnsi="Univers"/>
          <w:sz w:val="24"/>
          <w:lang w:val="en-AU"/>
        </w:rPr>
        <w:t>[EMPLOYEE_LAST_</w:t>
      </w:r>
      <w:proofErr w:type="gramStart"/>
      <w:r w:rsidR="00B53417">
        <w:rPr>
          <w:rFonts w:ascii="Univers" w:hAnsi="Univers"/>
          <w:sz w:val="24"/>
          <w:lang w:val="en-AU"/>
        </w:rPr>
        <w:t>NAME]</w:t>
      </w:r>

This is an editor issue: MS Word and Google Docs sometimes split words into separate runs internally, so the stored text differs from what we see on screen.
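Because a placeholder can straddle two runs, a per-run substitute never sees the whole string. One possible workaround is to search the joined text of all runs and then write the result back run by run. The sketch below shows the idea on plain strings standing in for run texts. `replace_across_runs` is a hypothetical helper, not part of the docx gem, and wiring it up to real text runs is left as an assumption.

```ruby
# Hypothetical helper, not part of the docx gem: replaces `placeholder` in a
# list of run texts even when the placeholder is split across several runs.
# Returns a new array with the same number of entries as the input.
def replace_across_runs(run_texts, placeholder, replacement)
  texts = run_texts.map(&:dup)
  return texts if placeholder.empty?

  search_from = 0
  loop do
    joined = texts.join
    start = joined.index(placeholder, search_from)
    break if start.nil?

    stop = start + placeholder.length # exclusive end of the match
    offset = 0                        # position of the current run in `joined`
    inserted = false

    texts.each_with_index do |text, i|
      run_start = offset
      run_end = offset + text.length
      offset = run_end
      next if run_end <= start || run_start >= stop # run not touched by the match

      # Part of this run that belongs to the match.
      from = [start, run_start].max - run_start
      to = [stop, run_end].min - run_start
      # The first touched run receives the replacement; later runs only lose
      # their fragment of the placeholder.
      fill = inserted ? '' : replacement
      inserted = true
      texts[i] = text[0...from] + fill + text[to..]
    end

    # Continue after the inserted text so the replacement is not rescanned.
    search_from = start + replacement.length
  end

  texts
end

runs = ['Hello [FIRST_NAME] ', '[LAST_', 'NAME].']
p replace_across_runs(runs, '[LAST_NAME]', 'Matthias')
```

In this sketch the replacement lands in the first run the placeholder touches, so in a real document it would pick up that run's formatting. The number of runs stays the same, and a run that held only a fragment of the placeholder ends up empty.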

I recommend using bookmarks to mark the places where you want to insert text. See Docx::Document#bookmarks and Docx::Elements::Bookmark#insert_text_after.

For example, if your docx file has a bookmark named 'last_name', you can insert the text 'Jack' there with the following code:

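require 'docx'

doc = Docx::Document.open('/path/to/your/word/file.docx')

# doc.bookmarks returns a hash of bookmarks keyed by name.
bookmark = doc.bookmarks['last_name']
bookmark.insert_text_after('Jack')

# Write the result to a new file; this output path is only a placeholder.
doc.save('/path/to/your/word/file-edited.docx')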

I haven't confirmed that this script works as you expect, though.

HannesMatthias commented 3 years ago

Hello, thanks for the kind feedback. I will definitely try it out!