PHPOffice / PHPWord

A pure PHP library for reading and writing word processing documents
https://phpoffice.github.io/PHPWord/

How to use bookmarks in PhpWord #2040

Open xTwisten opened 3 years ago

xTwisten commented 3 years ago

I searched but didn't find anything about bookmarks.

I created a template in Word and inserted some bookmarks, but I don't know how to replace the bookmarks with a string in PHP.

I tried:

$doc = new TemplateProcessor("template.docx");
$doc->setValue('phone', "1234567890");
$doc->saveAs("output.docx");

and

$templateProcessor = new \PhpOffice\PhpWord\TemplateProcessor("template.docx");
$templateProcessor->replaceBookmark('phone', '1234567890');
$templateProcessor->saveAs('output\output_'.date("Y-m-d_H-i-s").'.docx');

I added the following to the "TemplateProcessor.php" file:

public function replaceBookmark($search, $replace)
    {
        if (is_array($replace)) {
            foreach ($replace as &$item) {
                $item = self::ensureUtf8Encoded($item);
            }
        } else {
            $replace = self::ensureUtf8Encoded($replace);
        }

        if (Settings::isOutputEscapingEnabled()) {
            $xmlEscaper = new Xml();
            $replace = $xmlEscaper->escape($replace);
        }

        foreach ($this->tempDocumentHeaders as $index => $xml) {
            // write the result back by key; assigning to $xml alone would discard it
            $this->tempDocumentHeaders[$index] = $this->setBookmarkForPart($search, $replace, $xml);
        }
        $this->tempDocumentMainPart = $this->setBookmarkForPart($search, $replace, $this->tempDocumentMainPart);
        foreach ($this->tempDocumentFooters as $index => $xml) {
            $this->tempDocumentFooters[$index] = $this->setBookmarkForPart($search, $replace, $xml);
        }

    }
    protected function setBookmarkForPart($search, $replace, $documentPartXML)
    {
        // quote the bookmark name so regex metacharacters in it are matched literally
        $pattern = '~<w:bookmarkStart\s+w:id="(\d*)"\s+w:name="'.preg_quote($search, '~').'"\s*\/>()~mU';
        $searchstatus = preg_match($pattern, $documentPartXML, $matches, PREG_OFFSET_CAPTURE);
        if($searchstatus){
            $startbookmark = $matches[2][1];
            $pattern = '~(<w:bookmarkEnd\s+w:id="'.$matches[1][0].'"\s*\/>)~mU';
            $searchstatus = preg_match($pattern, $documentPartXML, $matches, PREG_OFFSET_CAPTURE, $startbookmark);
            if($searchstatus){
                $endbookmark = $matches[1][1];
                $count = 0;
                $startpos = $startbookmark;
                $pattern = '~(<w:t[\s\S]*>)([\s\S]*)(<\/w:t>)~mU';
                do{
                    $searchstatus = preg_match($pattern, $documentPartXML, $matches, PREG_OFFSET_CAPTURE, $startpos);
                    if($searchstatus){
                        if($count == 0){
                            $startpos = $matches[2][1];
                            $endpos = $matches[3][1];
                        }else{
                            $startpos = $matches[1][1];
                            $endpos = $matches[3][1] + 6;
                        }
                        if($endpos > $endbookmark){
                            break;
                        }

                        $documentPartXML = substr($documentPartXML, 0, $startpos) . ($count == 0 ? $replace : '') . substr($documentPartXML, $endpos);
                        $endbookmark = $endbookmark - ($endpos - $startpos);

                        $count ++;
                    }

                }while($searchstatus);

            }
        }

        return $documentPartXML;
    }
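
As a small illustration of the first step in setBookmarkForPart() (finding the start marker and its id with PREG_OFFSET_CAPTURE), here is a minimal standalone sketch. The XML fragment and the variable names are made up for the example, and it assumes the attributes appear in the order w:id, then w:name.

```php
<?php
// Hypothetical sample fragment; a real document part is much larger.
$name = 'phone';
$xml = '<w:p><w:bookmarkStart w:id="3" w:name="phone"/>'
    . '<w:r><w:t>old</w:t></w:r><w:bookmarkEnd w:id="3"/></w:p>';

// preg_quote() keeps special characters in the bookmark name literal.
$pattern = '~<w:bookmarkStart\s+w:id="(\d*)"\s+w:name="' . preg_quote($name, '~') . '"\s*/>~';

// With PREG_OFFSET_CAPTURE every match is a [text, byteOffset] pair.
$found = preg_match($pattern, $xml, $matches, PREG_OFFSET_CAPTURE);

$bookmarkId = $matches[1][0];                                 // captured w:id value
$offsetAfterStart = $matches[0][1] + strlen($matches[0][0]);  // first byte after the start tag
```

The id is what links the start marker to its w:bookmarkEnd, and the offset gives the position from which the search for that end tag can continue.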

Sorry for my English, I'm French.

Thank you,

Have a nice day

Ablont commented 3 months ago

EDIT 19.07.2024: Rewrote the code to be much better; it now works with DOMDocument instead of string searching.

For anyone who has the same issue: I took some of the code @xTwisten wrote (or copied from somewhere else :P) and reworked it so that it works with empty, filled, and multiline bookmarks. I also added some comments and put the code in a custom class so you don't have to edit the base TemplateProcessor.php.

You can use this as you would use the original.

$templateProcessor = new BookmarkTemplateProcessor($filePath);
$templateProcessor->replaceBookmark('Bookmark1', 'test123');
$templateProcessor->replaceBookmark('Bookmark2', 'test456');

My implementation is by no means a perfect solution, but it fits my use case. Feel free to edit or fix the code. I hope this helps somebody. Maybe the maintainers will add this, or use it as a base, to update their library.
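
To make the DOMDocument idea concrete, here is a minimal standalone sketch of locating a bookmark with XPath and replacing the text between its start and end markers. The sample XML and variable names are hypothetical, it only covers a bookmark that already contains text, and it registers the w prefix explicitly rather than relying on automatic registration.

```php
<?php
// WordprocessingML main namespace.
$ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical fragment with one filled bookmark named "phone".
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<w:document xmlns:w="' . $ns . '"><w:body><w:p>'
    . '<w:r><w:t>Phone: </w:t></w:r>'
    . '<w:bookmarkStart w:id="0" w:name="phone"/>'
    . '<w:r><w:t>old value</w:t></w:r>'
    . '<w:bookmarkEnd w:id="0"/>'
    . '</w:p></w:body></w:document>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('w', $ns);

// Find the start marker by name and read its id.
$start = $xpath->query('//w:bookmarkStart[@w:name="phone"]')->item(0);
$id = $start->getAttributeNS($ns, 'id');

// Select w:t nodes located after the start marker and before the matching end marker.
$textNodes = $xpath->query(
    '//w:t[preceding::w:bookmarkStart[@w:id="' . $id . '"]'
    . ' and following::w:bookmarkEnd[@w:id="' . $id . '"]]'
);

// Put the new text into the first node and empty any further ones.
for ($i = 0; $i < $textNodes->length; $i++) {
    $textNodes->item($i)->nodeValue = $i === 0 ? '1234567890' : '';
}

$result = $doc->saveXML();
```

The text run before the bookmark ("Phone: ") is not selected, because the start marker does not precede it in document order.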

<?php

use DOMDocument;
use DOMXPath;
use PhpOffice\PhpWord\Escaper\Xml;
use PhpOffice\PhpWord\Settings;
use PhpOffice\PhpWord\TemplateProcessor;

class BookmarkTemplateProcessor extends TemplateProcessor
{
    /**
     * Replace or add the content of a bookmark.
     * The base of this method is copied from https://github.com/PHPOffice/PHPWord/issues/2040 (Issue from xTwisten)
     * Some variables have been renamed and the code now has comments.
     * 
     * @param string $search The name of the bookmark
     * @param array|string $replace The (new) content of the bookmark
     * @param array $options an array containing options like "ignoreStyleTags" (currently the only option available)
     */
    public function replaceBookmark(string $search, array|string $replace, array $options = [])
    {
        // ensure that the content of $replace is UTF-8 encoded
        if (is_array($replace)) {
            foreach ($replace as &$item) {
                $item = self::ensureUtf8Encoded($item);
            }
        } else {
            $replace = self::ensureUtf8Encoded($replace);
        }

        // escape $replace if required
        if (Settings::isOutputEscapingEnabled()) {
            $xmlEscaper = new Xml();
            $replace = $xmlEscaper->escape($replace);
        }

        // replace header content (write back by key, otherwise the result is discarded)
        foreach ($this->tempDocumentHeaders as $index => $xml) {
            $this->tempDocumentHeaders[$index] = $this->setBookmarkForPart($search, $replace, $xml, $options);
        }

        // replace main content
        $this->tempDocumentMainPart = $this->setBookmarkForPart($search, $replace, $this->tempDocumentMainPart, $options);

        // replace footer content (write back by key, otherwise the result is discarded)
        foreach ($this->tempDocumentFooters as $index => $xml) {
            $this->tempDocumentFooters[$index] = $this->setBookmarkForPart($search, $replace, $xml, $options);
        }
    }

    /**
     * Internal method that handles the replacement/insertion of the bookmark content
     * 
     * @param string $search @see replaceBookmark()
     * @param array|string $replace @see replaceBookmark()
     * @param string $content The target content (xml) - header, main content or footer (@see replaceBookmark())
     * @param array $options @see replaceBookmark()
     */
    protected function setBookmarkForPart(string $search, array|string $replace, string $content, array $options)
    {
        // convert string $replace to array
        if (!is_array($replace)) {
            $replace = [$replace];
        }

        // load xml content into a DOMDocument
        $doc = new DOMDocument;
        $doc->loadXML($content);

        // now get all bookmarkStart-nodes matching the $search in the name
        $docXPath = new DOMXPath($doc);
        $bookmarkStartElements = $docXPath->query('//w:bookmarkStart[@w:name="' . $search . '"]');

        // get parents of elements
        /** @var \DOMNode $bookmarkStart */
        foreach ($bookmarkStartElements as $bookmarkStart) {

            // check if the w:bookmarkEnd-Tag is the next sibling to the start. If NOT then we have to replace the text values
            $alwaysNewParagraph = FALSE;
            $insertAfterParagraph = NULL;
            if ($bookmarkStart->nextSibling->nodeName !== "w:bookmarkEnd") {

                // get bookmark id
                $bookmarkID = $bookmarkStart->attributes['id']->value;

                // get all w:t-tags that are between bookmarkStart and bookmarkEnd
                $tTags = $docXPath->query('//w:t[count(following::w:bookmarkEnd[@w:id="' . $bookmarkID . '"])=1 and count(preceding::w:bookmarkStart[@w:id="' . $bookmarkID . '"])=1]');

                // now loop through all the found w:t-tags and update their content
                foreach ($tTags as $idx => $tTag) {

                    // check if we have a value in our "$replace" array
                    if (isset($replace[$idx])) {
                        $tTag->nodeValue = $replace[$idx];
                    } else {
                        $tTag->nodeValue = "";
                    }
                }

                // check if we have more left in our $replace array
                // if not => we are done
                if (count($replace) <= $tTags->length) {
                    continue;
                }

                // else remove the items we already replaced and add the rest
                $replace = array_slice($replace, $tTags->length);

                // also tell the bottom logic that we always want a new paragraph
                $alwaysNewParagraph = TRUE;
                $insertAfterParagraph = $docXPath->query("./ancestor::w:p", $tTags[$tTags->length - 1])[0];
            }

            // get previous sibling
            $previousSibling = $bookmarkStart->previousSibling;

            // get the w:rPr-tag inside the w:pPr-tag which we need to match the styling
            $pprTag = $bookmarkStart->parentNode->firstChild;
            $rprTag = $docXPath->query('./w:rPr', $pprTag)[0] ?? NULL;

            // replace might be an actual array (not our fake single line array)
            foreach (array_reverse($replace) as $idx => $replaceItem) {

                // now build new element with the replacement content (or use the previous sibling if that is a w:r)
                if ($previousSibling->nodeName === "w:r") {
                    $newRTag = $previousSibling->cloneNode(true);
                    $tTags = $docXPath->query('./w:t', $newRTag);
                    if ($tTags->length > 0) {
                        $tTags[0]->nodeValue = $replaceItem;
                    } else {
                        $newRTag->appendChild($doc->createElement("w:t", $replaceItem));
                    }
                } else {
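                    $newRTag = $doc->createElement("w:r");
                    // the paragraph may have no w:rPr to copy, so only clone it when it exists
                    if ($rprTag !== NULL) {
                        $newRTag->appendChild($rprTag->cloneNode(true));
                    }
                    $newRTag->appendChild($doc->createElement("w:t", $replaceItem));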
                }

                // check if we want to remove some styling tags inside of the w:r-tag
                if (count($options['ignoreStyleTags'] ?? [])) {
                    foreach ($options['ignoreStyleTags'] as $ignoreTag) {

                        // search for the tag by xpath
                        $toRemove = $docXPath->query("./" . $ignoreTag, $newRTag);
                        if ($toRemove->length) {
                            $toRemove[0]->remove();
                        }
                    }
                }

                // first item => always just add the "w:r"-tag
                // we ask for the last item because we reversed the array before!!
                if ($idx === count($replace) - 1 && !$alwaysNewParagraph) {

                    // append child directly after the bookmarkEnd-node (which is the nextSibling to the bookmarkStart-node!)
                    if ($bookmarkStart->nextSibling->nextSibling) {
                        $bookmarkStart->parentNode->insertBefore($newRTag, $bookmarkStart->nextSibling->nextSibling);
                    } else {
                        $bookmarkStart->parentNode->appendChild($newRTag);
                    }

                    // nothing more to do
                    continue;
                }

                // for each additional array entry we need to add a new paragraph
                // first of all we need to get the parent "w:p"-tag (= paragraph)
                $bookmarkParagraph = $bookmarkStart->parentNode;
                while ($bookmarkParagraph->nodeName !== "w:p") {
                    $bookmarkParagraph = $bookmarkParagraph->parentNode;
                }

                // build new paragraph with the replacement content
                $newPTag = $doc->createElement("w:p");
                $newPTag->appendChild($pprTag->cloneNode(true)); // insert the w:pPr-Tag
                $newPTag->appendChild($newRTag);

                // some different handling if we already replaced other values
                if ($alwaysNewParagraph && $insertAfterParagraph) {

                    // add the new paragraph after the last one (which we defined further above)
                    if ($insertAfterParagraph->nextSibling) {
                        $insertAfterParagraph->parentNode->insertBefore($newPTag, $insertAfterParagraph->nextSibling);
                    } else {
                        $insertAfterParagraph->parentNode->appendChild($newPTag);
                    }

                    continue;
                }

                // add the new paragraph after the last one
                if ($bookmarkParagraph->nextSibling) {
                    $bookmarkParagraph->parentNode->insertBefore($newPTag, $bookmarkParagraph->nextSibling);
                } else {
                    $bookmarkParagraph->parentNode->appendChild($newPTag);
                }
            }
        }

        return $doc->saveXML();
    }
}
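
When adapting replaceBookmark(), keep in mind that header and footer parts are stored in arrays, and in PHP assigning to the loop variable of a by-value foreach leaves the array untouched. A small sketch of the difference, using a hypothetical $headers array as a stand-in for $this->tempDocumentHeaders:

```php
<?php
// Hypothetical stand-in: part index => XML string.
$headers = [1 => '<w:hdr>old</w:hdr>', 2 => '<w:hdr>old</w:hdr>'];

// Assigning to the loop variable only changes a local copy.
foreach ($headers as $xml) {
    $xml = str_replace('old', 'new', $xml);
}
$afterByValueLoop = $headers; // still contains "old"

// Writing back by key updates the stored part.
foreach ($headers as $index => $xml) {
    $headers[$index] = str_replace('old', 'new', $xml);
}
```

If header or footer bookmarks are not replaced in the saved file, this is the first place worth checking.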