phip1611 / docx4j-search-and-replace-util

Docx4JSRUtil library helps you to search and replace text inside docx-Documents parsed by Docx4J.
MIT License

File corrupted after text replace #13

Open metallica33 opened 1 year ago

metallica33 commented 1 year ago

I am using this utility to replace text in a docx file. When I save the resulting docx file, it does not open in MS Word; Word shows the error in the attached screenshot. Converting the same document to PDF with Docx4J.toPDF, however, produces a correct PDF file.

Here is the code -

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;
import org.docx4j.Docx4J;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import de.phip1611.Docx4JSRUtil;

// try-with-resources closes all three streams, even if an exception is thrown
try (InputStream templateInputStream = new FileInputStream("C:/Documents/original.docx");
     OutputStream docxOs = new FileOutputStream("C:/Documents/document.docx");
     OutputStream pdfOs = new FileOutputStream("C:/Documents/document.pdf")) {
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(templateInputStream);
    // Restrict font discovery to a few common fonts (only matters for the PDF export)
    String regex = ".*(calibri|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";
    PhysicalFonts.setRegex(regex);
    Map<String, String> replaceMap = new HashMap<>();
    replaceMap.put("<<user_name>>", "Jon Doe");
    replaceMap.put("<<user_email>>", "jon_doe@example.com");
    Docx4JSRUtil.searchAndReplace(wordMLPackage, replaceMap); // If I comment this out, the docx file is saved correctly
    Docx4J.save(wordMLPackage, docxOs); // The saved docx file does not open in MS Word
    Docx4J.toPDF(wordMLPackage, pdfOs); // The PDF file is created correctly
} catch (Docx4JException | IOException e) {
    e.printStackTrace();
}

[Attached image: Screenshot 2023-03-24 224328]

phip1611 commented 1 year ago

Hi @metallica33, I have never experienced this.

metallica33 commented 1 year ago

docx4j-search-and-replace-util: v1.0.7
docx4j: v8
MS Word: 365

phip1611 commented 1 year ago

Could you share the docx file with me and give me instructions for a minimal reproducer, please?

metallica33 commented 1 year ago

I have attached the files here. document.docx is the file saved after the text replacement. Just run the code provided earlier and it will create the document.pdf and document.docx files.

Attachments: document.docx, document.pdf, original.docx
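
To narrow down which part of the saved file Word might be rejecting, one option is to open the .docx as a ZIP archive and check that each XML part is at least well-formed. The following is only a diagnostic sketch using the JDK (Java 11+ assumed); the class name DocxWellFormedCheck and the method firstMalformedPart are hypothetical and not part of docx4j or this library. A clean result does not prove the file is valid for Word, since Word can also reject well-formed XML that violates the schema.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of docx4j or docx4j-search-and-replace-util.
final class DocxWellFormedCheck {

    /**
     * Returns null if every .xml/.rels part in the archive parses as XML,
     * otherwise "<part name>: <parser message>" for the first part that fails.
     */
    static String firstMalformedPart(InputStream docx)
            throws IOException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // docx parts carry no DOCTYPE, so it is safe to forbid one
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (!name.endsWith(".xml") && !name.endsWith(".rels")) {
                    continue; // skip images, fonts and other binary parts
                }
                byte[] bytes = zip.readAllBytes(); // reads the current entry only
                try {
                    factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
                } catch (SAXException e) {
                    return name + ": " + e.getMessage();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxWellFormedCheck <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Path.of(args[0]))) {
            String problem = firstMalformedPart(in);
            System.out.println(problem == null ? "all XML parts are well-formed" : problem);
        }
    }
}
```

Running it against both original.docx and document.docx would show whether the replacement step introduced a part that no longer parses.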

phip1611 commented 1 year ago

Sorry, I have not had time to look into this yet. I hope to get to it sometime in the next few days.

ThiagoDosSantos commented 1 year ago

I was facing the same issue while using docx4j v6.1.2. After a couple of Google searches, I ended up using these libraries:

<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-core</artifactId>
    <version>8.3.9</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-export-fo</artifactId>
    <version>8.3.9</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
    <version>8.3.9</version>
</dependency>
<dependency>
    <groupId>jakarta.xml.bind</groupId>
    <artifactId>jakarta.xml.bind-api</artifactId>
    <version>4.0.0</version>
</dependency>
<dependency>
    <groupId>org.glassfish.jaxb</groupId>
    <artifactId>jaxb-runtime</artifactId>
    <version>4.0.3</version>
</dependency>
<dependency>
    <groupId>de.phip1611</groupId>
    <artifactId>docx4j-search-and-replace-util</artifactId>
    <version>1.0.7</version>
</dependency>
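
The list above combines docx4j 8.3.9 artifacts with the Jakarta JAXB API 4.0.0 (package jakarta.xml.bind, as opposed to the older javax.xml.bind). When debugging a setup like this, it may help to confirm which JAXB API classes are actually visible at runtime. The following is a minimal sketch using only the JDK; the class name JaxbApiCheck and its methods are hypothetical, and whether the result has anything to do with the corrupted file is an assumption this thread does not confirm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of docx4j or docx4j-search-and-replace-util.
final class JaxbApiCheck {

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isLoadable(String className) {
        try {
            // 'false' means: locate the class, but do not run its static initializers
            Class.forName(className, false, JaxbApiCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Reports the presence of the javax and jakarta flavours of the JAXB API. */
    static Map<String, Boolean> report() {
        Map<String, Boolean> result = new LinkedHashMap<>();
        result.put("javax.xml.bind.JAXBContext", isLoadable("javax.xml.bind.JAXBContext"));
        result.put("jakarta.xml.bind.JAXBContext", isLoadable("jakarta.xml.bind.JAXBContext"));
        return result;
    }

    public static void main(String[] args) {
        report().forEach((name, present) ->
                System.out.println(name + " -> " + (present ? "present" : "missing")));
    }
}
```

Run it with the same classpath as the application (for example from the same Maven module) so that the result reflects the dependencies actually resolved.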

phip1611 commented 9 months ago

Sorry, I don't have the capacity to investigate this.