opensagres / xdocreport

XDocReport means XML Document reporting. It is a Java API that merges XML documents created with MS Office (docx), OpenOffice (odt), or LibreOffice (odt) with a Java model to generate reports and, if needed, convert them to another format (PDF, XHTML...).
https://github.com/opensagres/xdocreport

Chinese numbered list is not rendered correctly #631

Open aass5050 opened 10 months ago

aass5050 commented 10 months ago

My docx uses a Chinese numbered list (一. 二. 三. ...). When I convert the docx to PDF, the numbers are rendered as 1. 2. 3. ...

Here are my docx and the converted PDF result:

Docx:

[Screenshot, 2023-11-15 5:00:11 PM]

Pdf:

[Screenshot, 2023-11-15 5:14:49 PM]

The PDF result shows two problems. First, the numbered list has much more spacing than in the docx. Is there any way to reduce the numbering spacing in the PDF? Second, the Chinese numbering was converted to Arabic numerals. Should I set something in PdfOptions, or do I need to add another step? Here is my converter code:

PdfUtilTest.java

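import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.file.Path;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.junit.jupiter.api.Test; // assumes JUnit 5 (package-private test method)

public class PdfUtilTest {
    @Test
    void test() {
        Path path = Path.of(System.getProperty("user.dir"), "../Local Ignore", "test.docx");
        Path outputPath = Path.of(System.getProperty("user.dir"), "../Local Ignore", "outTest.pdf");

        // Close both streams and the document even if the conversion fails
        try (FileInputStream is = new FileInputStream(path.toFile());
             FileOutputStream os = new FileOutputStream(outputPath.toFile());
             XWPFDocument xwpfDocument = new XWPFDocument(is)) {
            PdfUtil.convertToPDF(xwpfDocument, os);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}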

PdfUtil.java

private static final byte[] FONT_DATA;

static {
    // Load the bundled font once; fail fast if the resource is missing
    try (InputStream in = WordTemplate.class.getResourceAsStream("/fonts/NotoSansTC-Regular.ttf")) {
        if (in == null) {
            throw new IllegalStateException("Font resource /fonts/NotoSansTC-Regular.ttf not found");
        }
        FONT_DATA = in.readAllBytes();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

// Font provider that returns Noto Sans TC for every requested font family
private static final IFontProvider CHINESE_FONT_PROVIDER = (familyName, encoding, size, style, color) -> {
    try {
        var bf = BaseFont.createFont("NotoSansTC-Regular.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED, true, FONT_DATA, null);
        var font = new Font(bf, size, style, color);
        if (familyName != null) {
            font.setFamily(familyName);
        }

        return font;
    } catch (IOException e) {
        // Fall back to the default registry if the font cannot be created
        log.error("Initial Chinese font provider error", e);
        return ITextFontRegistry.getRegistry().getFont(familyName, encoding, size, style, color);
    } catch (DocumentException e) {
        throw new RuntimeException(e);
    }
};

public static void convertToPDF(XWPFDocument xwpfDocument, OutputStream outputStream) throws IOException {
    PdfOptions pdfOptions = PdfOptions.create();
    pdfOptions.fontProvider(CHINESE_FONT_PROVIDER);
    PdfConverter.getInstance().convert(xwpfDocument, outputStream, pdfOptions);
    outputStream.close();
}

test.docx

Font: Google Noto Sans Traditional Chinese
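
To narrow down the numbering problem, it may help to check which numbering format the docx actually declares. In WordprocessingML the list format is stored in `word/numbering.xml` as `w:numFmt` elements, and a Chinese numbered list would be expected to carry a value other than `decimal` (for example `taiwaneseCountingThousand` or `chineseCounting`). The following is only a minimal diagnostic sketch using the JDK alone: the class name `NumFmtDump` and the method `listNumFmts` are hypothetical, and the regex scan is a simplification that assumes the usual `w:` namespace prefix rather than doing a full XML parse.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of XDocReport or Apache POI.
public class NumFmtDump {

    // Matches <w:numFmt ... w:val="..."> and captures the attribute value.
    private static final Pattern NUM_FMT =
            Pattern.compile("<w:numFmt\\b[^>]*\\bw:val=\"([^\"]+)\"");

    /** Returns every w:numFmt value found in word/numbering.xml, in document order. */
    public static List<String> listNumFmts(InputStream docx) throws IOException {
        List<String> formats = new ArrayList<>();
        // A docx file is a zip archive; walk its entries until numbering.xml is found.
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/numbering.xml".equals(entry.getName())) {
                    continue;
                }
                // readAllBytes() stops at the end of the current zip entry.
                String xml = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                Matcher m = NUM_FMT.matcher(xml);
                while (m.find()) {
                    formats.add(m.group(1));
                }
            }
        }
        return formats;
    }
}
```

Calling `NumFmtDump.listNumFmts(new FileInputStream("test.docx"))` returns the declared formats as a list of strings. If a Chinese format shows up there while the PDF still prints 1. 2. 3., that would suggest the format is dropped during conversion rather than missing from the document.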