NPE when generating pdf from docx at XWPFDocumentVisitor.java:453

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
   PDF generation

What is the expected output? What do you see instead?
   PDF document. NPE.

What version of the product are you using? On what operating system?
   1.0.0. Win7

Please provide any additional information below.

Conversion of the attached document Doc1.docx (having only 4 lines: a title, a 
subtitle, a section title, and a normal line) results in NPE (see below). 
However, when removing the subtitle, the conversion will be successful, but the 
section title will be indented (see Doc1.pdf).

I've just read the article comparing conversion with JODConverter, docx4j and 
XDocReport 
(http://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/) 
and wanted to test your software before deciding whether to use it in our 
product for docx to pdf conversion. However, I wasn't able to convert even the 
first simple document, and the second try (without the subtitle) resulted in a 
major formatting change. Would you please investigate the NPE and the 
formatting issue? Is there any information about the limitations of XDocReport?

Best regards,
Attila

org.apache.poi.xwpf.converter.core.XWPFConverterException: java.lang.NullPointerException
        at org.apache.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:59)
        at org.apache.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:37)
        at org.apache.poi.xwpf.converter.core.AbstractXWPFConverter.convert(AbstractXWPFConverter.java:45)
        ... 4 more
Caused by: java.lang.NullPointerException
        at org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.getXWPFNum(XWPFDocumentVisitor.java:453)
        at org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.getNumPr(XWPFDocumentVisitor.java:326)
        at org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.visitParagraph(XWPFDocumentVisitor.java:271)
        ... 9 more

Original issue reported on code.google.com by kovacsat...@gmail.com on 26 Mar 2013 at 11:28

GoogleCodeExporter commented 9 years ago

Hi Attila,

First, thanks for attaching your docx. I have added it to our JUnit tests: 
https://code.google.com/p/xdocreport/source/browse/#git%2Fthirdparties-extension%2Forg.apache.poi.xwpf.converter.pdf%2Fsrc%2Ftest%2Fresources%2Forg%2Fapache%2Fpoi%2Fxwpf%2Fconverter%2Fcore 
(see Issue239.docx).

I have fixed your 2 problems and committed the fixes to Git:

 1) The NPE (see https://code.google.com/p/xdocreport/source/detail?r=3bb48debd9cf3ef9ba7ad8810932c88a9ba0700a). Your docx is unusual because its numPr has no numId (the first time I have seen that); the code now checks that numId is not null.
 2) The indentation problem is fixed (the paragraph properties now override the numbering level properties for indentation).

Those fixes will be available in 1.0.1.
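
The two fixes described above can be sketched roughly as follows. This is only an illustration, not the code from the linked commit: NumPr, findNum and resolveLeftIndent are hypothetical stand-ins rather than POI or XDocReport classes, the numbering part is modeled as a plain Map, and the extra guard for a missing numbering part is an assumption added for safety.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Sketch only: every type and method here is a hypothetical stand-in,
// not the real POI or XDocReport API.
final class NumberingFixesSketch {

    /** Stand-in for a paragraph's numPr; numId is null when the element is absent. */
    static final class NumPr {
        final BigInteger numId;
        final BigInteger ilvl;

        NumPr(BigInteger numId, BigInteger ilvl) {
            this.numId = numId;
            this.ilvl = ilvl;
        }
    }

    /**
     * Fix 1: return null ("no list") instead of dereferencing a missing
     * numbering part, numPr or numId.
     */
    static String findNum(Map<BigInteger, String> numbering, NumPr numPr) {
        if (numbering == null || numPr == null || numPr.numId == null) {
            return null;
        }
        return numbering.get(numPr.numId);
    }

    /**
     * Fix 2: an indent set on the paragraph wins; otherwise fall back to the
     * indent of the numbering level. Values are assumed to be in twips.
     */
    static Integer resolveLeftIndent(Integer paragraphIndent, Integer levelIndent) {
        return paragraphIndent != null ? paragraphIndent : levelIndent;
    }

    public static void main(String[] args) {
        Map<BigInteger, String> numbering = new HashMap<>();
        numbering.put(BigInteger.ONE, "decimal list");

        // Normal case: numId present and known.
        System.out.println(findNum(numbering, new NumPr(BigInteger.ONE, BigInteger.ZERO)));
        // numPr without numId, as in Doc1.docx: no exception, just "no list".
        System.out.println(findNum(numbering, new NumPr(null, BigInteger.ZERO)));
        // Paragraph indent overrides the numbering level indent.
        System.out.println(resolveLeftIndent(0, 720));
        System.out.println(resolveLeftIndent(null, 720));
    }
}
```

Returning null for "no list" keeps the sketch short; a caller would then render the paragraph as an ordinary, unnumbered one.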

>Is there any information about the limitations of XDocReport usage?
That is very difficult, because docx is a very hard format; we fix problems as 
users (like you) report them with attached documents.

But today the known limitations are:

 * shapes (which are drawn) are not converted.
 * there are some bugs with tab stops.
 * table borders should be improved.

If you find bugs, don't hesitate to create an issue and attach your (simple) docx.

Regards Angelo

Original comment by angelo.z...@gmail.com on 26 Mar 2013 at 2:16

Changed state: Fixed

GoogleCodeExporter commented 9 years ago

Hi Angelo,

Thanks for the very quick reply/solution.

I've just found another issue. The page body of the generated pdf is taller
than that of the source docx, so the page body and the footer overlap. I've
attached the docx. The source document has 2 pages, with 2 lines on the 2nd
page. The generated pdf has only 1 page, and the lines from the 2nd page of
the original docx appear in the footer of the first page of the pdf.

I can imagine that working with the docx format may be very strange, so I can
accept minor format differences between the input and output. However, it
is a requirement for us that the pages of the original docx match the pages
of the generated pdf. Namely, page x of the pdf must contain the content of
page x of the docx. We have contracts in docx that can be modified until a
certain step of our process flow, after which they are converted to pdf
(which cannot be modified afterwards). Would you please investigate this
issue as well?

Original comment by kovacsat...@gmail.com on 26 Mar 2013 at 4:02

GoogleCodeExporter commented 9 years ago

>Thanks for the very quick reply/solution.
You are welcome

> I've just found another issue.
Please create a new issue and attach your docx (you have not attached it 
here).

You must understand that we develop XDocReport in our spare time, so I cannot 
promise you that I will fix your problem as soon as possible.

Our PDF converter is not perfect and I think you will find several problems. 
But with docx4j and JODConverter you will find problems too. Developing a PDF 
converter is very hard, so I think you will not find a perfect one (I have 
also tested paid products, and they have problems too).

Page numbers are not managed for docx (but our ODT->PDF converter manages 
them, so I must see how it does that).

Regards Angelo

Original comment by angelo.z...@gmail.com on 26 Mar 2013 at 4:32

GoogleCodeExporter commented 9 years ago

I am also facing the same issue, even though I am using the version 1.0.2 jars.

Original comment by amitabh....@gmail.com on 17 Oct 2013 at 8:46

GoogleCodeExporter commented 9 years ago

This issue fixed the NPE for the attached Doc1.docx. If you still get an NPE, 
please attach your docx.

Original comment by angelo.z...@gmail.com on 18 Oct 2013 at 7:01

GoogleCodeExporter commented 9 years ago

I am getting the same NPE; I am using the version 1.0.3 jars.

Original comment by c.ya...@gmail.com on 24 Dec 2013 at 8:37

Attachments:

test1.docx

GoogleCodeExporter commented 9 years ago

Have you tried with 1.0.4 SNAPSHOT?

If you have the same problem, I will try to fix it when I find time.

Regards Angelo

Original comment by angelo.z...@gmail.com on 24 Dec 2013 at 12:48

GoogleCodeExporter commented 9 years ago

Hi guys,

I'm facing a related problem.

- I generated a .docx file (recibo.docx) using Apple Pages (I don't have 
Microsoft Word here), so the document is possibly not 100% compliant.

- Using one of the samples (DocxProjectWithVelocity2PDF.java) as a reference, I 
tried to generate a .pdf file.

- I got the NPE at 
org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.getXWPFNum(CTNumPr numPr) 
because my "document.numbering" is null!

- I tried adding a page number (in the footer or elsewhere in the document), 
but the problem persists.

- So, for people who can't build a proper .docx document, 
"document.numbering == null" is an issue!

And last but not least, I'm using XDocReport version 1.0.4.

Original comment by jose.mar...@gmail.com on 13 Jun 2014 at 3:12

GoogleCodeExporter commented 9 years ago

Hi Jose,

Thanks for your clarification.

Could you please attach the docx that causes the problem? I have tried to 
convert your recibo.docx with our online demo 
(http://xdocreport-converter.opensagres.cloudbees.net/) and it works.

Thanks

Original comment by angelo.z...@gmail.com on 13 Jun 2014 at 3:19

GoogleCodeExporter commented 9 years ago

Hi guys,
I am unable to convert the attached docx to pdf, and I am using the latest 
1.0.4 libraries. Please advise.

Error:
org.apache.poi.xwpf.converter.core.XWPFConverterException: org.apache.xmlbeans.impl.values.XmlValueOutOfRangeException: Invalid integer value: 720.0
    at org.apache.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:59)
    at org.apache.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:37)
    at org.apache.poi.xwpf.converter.core.AbstractXWPFConverter.convert(AbstractXWPFConverter.java:45)
    at com.word.pdf.wordtopdf.createPDF(wordtopdf.java:37)
    at com.word.pdf.wordtopdf.main(wordtopdf.java:18)
Caused by: org.apache.xmlbeans.impl.values.XmlValueOutOfRangeException: Invalid integer value: 720.0
    at org.apache.xmlbeans.impl.values.XmlObjectBase$ValueOutOfRangeValidationContext.invalid(XmlObjectBase.java:285)
    at org.apache.xmlbeans.impl.values.JavaIntegerHolder.lex(JavaIntegerHolder.java:50)
    at org.apache.xmlbeans.impl.values.JavaIntegerHolderEx.set_text(JavaIntegerHolderEx.java:40)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.update_from_wscanon_text(XmlObjectBase.java:1135)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1274)
    at org.apache.xmlbeans.impl.values.JavaIntegerHolder.bigIntegerValue(JavaIntegerHolder.java:58)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.getBigIntegerValue(XmlObjectBase.java:1504)
    at org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTPageMarImpl.getHeader(Unknown Source)
    at org.apache.poi.xwpf.converter.pdf.internal.PdfMapper.visitHeader(PdfMapper.java:180)
    at org.apache.poi.xwpf.converter.pdf.internal.PdfMapper.visitHeader(PdfMapper.java:112)
    at org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.visitHeaderRef(XWPFDocumentVisitor.java:1098)
    at org.apache.poi.xwpf.converter.core.MasterPageManager.visitHeadersFooters(MasterPageManager.java:213)
    at org.apache.poi.xwpf.converter.core.MasterPageManager.addSection(MasterPageManager.java:180)
    at org.apache.poi.xwpf.converter.core.MasterPageManager.compute(MasterPageManager.java:127)
    at org.apache.poi.xwpf.converter.core.MasterPageManager.initialize(MasterPageManager.java:90)
    at org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.visitBodyElements(XWPFDocumentVisitor.java:227)
    at org.apache.poi.xwpf.converter.core.XWPFDocumentVisitor.start(XWPFDocumentVisitor.java:194)
    at org.apache.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:55)
    ... 4 more

Regards,
Kumar

Original comment by tulasi.k...@gmail.com on 14 Nov 2014 at 8:42
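
The trace above fails inside CTPageMarImpl.getHeader, where XMLBeans rejects "720.0" because a decimal point is not valid in an integer value. As a rough illustration of how such a margin could be read tolerantly, here is a minimal sketch. It assumes the raw attribute text is available as a String; LenientTwipsSketch and parseTwips are hypothetical names, not part of POI, XMLBeans or XDocReport, and nothing in this thread confirms how (or whether) the converter handles this case.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Sketch only: a hypothetical helper, not an existing POI/XMLBeans/XDocReport method.
final class LenientTwipsSketch {

    /**
     * Parses a page-margin value written either as an integer ("720") or as a
     * decimal ("720.0"), rounding to the nearest whole twip.
     * Returns null for a missing or blank value; still throws
     * NumberFormatException for text that is not a number at all.
     */
    static BigInteger parseTwips(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        return new BigDecimal(raw.trim())
                .setScale(0, RoundingMode.HALF_UP)
                .toBigIntegerExact();
    }

    public static void main(String[] args) {
        // A strict integer parse rejects the value, much like the trace above.
        try {
            new BigInteger("720.0");
        } catch (NumberFormatException e) {
            System.out.println("strict parse rejected 720.0");
        }
        // The lenient parse accepts both spellings.
        System.out.println(parseTwips("720.0"));
        System.out.println(parseTwips("720"));
    }
}
```

Rounding rather than truncating is a design choice made for this sketch; for a value such as "720.0" both give the same result.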

karurkarthi / xdocreport

NPE when generating pdf from docx at XWPFDocumentVisitor.java:453 #239