convert a docx to pdf WITHOUT mergefields

GoogleCodeExporter commented 8 years ago

Hi,

Is it possible to convert a docx to pdf WITHOUT replacing the mergefields?
So let's say, convert a docx to pdf with static data?

Thanks,

Tim

Original issue reported on code.google.com by barbio....@gmail.com on 22 Oct 2012 at 11:12

GoogleCodeExporter commented 8 years ago

Hi Tim,

As Pascal told you, you can convert a docx to pdf without any mergefield.

You can try : http://xdocreport-converter.opensagres.cloudbees.net/ online...

This demo is based on XDocReport 1.0.0, which is not released yet. You can try 
the 0.9.8 docx converter http://code.google.com/p/xdocreport/wiki/XWPFConverter 
but I suggest you use 1.0.0 because it improves the docx converter a lot. Pay 
attention, though: the converter API will change too.

Regards Angelo

Original comment by angelo.z...@gmail.com on 22 Oct 2012 at 12:04

Changed state: Accepted

GoogleCodeExporter commented 8 years ago

Hi Angelo,

What we try to do:
We have our own merge code that merges all mergefields from the template docx 
and results in a new docx where all fields are merged.
We convert the output of this method to an inputstream and pass it to your 
converter to receive a pdf of this generated docx.
When we do this with the code below, we receive the following error:

org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should 
contain a content type part [M1.13]

We use 1.0.0-SNAPSHOT. Is PDFViaITextOptions no longer provided?

This is our current code:
documentMergerService.merge(template, replacementValues, null, null, null,
    result, DocumentType.DOCX);

ByteArrayInputStream resultForOutput =
    new ByteArrayInputStream(((ByteArrayOutputStream) result).toByteArray());

Options options = Options.getTo(ConverterTypeTo.PDF).via(ConverterTypeVia.XWPF);
XWPF2PDFViaITextConverter.getInstance().convert(resultForOutput, out, options);

Thanks,

Tim

Original comment by barbio....@gmail.com on 23 Oct 2012 at 7:16

GoogleCodeExporter commented 8 years ago

Hi Tim,

First, have you tried our live demo 
http://xdocreport-converter.opensagres.cloudbees.net?

> We have our own merge code that merges all mergefields from the template docx
Just one question: why don't you use XDocReport for that? Is your solution 
more powerful? It would be great to share our experiences on this topic.

I know that our Wiki about the converter is out of date for 1.0.0, which is not 
released yet. I will update it as soon as we do the release.

The 1.0.0 release will provide a sample with the docx->pdf converter in the project 
http://code.google.com/p/xdocreport/source/browse?repo=samples#git%2Fsamples%2Ffr.opensagres.xdocreport.samples.docx.converters 

The easiest way to get the unreleased 1.0.0 is to use Maven:

1) add the snapshot repository to your pom.xml

-----------------------------------------------------------------
<repositories>
   <repository>
    <id>sonatype</id>
    <url>http://oss.sonatype.org/content/repositories/snapshots/</url>
   </repository>
</repositories>
-----------------------------------------------------------------

2) add the dependency to your pom.xml
-----------------------------------------------------------------
<dependency>
  <groupId>fr.opensagres.xdocreport</groupId>
  <artifactId>org.apache.poi.xwpf.converter.pdf</artifactId>      
  <version>1.0.0-SNAPSHOT</version>               
</dependency>
-----------------------------------------------------------------

The docx converter API changes in 1.0.0:

* XWPF2PDFViaITextConverter is replaced by PdfConverter.
* PDFViaITextOptions is replaced by PdfOptions.

Your error org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package 
should contain a content type part [M1.13] comes from Apache POI, and it seems 
that your docx is not generated correctly (or perhaps it is a bug in POI; in 
that case, please post a message on the POI forum).
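One quick way to narrow this down before involving POI: a docx is a zip package, and the content types part is stored in it as an entry named "[Content_Types].xml". The sketch below uses only the JDK and checks whether a byte array holds a readable zip with that entry. The class and method names (DocxPackageCheck, hasContentTypesPart, zipWith) are made up for this illustration and are not part of POI or XDocReport; passing the check does not prove the docx is otherwise valid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for this thread; not part of POI or XDocReport.
final class DocxPackageCheck {

    // Returns true if the bytes are a readable zip that contains an entry
    // named "[Content_Types].xml". Unreadable or truncated data yields false.
    static boolean hasContentTypesPart(byte[] data) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("[Content_Types].xml".equals(entry.getName())) {
                    return true;
                }
            }
        } catch (IOException e) {
            return false;
        }
        return false;
    }

    // Builds a one-entry zip in memory, only to exercise the check above.
    static byte[] zipWith(String entryName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write("<Types/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The zip stream is closed here, so the buffer holds a complete archive.
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(hasContentTypesPart(zipWith("[Content_Types].xml")));
        System.out.println(hasContentTypesPart(zipWith("word/document.xml")));
        System.out.println(hasContentTypesPart(new byte[0]));
    }
}
```

In your code, you could call hasContentTypesPart(((ByteArrayOutputStream) result).toByteArray()) right after the merge. If it returns false, the problem is in the bytes produced by the merge step rather than in the converter.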

I suggest you load your docx with docx4j to test it. You can use our live 
demo http://xdocreport-converter.opensagres.cloudbees.net/ where you can select 
docx4j, which also provides a docx->pdf converter.

Could you please attach the docx which causes the problem?

Many thanks

Regards Angelo

Original comment by angelo.z...@gmail.com on 23 Oct 2012 at 8:14

GoogleCodeExporter commented 8 years ago

Tim, 

Sorry for my mistake when I said: 

* XWPF2PDFViaITextConverter is replaced by PdfConverter.
* PDFViaITextOptions is replaced by PdfOptions.

It seems that you are using 
fr.opensagres.xdocreport.converter.docx.poi.itext.XWPF2PDFViaITextConverter, 
which is used to register a converter in our generic ConverterRegistry 
(it registers the odt, docx converters etc).

In your case you want to convert the docx directly, so I suggest you use 
org.apache.poi.xwpf.converter.pdf.PdfConverter directly.

It seems someone had the same problem as you with POI: 
http://apache-poi.1045710.n5.nabble.com/InvalidOperationException-Can-t-open-specified-file-td5524067.html

It seems that he did not close the InputStream correctly after loading it with 
POI. So try to close the InputStream after loading the XWPFDocument. If it 
works, I will fix that too in our 
fr.opensagres.xdocreport.converter.docx.poi.itext.XWPF2PDFViaITextConverter

Regards Angelo

Original comment by angelo.z...@gmail.com on 23 Oct 2012 at 8:27

GoogleCodeExporter commented 8 years ago

Angelo,

The reason why we prefer to merge the fields ourselves is that we then don't 
need to review all our docx templates: Velocity needs a $-sign for all merge 
fields AND is case sensitive for the names of the mergefields. Our templates 
don't have the dollar signs and the fields are not case sensitive...

Nevertheless, I managed to solve most of our "problems" and I'm now able to do 
the merge ourselves and convert the result to pdf with the following code and 
version 1.0.0-SNAPSHOT:
   XWPFDocument document = new XWPFDocument(resultForOutput);
   PdfOptions options = PdfOptions.create();
   PdfConverter.getInstance().convert(document, result, options);

But, as discussed in issue 174, I now have the row height problem again when I 
convert via PdfConverter.getInstance().convert(document, result, options).
Is it possible that this issue is still there even in version 1.0.0-SNAPSHOT 
and was only solved in the code path report.convert(context, options, result)?

Thanks,
Tim

Original comment by barbio....@gmail.com on 24 Oct 2012 at 9:03

GoogleCodeExporter commented 8 years ago

>The reason why we prefer to merge the fields ourselves is that we then don't 
>need to review all our docx templates: Velocity needs a $-sign for all merge 
>fields AND is case sensitive for the names of the mergefields. Our templates 
>don't have the dollar signs and the fields are not case sensitive...

OK, I understand. That's a shame for us :-( But I would like to improve 
XDocReport to manage any syntax (like Docmosis). The idea is to set a 
SyntaxFormater when the report is loaded; the SyntaxFormater replaces the 
custom syntax with Velocity/Freemarker syntax. With this feature you could 
keep your own syntax in our reports. But it's just an idea.
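To make the idea a bit more concrete, here is a rough sketch in plain Java of what such a formatter could do. Nothing here exists in XDocReport: the class name FieldSyntaxSketch, its methods, and the assumption that custom fields appear in the text as a name wrapped in guillemets (the characters U+00AB and U+00BB) are all invented for the illustration. Field names are lower-cased on both the template side and the data side, which is one possible way to get case-insensitive matching with a case-sensitive template engine.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only; XDocReport has no such class.
final class FieldSyntaxSketch {

    // Assumed custom syntax: a field name wrapped in guillemets (U+00AB ... U+00BB).
    private static final Pattern CUSTOM_FIELD =
            Pattern.compile("\u00AB\\s*([A-Za-z_][A-Za-z0-9_]*)\\s*\u00BB");

    // Rewrites every custom field into a Velocity-style reference: "$" + lower-cased name.
    static String toVelocity(String text) {
        Matcher matcher = CUSTOM_FIELD.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1).toLowerCase(Locale.ROOT);
            // quoteReplacement keeps the "$" literal instead of a group reference.
            matcher.appendReplacement(out, Matcher.quoteReplacement("$" + name));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Lower-cases the keys of the replacement values so they match the rewritten fields.
    static Map<String, Object> normalizeKeys(Map<String, ?> values) {
        Map<String, Object> normalized = new TreeMap<>();
        for (Map.Entry<String, ?> entry : values.entrySet()) {
            normalized.put(entry.getKey().toLowerCase(Locale.ROOT), entry.getValue());
        }
        return normalized;
    }

    public static void main(String[] args) {
        System.out.println(toVelocity("Dear \u00ABCustomerName\u00BB,"));
        System.out.println(normalizeKeys(Collections.singletonMap("CUSTOMERNAME", "Tim")));
    }
}
```

A real implementation would have to work on the field elements inside the docx XML rather than on plain strings, so this only shows the name mapping, not the document handling.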

>But, as discussed in issue 174, I have now again the row height problem when I 
>convert via PdfConverter.getInstance().convert(document, result, options)

Please attach your docx; I cannot help you unless I can debug the docx converter with it.

>Is it possible that even in version 1.0.0-SNAPSHOT this issue is still there 
>and was only solved in the code report.convert(context, options, result)

report.convert(context, options, result) uses PdfConverter. So if PdfConverter 
doesn't work, report.convert(context, options, result) doesn't work.

So please attach your docx so that I can try to fix the problem.

Many thanks

Regards Angelo

Original comment by angelo.z...@gmail.com on 24 Oct 2012 at 9:11

GoogleCodeExporter commented 8 years ago

Angelo,

Here is a small part of our docx with the troublesome parts.
I also attached the resulting pdf that comes from the docx.

You can see that I even set the table properties for the rows, but the 
resulting pdf has rows that are too big.
I added the borders in the table just for testing and to better visualize the 
result.

Tim

Original comment by barbio....@gmail.com on 24 Oct 2012 at 9:28

Attachments:

GoogleCodeExporter commented 8 years ago

Hi Tim,

I have fixed your problem with the table height. However, two problems remain 
with your docx: 

1) the inner table is wider than the cell that contains it, so it gets cut off 
by this cell (I don't know how to manage that with iText)
2) the logo header appears, why?

Regards Angelo

Original comment by angelo.z...@gmail.com on 31 Oct 2012 at 4:58

dbarra / xdocreport

convert a docx to pdf WITHOUT mergefields #175