Closed rpinquie closed 8 years ago
You would use the IConverter API for this, as for any other conversion. The bridge implementations are meant to be used by the converter API under the covers, i.e. make sure that the MS Word conversion bridge is on the class path; the LocalConverter then discovers it automatically:
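import java.io.File;
import java.util.concurrent.Future;
import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;

// Source PDF and target DOCX; the paths are left elided as in the original post.
File pdfFile = new File( ... ), wordFile = new File( ... );
IConverter converter = LocalConverter.make();
Future<Boolean> conversion = converter
    .convert(pdfFile).as(DocumentType.PDF)
    .to(wordFile).as(DocumentType.DOCX)
    .schedule();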
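Since the snippet above ends with a Future<Boolean>, the caller still has to wait for the result. A minimal sketch of that waiting step follows; it uses only the JDK, and the class name ConversionFutures, the method awaitConversion and the timeout handling are assumptions made for illustration, not part of documents4j.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of documents4j: waits on any Future<Boolean>.
final class ConversionFutures {
    private ConversionFutures() {}

    /**
     * Blocks until the scheduled conversion finishes or the timeout elapses.
     * Returns true only if the future completed with Boolean.TRUE.
     */
    static boolean awaitConversion(Future<Boolean> conversion, long timeout, TimeUnit unit)
            throws InterruptedException {
        try {
            return Boolean.TRUE.equals(conversion.get(timeout, unit));
        } catch (TimeoutException e) {
            // Give up on the job and ask the future to cancel it.
            conversion.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The job itself threw; report the underlying cause.
            System.err.println("Conversion failed: " + e.getCause());
            return false;
        }
    }
}
```

With the snippet above, this could be called as ConversionFutures.awaitConversion(conversion, 2, TimeUnit.MINUTES); the two-minute limit is an arbitrary choice.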
Hi, I followed the above code to convert from PDF to .docx on my local machine, but the formatting is not exactly the same. I can see some distortion in the converted file: the header alignment differs. Could you please help me get the exact format without any distortion? My code:
public class LocalConversion {
static String sourcsFile = "C:/PDF2DOC/FirmOrderLetter_MYR.pdf";
static String targetFile = "C:/PDF2DOC/FirmOrderLetter_MYR.docx";
public static void main (String args[]) {
File pdfFile = new File(sourcsFile), target = new File(targetFile);
IConverter converter = LocalConverter.make();
Future
Hei, what do you mean by distorted? The conversion is performed by the application running in the background. Don't you get the same results when importing the PDF directly in Word? Conversions from PDF are a bit tricky to begin with.
Thanks for your response. After conversion to Word, the header alignment is not the same as in the PDF document. I have attached the files for your reference: in the converted version (docx), the left header is not aligned with the right header. FYI, the right header is an image.
I shaded out the wording for compliance reasons.
Have you tried converting the document using MS Word manually? If the same behavior shows up there, there is not much that documents4j can do differently.
Yes, converting manually shows the same problem: the alignment is off. How can I achieve this some other way?
This is a question for the MS Word team or a related group that I cannot answer.
I mean, is there any other way of converting PDF to DOCX with exact alignment and formatting using documents4j?
documents4j is using MS Word underneath. It cannot do anything that Word is incapable of.
Thank you. That means documents4j works fine for DOCX to PDF conversion but not the other way around (PDF to DOCX). Thank you so much for your prompt responses, much appreciated.
Which jars from documents4j need to be added?
Maven can list them via dependency:list.
Hello, after converting the documents, I cannot delete the generated files because they are being used by a process. How can I close those processes so that I can delete the files? Thank you.
The locks should be released after conversion. Did you check what process is holding the locks?
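If the lock turns out to be released only shortly after the conversion returns, retrying the delete a few times can work around it. The following is a rough sketch under that assumption, using only the JDK; the class LockedFileCleanup and the method deleteWithRetry are hypothetical names, and the code does not find or stop the process that holds the lock.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of documents4j.
final class LockedFileCleanup {
    private LockedFileCleanup() {}

    /**
     * Tries to delete the file up to `attempts` times, pausing between tries.
     * Returns true once the file is gone (or never existed), false otherwise.
     */
    static boolean deleteWithRetry(Path file, int attempts, long pauseMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.deleteIfExists(file);
                return true;
            } catch (IOException e) {
                // On Windows a file that is still open elsewhere typically fails here.
                System.err.println("Delete attempt " + (i + 1) + " failed: " + e);
                if (i < attempts - 1) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        return false;
    }
}
```

If every attempt fails, the file is most likely still open in another process, which brings the question back to which process holds the lock.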
Hello @raphw. I'm trying to use the API to transform PDF to DOCX.
This is the implementation, pretty simple I think. Also, when I use Word directly, it transforms the file as expected. I will attach a picture with the output of the conversion:
PDF to DOCX is a strange process. Does the same happen if you open the PDF in Word directly? documents4j just delegates the job, so this might just be the outcome.
Yes, works perfectly. Do you know any other libraries that could get the job done if documents4j seems not to work on this one?
You can have a look at the conversion vbs file that you find in the word-bridge jar file. Maybe you need to adjust this file?
documents4j offers a system property to use your own VBS file instead of the one that ships with documents4j, so maybe this can solve your problem?
@rafaeljigau Hello, I have the same issue. Have you solved this problem or found another library?
Hi, I followed the example given and I'm getting the following error when converting PDF to DOCX. Is there any dependency missing?
com.documents4j.throwables.ConversionInputException: No converter for conversion of application/pdf to application/vnd.openxmlformats-officedocument.wordprocessingml.document available
    at com.documents4j.conversion.ConverterRegistry.lookup(ConverterRegistry.java:65)
    at com.documents4j.conversion.DefaultConversionManager.startConversion(DefaultConversionManager.java:30)
    at com.documents4j.job.LocalFutureWrappingPriorityFuture.startConversion(LocalFutureWrappingPriorityFuture.java:50)
    at com.documents4j.job.LocalFutureWrappingPriorityFuture.startConversion(LocalFutureWrappingPriorityFuture.java:11)
    at com.documents4j.job.AbstractFutureWrappingPriorityFuture.run(AbstractFutureWrappingPriorityFuture.java:70)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
No, I do not think Word is capable of transforming PDF to Word format, just the other way around.
pdf2document.com - No loss of pdf layout 30 free page conversions daily for all users.
Hi,
I cannot find any example that shows how to convert a PDF whose native format is DOCX back into the original DOCX. I guess I should use the MicrosoftWordBridge converter, but I can't figure out how...
Cheers,