danvergara / morphos

Self-hosted file converter server
MIT License
1.06k stars 44 forks

DOCX-PDF Conversion #40

Closed danvergara closed 6 months ago

danvergara commented 7 months ago

DOCX-PDF

Description

This PR introduces DOCX-PDF conversion, which means it is now possible to convert PDF files to DOCX and vice versa. This one was tough, because there's no easy way to do this using just Go or OSS libraries. The best solution out there was UniDoc, but it's proprietary software.

So, looking around, I found out how other projects work around this limitation: they basically call libreoffice. Go's standard library provides a package called os/exec, which allows us to run external commands, and that's what I needed to work with libreoffice.

DOCX -> PDF

libreoffice --headless --convert-to pdf:writer_pdf_Export --outdir <out_dir> <foo.docx>

PDF -> DOCX

libreoffice --headless --infilter='writer_pdf_import' --convert-to docx:"MS Word 2007 XML" --outdir <out_dir> <foo.pdf>
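As a rough illustration of how the two commands above could be driven from Go through os/exec, here is a minimal sketch. It is not the code from this PR: the helper names (docxToPDFArgs, pdfToDOCXArgs, runLibreOffice) are hypothetical, and it assumes a libreoffice binary is available on the PATH. Because os/exec does not go through a shell, the shell quotes around the filter names are dropped and each argument is passed as its own string.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// docxToPDFArgs builds the argument list for the DOCX -> PDF command.
// Hypothetical helper, named for illustration only.
func docxToPDFArgs(outDir, input string) []string {
	return []string{
		"--headless",
		"--convert-to", "pdf:writer_pdf_Export",
		"--outdir", outDir,
		input,
	}
}

// pdfToDOCXArgs builds the argument list for the PDF -> DOCX command.
// No shell is involved, so the filter names are passed without quotes.
func pdfToDOCXArgs(outDir, input string) []string {
	return []string{
		"--headless",
		"--infilter=writer_pdf_import",
		"--convert-to", "docx:MS Word 2007 XML",
		"--outdir", outDir,
		input,
	}
}

// runLibreOffice runs the libreoffice binary with the given arguments.
// It assumes "libreoffice" is on the PATH. The combined stdout/stderr
// output is included in the returned error to make failures easier to debug.
func runLibreOffice(ctx context.Context, args []string) error {
	cmd := exec.CommandContext(ctx, "libreoffice", args...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("libreoffice failed: %w: %s", err, out)
	}
	return nil
}
```

A caller would pass something like runLibreOffice(ctx, docxToPDFArgs(outDir, "foo.docx")), ideally with a context that carries a timeout so a hung conversion does not block the server indefinitely.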


How Has This Been Tested?

I QA'd this change locally with multiple files and added more tests.

danvergara commented 6 months ago

Did a visual check of the code, looks good. Not so good that you have to add libreoffice just to support this.

LibreOffice was the only open source option available. The other one was UniDoc, which is not open source.