JessicaTegner / pypandoc

Thin wrapper for "pandoc" (MIT)
http://pypi.python.org/pypi/pypandoc/

arabic encoding support and image in header #335

Closed i-salameh95 closed 3 months ago

i-salameh95 commented 1 year ago

Hello, I'm using this package to convert a .docx file to a .pdf file. My file contains variables filled in by a Django view, and all of this works correctly. However, the .docx file contains an Arabic sentence, and the program shows an error during conversion; when I remove that sentence, the code works like a charm. How can I convert a document that contains Arabic words?

Also, the .docx file has a header (with a logo), but the output PDF has no header at all!

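import pypandoc
from django.http import FileResponse
from django.shortcuts import get_object_or_404
from docxtpl import DocxTemplate

# Project (model) and author_required (decorator) are defined elsewhere in the poster's project.

@author_required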
def download_approval(request, project_id):
    project = get_object_or_404(Project, pk=project_id)
    doc = DocxTemplate('letter.docx')
    context = {
        'ref_num': project.ref_num,
        'author_name': project.author.get_full_name,
        'approval_date': project.approved_date.date(),
        'project_title': project.title_en
    }
    doc.render(context)
    doc.save('approval_letter.docx')
    pypandoc.convert_file('approval_letter.docx', 'latex', outputfile="research_permission_request.pdf",
                          extra_args=['--pdf-engine=C:\\Users\\HP\\Desktop\\mktex\\miktex\\bin\\x64\\pdflatex.exe'])
    pdf = open('research_permission_request.pdf', 'rb')
    response = FileResponse(pdf)
    return response

JessicaTegner commented 1 year ago

Hi @i-salameh95

Thanks for the issue, and sorry it's taken a while.

First of all, can you indicate which Python, pypandoc and pandoc versions you are using, as that can affect whether the image/logo shows up?

For Arabic support, you might want to look at jgm/pandoc#5643
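
As a rough illustration of one direction to try (not necessarily what the linked pandoc issue recommends): pandoc's manual notes that, for bidirectional text with LaTeX output, only the xelatex engine is fully supported, whereas the view above uses pdflatex. The sketch below builds the extra arguments for that setup. The function names and the "Amiri" font are assumptions for illustration; any installed font with Arabic glyphs should do.

```python
def arabic_pdf_args(font="Amiri"):
    """Build pandoc arguments for right-to-left Arabic PDF output.

    The font name is an assumption; it must be installed on the machine.
    """
    return [
        "--pdf-engine=xelatex",    # Unicode-aware engine instead of pdflatex
        "-V", f"mainfont={font}",  # a font that contains Arabic glyphs
        "-V", "lang=ar",           # document language
        "-V", "dir=rtl",           # base text direction: right to left
    ]


def convert_arabic_docx(src, dest, font="Amiri"):
    """Hypothetical helper: convert a .docx to PDF with the arguments above."""
    import pypandoc  # imported lazily so the argument builder works without it

    return pypandoc.convert_file(
        src, "pdf", outputfile=dest, extra_args=arabic_pdf_args(font)
    )
```

For example, `convert_arabic_docx('approval_letter.docx', 'research_permission_request.pdf')` would replace the `pypandoc.convert_file` call in the view. Whether this renders a given document correctly still depends on the LaTeX installation and the chosen font.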

i-salameh95 commented 1 year ago

Hi @JessicaTegner, this is from the requirements.txt file (Django packages):

pypandoc    v 1.11
pypandoc-binary        v 1.11
python v 3.9.2 
django v 4.2

(venv) > pandoc -v
pandoc 3.1.2
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: C:....\AppData\Roaming\pandoc
Copyright (C) 2006-2023 John MacFarlane. Web:  https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

I have also attached the docx that I want to convert to PDF: letter.docx. This is the PDF that resulted from the conversion process: research_permission_request.pdf

Note 1: I don't need Arabic support anymore; I have translated the text to English. The only thing I want to solve is the image.
Note 2: I'm running this code on a Windows machine.

Also, does the previous code work on an Nginx, Linux (Ubuntu) server?
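
On the portability question, one thing that stands out is that the view hard-codes a Windows path to pdflatex.exe, which will not exist on a Linux server. A minimal sketch of a more portable approach, assuming a LaTeX engine is installed and on the PATH, is to look the engine up at runtime. The function name and the candidate order are illustrative choices, not part of pypandoc.

```python
import shutil


def pdf_engine_args(candidates=("xelatex", "lualatex", "pdflatex")):
    """Return a --pdf-engine argument for the first engine found on PATH.

    Returns an empty list when none is found, which leaves the choice
    to pandoc's own default.
    """
    for name in candidates:
        path = shutil.which(name)  # full path to the executable, or None
        if path:
            return [f"--pdf-engine={path}"]
    return []
```

The result can be passed as `extra_args=pdf_engine_args()` to `pypandoc.convert_file`, so the same view runs on both Windows and Linux as long as one of the engines is installed.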

JessicaTegner commented 1 year ago

So correct me if I'm wrong, but it seems like the images in the Word document are in the page header, which, from my understanding of pandoc, does not get converted. I did some testing with pure pandoc and the files you provided, and the header images indeed do not get converted, which is really strange.
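
To confirm that a logo really lives in the page header rather than the document body, one can inspect the .docx directly: it is a ZIP archive in which each header part (for example word/header1.xml) has a relationships file listing the images it references. The following is a diagnostic sketch only, using the standard library; the function name is hypothetical and it does not make pandoc convert the header.

```python
import re
import xml.etree.ElementTree as ET
import zipfile

# Namespace and relationship type defined by the Open Packaging Conventions.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
IMAGE_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
)
HEADER_RELS = re.compile(r"word/_rels/header\d*\.xml\.rels$")


def header_images(docx):
    """Map each header relationships part to the image targets it references.

    `docx` may be a file path or a binary file-like object.
    """
    found = {}
    with zipfile.ZipFile(docx) as zf:
        for name in zf.namelist():
            if not HEADER_RELS.match(name):
                continue
            root = ET.fromstring(zf.read(name))
            targets = [
                rel.get("Target")
                for rel in root.iter(REL_NS + "Relationship")
                if rel.get("Type") == IMAGE_TYPE
            ]
            if targets:
                found[name] = targets
    return found
```

A non-empty result for `header_images('letter.docx')` would indicate that the images are attached to the header, which matches the behaviour described above.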

i-salameh95 commented 1 year ago

So, how can I convert the docx with the images? Is there any workaround?

JessicaTegner commented 3 months ago

f