algoo / preview-generator

generates previews of files with cache management
https://pypi.org/project/preview-generator/
MIT License

Unsupported mimetype: application/vnd.openxmlformats-officedocument.wordprocessingml.document #156

Closed AtulCIS closed 4 years ago

AtulCIS commented 4 years ago

I am using preview-generator to make a PDF from a docx file in Django. The code works properly on my local machine, but the same deployment does not work on the server, which runs Ubuntu 18.04. I get the error "Unsupported mimetype: application/vnd.openxmlformats-officedocument.wordprocessingml.document". I have researched it and found that LibreOffice needs to be installed on the server, and it is already present there. Can you please suggest what I need to do, or how I can define the path to LibreOffice?

I am using Django. When I use the shell the file is generated properly, but through the web it does not work. This is the error on the terminal:

gunicorn[29453]: Builder <class 'preview_generator.preview.builder.image__inkscape.ImagePreviewBuilderInkscape'> is missing a dependency: this builder requires inkscape to be available
gunicorn[29453]: Builder <class 'preview_generator.preview.builder.image__imconvert.ImagePreviewBuilderIMConvert'> is missing a dependency: this builder requires convert to be available
Builder <class 'preview_generator.preview.builder.document_generic.DocumentPreviewBuilder'> is missing a dependency: this builder requires qpdf to be available
Builder <class 'preview_generator.preview.builder.document__scribus.DocumentPreviewBuilderScribus'> is missing a dependency: this builder
Builder <class 'preview_generator.preview.builder.office__libreoffice.OfficePreviewBuilderLibreoffice'> is missing a dependency: this builder requires libreoffice to be available

inkhey commented 4 years ago

It looks like the "libreoffice" executable isn't available from the Python code. Can you use the same Python environment as your code and test this in a Python console:

from preview_generator.preview.builder.office__libreoffice import OfficePreviewBuilderLibreoffice
builder = OfficePreviewBuilderLibreoffice()
builder.check_dependencies()

You can also try in bash:

which libreoffice

On my own system, I get this result:

/usr/bin/libreoffice

If these commands fail, a simple hack to fix this is to add a symlink from /usr/bin/libreoffice to the real libreoffice binary.
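The `which` check can also be done from Python, inside the same process environment that runs the web code. This is a minimal sketch using only the standard library; `find_tools` is a hypothetical helper name, and the tool list is an assumption based on the executables named in this thread.

```python
import shutil

# Executable names mentioned in this thread (assumed list, adjust as needed).
REQUIRED_TOOLS = ["libreoffice", "inkscape", "convert", "qpdf", "scribus"]


def find_tools(names, path=None):
    """Map each executable name to its resolved location, or None if not found.

    With path=None, shutil.which searches the PATH of the current process,
    which is the PATH the web server actually sees.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in find_tools(REQUIRED_TOOLS).items():
        print("{}: {}".format(name, location or "NOT FOUND on PATH"))
```

Running this once from the shell and once from inside the served application shows whether the two environments resolve the same executables.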

inkhey commented 4 years ago

Note: having an executable or a valid symlink at /usr/bin/libreoffice should work in the simple case. Things can be handled differently:

AtulCIS commented 4 years ago

@inkhey I have the "libreoffice" executable. When I run the code on the server with python manage.py runserver, it generates the PDF file properly, but if I use gunicorn to serve the site, I get the issue mentioned above. My LibreOffice path is "/usr/bin/libreoffice". Can I set a path variable in the settings file? If yes, what should the variable name be?

inkhey commented 4 years ago

Did you run python manage.py as the same user that gunicorn uses to run the server code? It looks like the gunicorn context doesn't allow access to /usr/bin/libreoffice.

AtulCIS commented 4 years ago

[Unit]
Description=Gunicorn instance to serve myproject
After=network.target

[Service]
User=root
Group=root
WorkingDirectory=/root/form_builder
Environment="PATH=/root/form_builder/.county_form/bin"
ExecStart=/root/form_builder/.county_form/bin/gunicorn --workers 10 --bind unix:county_form.sock --access-logfile /root/form_builder/logs/gunicorn-access.log --error-logfile /root/form_builder/logs/gunicorn-error.log --log-level warn --timeout 3600 -b 0.0.0.0:8001 --limit-request-line 0 --limit-request-field_size 0 county_form.wsgi:application

[Install]
WantedBy=multi-user.target

Yes, Root user.

inkhey commented 4 years ago

It looks like PATH is overridden by /root/form_builder/.county_form/bin. Try adding a "libreoffice" symlink in this folder. If this is the issue, you will probably need to add other symlinks depending on which builders you need. You may need convert (most image types), inkscape (SVG), qpdf (PDF) and potentially scribus (Scribus files).
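The symlink suggestion can be scripted. The following is only a sketch under stated assumptions: `link_tools` is a hypothetical helper, and the search path and target directory in the usage lines are examples based on the paths quoted in this thread.

```python
import os
import shutil


def link_tools(names, target_dir, search_path):
    """Create symlinks in target_dir for each tool found on search_path.

    Tools that cannot be found, or whose link name already exists in
    target_dir, are skipped. Returns a dict of name -> resolved source.
    """
    linked = {}
    for name in names:
        source = shutil.which(name, path=search_path)
        link = os.path.join(target_dir, name)
        if source is None or os.path.lexists(link):
            continue
        os.symlink(source, link)
        linked[name] = source
    return linked


if __name__ == "__main__":
    # Example values only: the directory is the PATH value from the unit
    # file, and the search path is an assumed set of system directories.
    tools = ["libreoffice", "convert", "inkscape", "qpdf", "scribus"]
    created = link_tools(tools, "/root/form_builder/.county_form/bin", "/usr/bin:/bin")
    print(created)
```

Symlinks only make the named executables visible. Any helper programs those executables call internally still have to be resolvable through the same PATH.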

AtulCIS commented 4 years ago

@inkhey Thanks. I have added symlinks for all the dependencies. The error log no longer shows the missing-dependency errors, but the file is still not being generated and I get a 503 error. If I generate a JPG file from the docx file, I get this error:

Command '['libreoffice', '--headless', '--convert-to', 'pdf:writer_pdf_Export', '/root/form_builder/static/temp_folder/d5b56a7cbe93b79effdbb8a13e978f9f.pdf.docx', '--outdir', '/root/form_builder/static/temp_folder/', '-env:UserInstallation=file:///tmp/LibreOfficeConversion${USER}']' returned non-zero exit status 127

AtulCIS commented 4 years ago

This is the error log now:

Command '['libreoffice', '--headless', '--convert-to', 'pdf:writer_pdf_Export', '/root/form_builder/static/temp_folder/ae930030d03d31ca938f971a839238b5.pdf.docx', '--outdir', '/root/form_builder/static/temp_folder/', '-env:UserInstallation=file:///tmp/LibreOfficeConversion${USER}']' returned non-zero exit status 127

inkhey commented 4 years ago

@AtulCIS can you try running this command as the same user as the script (root in your case):

libreoffice --headless --convert-to pdf:writer_pdf_Export /root/form_builder/static/temp_folder/ae930030d03d31ca938f971a839238b5.pdf.docx /root/form_builder/static/temp_folder/ -env:UserInstallation=file:///tmp/LibreOfficeConversion${USER}

and then :

echo $?

You will probably get an error after the first command, which can help us find out where the issue comes from. The second command is just to verify that we get the same exit status as the script (we should get 127).
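For context, POSIX shells use exit status 127 to signal "command not found". The sketch below, which assumes a POSIX /bin/sh is present, shows how a restricted PATH produces that status. `run_with_path` is a hypothetical helper, and the command name and directory are placeholders. One possible reading, not confirmed in this thread, is that the libreoffice launcher is itself a script whose internal commands cannot be resolved under the restricted PATH.

```python
import subprocess


def run_with_path(shell_command, path_value):
    """Run shell_command via /bin/sh with PATH set to path_value.

    Returns the exit status of the shell. The environment passed to the
    child contains only PATH, so nothing is inherited from the caller.
    """
    completed = subprocess.run(
        ["/bin/sh", "-c", shell_command],
        env={"PATH": path_value},
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


if __name__ == "__main__":
    # Placeholders: a command that does not exist and a directory that
    # contains no executables.
    status = run_with_path("hypothetical-missing-tool", "/nonexistent-dir")
    print("exit status:", status)
```

Calling `run_with_path` with the service's PATH value and the failing command is one way to reproduce the gunicorn environment from an interactive shell.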

AtulCIS commented 4 years ago

This is the error:

convert /root/form_builder/static/temp_folder/ae930030d03d31ca938f971a839238b5.docx -> /root/ae930030d03d31ca938f971a839238b5.pdf using filter : writer_pdf_Export
Error: source file could not be loaded

And on echo $? I get 0.

inkhey commented 4 years ago

Can you check whether the /root/form_builder/static/temp_folder directory exists? If not, try creating it and retest your code.

AtulCIS commented 4 years ago

Yes, it exists. The command you gave converts the file properly, but with the error I posted above.

I have checked the dependencies also.

✓ ImagePreviewBuilderWand wand 0.5.7 from /root/form_builder/.county_form/lib/python3.5/site-packages/wand
✓ ImagePreviewBuilderPillow PIL 6.2.1 from /root/form_builder/.county_form/lib/python3.5/site-packages/PIL
✓ ImagePreviewBuilderIMConvert Version: ImageMagick 6.8.9-9 Q16 x86_64 2019-11-12 http://www.imagemagick.org from /root/form_builder/.county_form/bin/convert
✓ ImagePreviewBuilderInkscape Inkscape 0.91 r13725 from /root/form_builder/.county_form/bin/inkscape
✓ DocumentPreviewBuilderScribus Scribus Version 1.4.6 from /root/form_builder/.county_form/bin/scribus
✓ OfficePreviewBuilderLibreoffice LibreOffice 6.3.3.2 a64200df03143b798afd1ec74a12ab50359878ed from /root/form_builder/.county_form/bin/libreoffice
✓ PlainTextPreviewBuilder LibreOffice 6.3.3.2 a64200df03143b798afd1ec74a12ab50359878ed from /root/form_builder/.county_form/bin/libreoffice
✓ ImagePreviewBuilderVtk VTK version :8.1.2

When I try to create a PDF, two files are created in the folder: one with the .pdf.docx extension and one with the .pdf_flag extension. The .pdf.docx file is a proper Word file, but the .pdf_flag file is 0 KB.

I have checked again with a .docx file. This is my command:

libreoffice --headless --convert-to pdf:writer_pdf_Export /root/form_builder/static/temp_folder/2/157466631384162700f14f41df1e.docx --outdir /root/form_builder/static/temp_folder/2/ -env:UserInstallation=file:///tmp/LibreOfficeConversion${USER}

Output:

convert /root/form_builder/static/temp_folder/2/157466631384162700f14f41df1e.docx -> /root/form_builder/static/temp_folder/2/157466631384162700f14f41df1e.pdf using filter : writer_pdf_Export

This is my Django error page: https://prnt.sc/q2x3vz

AtulCIS commented 4 years ago

FINALLY! I fixed the issue by removing Environment="PATH=/root/form_builder/.county_form/bin" from my gunicorn service configuration.
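An alternative that was not tried in this thread is to extend PATH instead of replacing it, so the virtualenv directory comes first while the system directories stay reachable. This is a rough sketch in Python; `prepend_to_path` is a hypothetical helper, and where to call it (for example early in a WSGI entry point) is an assumption, not something preview-generator requires.

```python
import os


def prepend_to_path(directory, current_path):
    """Return a PATH value with directory first and existing entries kept.

    Empty entries and earlier occurrences of directory are dropped so the
    result contains the directory exactly once.
    """
    entries = [e for e in current_path.split(os.pathsep) if e and e != directory]
    return os.pathsep.join([directory] + entries)


if __name__ == "__main__":
    # Path taken from the unit file quoted in this thread.
    venv_bin = "/root/form_builder/.county_form/bin"
    os.environ["PATH"] = prepend_to_path(venv_bin, os.environ.get("PATH", ""))
    print(os.environ["PATH"])
```

This keeps executables such as /usr/bin/libreoffice, and anything they call, resolvable without per-tool symlinks.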