openlabs / docker-wkhtmltopdf-aas

wkhtmltopdf in a docker container as a web service.
BSD 3-Clause "New" or "Revised" License

JSON example does not work in Python3 due to encoding differences #9

Open FraserThompson opened 9 years ago

FraserThompson commented 9 years ago

In Python 3, base64 encoding is done with base64.b64encode, which returns a bytes object. json.dumps cannot serialize bytes (it expects str values), so the README example produces an error.
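
The failure described above can be reproduced in a few lines. This is a minimal sketch rather than the README's code: the HTML bytes are a made-up placeholder, and decoding the base64 output to ASCII text is just one common way to make the value serializable.

```python
import base64
import json

# Hypothetical HTML payload standing in for the contents of a real file.
html_bytes = b"<html><body>Hello</body></html>"

# In Python 3, b64encode takes bytes and returns bytes.
encoded = base64.b64encode(html_bytes)

try:
    json.dumps({"contents": encoded})
    dumps_error = None
except TypeError as exc:
    # json.dumps has no default serialization for bytes objects.
    dumps_error = exc

# Base64 output is pure ASCII, so decoding it to str is lossless.
payload = json.dumps({"contents": encoded.decode("ascii")})
```

Here dumps_error holds the TypeError raised by the first call, while payload is an ordinary JSON string that can be sent as the request body.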

Attempting to read the file as utf-8 doesn't produce an error, but the resulting PDF is garbled because I assume wkhtmltopdf is expecting a base64 encoded HTML string? Also giving it the --encoding utf-8 option still produces a garbled PDF.

Basically I can't figure out how to get the JSON API working in Python3.

dtoso-skymesh commented 9 years ago

"the resulting PDF is garbled because I assume wkhtmltopdf is expecting a base64 encoded HTML string"

On line 34 of app.py, the JSON mode pulls contents out of the POSTed JSON and does a Base64 decode before writing that out to a temporary file. Character sets don't appear to come into it unless you aren't sending valid JSON.
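
To illustrate why character sets should not come into it, the sketch below round-trips non-ASCII HTML through base64 and JSON. The decode step is an assumption about what the service does, based only on the description above; it is not a copy of app.py, and both function names are hypothetical.

```python
import base64
import json


def build_request_body(html_bytes):
    """Client side: wrap raw file bytes as base64 text inside JSON."""
    contents = base64.b64encode(html_bytes).decode("ascii")
    return json.dumps({"contents": contents})


def extract_html(request_body):
    """Assumed server side: parse the JSON and base64-decode 'contents'."""
    payload = json.loads(request_body)
    return base64.b64decode(payload["contents"])


# Non-ASCII text encoded as UTF-8 survives the round trip byte for byte.
original = "<p>Grüße, 世界</p>".encode("utf-8")
recovered = extract_html(build_request_body(original))
```

Because the JSON body only ever carries ASCII base64 text, the bytes written on the receiving end are the same bytes that were read from the file, whatever encoding the HTML itself uses.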

On the service side, the recipe is built from openlabs/docker-wkhtmltopdf, which in turn is built from ubuntu:14.04, and the referenced python executable is not Python 3. At the time I'm writing this, the image I get via docker run has:

docker exec /usr/bin/env python -V

Python 2.7.6

Are you attempting to modify the image? Perhaps you're rebuilding the recipe from something other than ubuntu:14.04? If so, you'll likely need to modify a fair bit to get it working.

On the client side, as long as you (1) send valid JSON and (2) Base64-encode contents, it'll work just fine.

FraserThompson commented 9 years ago

I was referring to what happens on the client side, specifically this example code in the README:

import requests

url = 'http://<docker_host>:<port>/'
data = {
    'contents': open('/file/to/convert.html').read().encode('base64'),
}
headers = {
    'Content-Type': 'application/json',    # This is important
}
response = requests.post(url, data=json.dumps(data), headers=headers)

# Save the response contents to a file
with open('/path/to/local/file.pdf', 'wb') as f:
    f.write(response.content)

This code doesn't work in Python 3 because json.dumps won't serialize the base64-encoded data, which is a bytes object there. I couldn't figure out how to read the contents of the file and get it into a JSON object in a way that the docker container accepted in Python 3, so I just ended up doing it in Python 2.7 instead.

sharoonthomas commented 9 years ago

Could you try this, please?

data = {
    'contents': open('/file/to/convert.html').read().encode('base64').decode('utf-8'),
}
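
One caveat: on current Python 3 versions a str has no 'base64' text codec, so str.encode('base64') raises a LookupError before .decode('utf-8') is ever reached. A minimal sketch that sidesteps the codec by using the base64 module is shown here; the helper name and the throwaway file are hypothetical, introduced only for illustration.

```python
import base64
import os
import tempfile


def encode_file_for_json(path):
    """Return the file's bytes as a base64 str that json.dumps accepts."""
    with open(path, "rb") as handle:  # binary mode: no text decoding involved
        raw = handle.read()
    return base64.b64encode(raw).decode("ascii")


# Demonstration with a throwaway file; the content is a placeholder.
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as tmp:
    tmp.write(b"<html><body>test</body></html>")
    tmp_path = tmp.name

contents = encode_file_for_json(tmp_path)
os.remove(tmp_path)
```

The resulting contents value can be used directly as the 'contents' entry of the data dictionary.
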

richardstrnad commented 8 years ago

I had the same issue, and this worked for me with Python 3:

import json
import requests
from base64 import b64encode

url = 'http://0.0.0.0:4133'
input_html = 'input.html'
encoding = 'utf-8'

with open(input_html, 'rb') as open_file:
    byte_content = open_file.read()

base64_bytes = b64encode(byte_content)
base64_string = base64_bytes.decode(encoding)

data = {
    'contents': base64_string,
}
headers = {
    'Content-Type': 'application/json',    # This is important
}
response = requests.post(url, data=json.dumps(data), headers=headers)

# Save the response contents to a file
with open('out.pdf', 'wb') as f:
    f.write(response.content)

This link helped: Stack Overflow

RESHNAM commented 5 years ago

Thank you, @sharoonthomas. That helped me a lot.