FraserThompson opened this issue 9 years ago
On line 34 of app.py, the JSON mode pulls contents out of the POSTed JSON and does a Base64 decode before writing that out to a temporary file. Character sets don't appear to come into it unless you aren't sending valid JSON.
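For context, here is a rough sketch of what that JSON mode amounts to. The function and variable names below are illustrative only, not the actual app.py code:

```python
import base64
import json
import tempfile

def write_contents_to_tempfile(request_body: bytes) -> str:
    """Hypothetical sketch: decode the POSTed JSON and stage the HTML."""
    payload = json.loads(request_body)                # invalid JSON fails here
    html_bytes = base64.b64decode(payload['contents'])
    with tempfile.NamedTemporaryFile(suffix='.html', delete=False) as tmp:
        tmp.write(html_bytes)                         # raw bytes; no charset handling
        return tmp.name
```

The point is that base64.b64decode operates on the raw bytes, so the character set of the original HTML never enters into it; all that matters is that the JSON itself parses and that contents is valid Base64.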
On the service side, the recipe is built from openlabs/docker-wkhtmltopdf, which in turn is built from ubuntu:14.04, and the referenced python executable is not Python 3. At the time I'm writing this, the image I get via docker run reports:

docker exec ... /usr/bin/env python -V
Python 2.7.6
Are you attempting to modify the image? Perhaps you're rebuilding the recipe from something other than ubuntu:14.04? If so, you'll likely need to modify a fair bit to get it working.
On the client side, as long as you (1) send valid JSON and (2) contents is Base64 encoded, it'll work just fine.
I was referring to what happens on the client side, specifically this example code in the README:
import json
import requests

url = 'http://<docker_host>:<port>/'
data = {
    'contents': open('/file/to/convert.html').read().encode('base64'),
}
headers = {
    'Content-Type': 'application/json',  # This is important
}
response = requests.post(url, data=json.dumps(data), headers=headers)

# Save the response contents to a file
with open('/path/to/local/file.pdf', 'wb') as f:
    f.write(response.content)
This code doesn't work in Python 3 because json.dumps won't serialize the bytes object that Base64 encoding produces. I couldn't figure out how to read the contents of the file and get it into a JSON object in a way the Docker container would accept under Python 3, so I just ended up doing it in Python 2.7 instead.
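To make the Python 3 failure concrete (a byte-string literal stands in here for the file read), json.dumps rejects the bytes that base64.b64encode returns:

```python
import json
from base64 import b64encode

raw = b64encode(b'<h1>Hello</h1>')           # bytes, not str

try:
    json.dumps({'contents': raw})
except TypeError as exc:
    print(exc)                               # bytes are not JSON serializable

# Decoding the bytes to str first is what json.dumps needs:
ok = json.dumps({'contents': raw.decode('ascii')})
print(ok)
```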
Could you try this, please?
data = {
    'contents': open('/file/to/convert.html').read().encode('base64').decode('utf-8'),
}
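For what it's worth, that exact chain won't run on Python 3 either: the 'base64' text codec was removed, so str.encode('base64') raises LookupError. A sketch of the failure and the module-level replacement (the HTML literal stands in for the file contents):

```python
import base64

html = '<h1>Hello</h1>'

try:
    html.encode('base64')                    # Python 2 codec, gone in Python 3
except LookupError as exc:
    print(exc)                               # 'base64' is not a text encoding

# Python 3 equivalent: encode to bytes, Base64 them, decode back to str
b64_string = base64.b64encode(html.encode('utf-8')).decode('utf-8')
print(b64_string)
```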
As I had the same issue, this worked for me with Python 3:
import json
import requests
from base64 import b64encode

url = 'http://0.0.0.0:4133'
input_html = 'input.html'
encoding = 'utf-8'

with open(input_html, 'rb') as open_file:
    byte_content = open_file.read()

base64_bytes = b64encode(byte_content)
base64_string = base64_bytes.decode(encoding)

data = {
    'contents': base64_string,
}
headers = {
    'Content-Type': 'application/json',  # This is important
}
response = requests.post(url, data=json.dumps(data), headers=headers)

# Save the response contents to a file
with open('out.pdf', 'wb') as f:
    f.write(response.content)
This Stack Overflow link helped:
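A small aside on the client code: reasonably recent versions of requests can do the json.dumps call and the Content-Type header for you via the json= keyword. A quick sketch using a prepared request, so nothing is actually sent over the network:

```python
import json
import requests
from base64 import b64encode

contents = b64encode(b'<h1>Hello</h1>').decode('utf-8')

# requests serializes the dict and sets the header itself when you pass json=...
req = requests.Request('POST', 'http://0.0.0.0:4133',
                       json={'contents': contents}).prepare()

print(req.headers['Content-Type'])            # application/json
print(json.loads(req.body)['contents'] == contents)
```

On a real call, response = requests.post(url, json=data) should be equivalent to the data=json.dumps(data) plus explicit header shown earlier in the thread.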
Thank you @sharoonthomas. That helped me a lot.
In Python 3, encoding to Base64 is done with base64.b64encode, which returns a bytes object. json.dumps only takes a string, so the example produces an error.
Attempting to read the file as utf-8 doesn't produce an error, but the resulting PDF is garbled; I assume that's because wkhtmltopdf is expecting a Base64-encoded HTML string? Also, giving it the --encoding utf-8 option still produces a garbled PDF.
Basically, I can't figure out how to get the JSON API working in Python 3.