pallets / flask

The Python micro framework for building web applications.
https://flask.palletsprojects.com
BSD 3-Clause "New" or "Revised" License

Binary file corrupted between the server and the client #2048

Closed: Gallaecio closed this issue 8 years ago

Gallaecio commented 8 years ago

I am using Python 3 and the following code:

@app.route('/')
def pdf():
    # [Generate PDF as a byte string]
    with open('file.pdf', 'wb') as f:
        f.write(pdf)
    return send_file(BytesIO(pdf))

On the server, a valid PDF file is written. On the client, I get an invalid PDF about twice the expected size.

It looks like an encoding issue, so I am guessing that my use of Python 3 may be relevant. The differences between the file saved on the server and the one I get on the client are similar to those in this StackOverflow question, but the poster there was passing the PDF data as a string, whereas in my code pdf is made of bytes (Python 3 would not let me write the file otherwise).

RonnyPfannschmidt commented 8 years ago

If you do have a bytestring, why not just return it?

untitaker commented 8 years ago

You're possibly having multiple requests writing to the same filepath...

Gallaecio commented 8 years ago

If I just return the bytestring, it works for a “wget URL -O file.pdf” call, but since I am not setting the Content-Type header, it does not display correctly in a web browser.

And if I use make_response (as follows), I get the same behavior as with send_file.

response = make_response(pdf)
response.headers['Content-Type'] = 'application/pdf'
return response

Since returning the bytestring “works” and I am using a REST client, I don’t think I am writing to the same file twice. The contents of the PDF are not just twice the size, they are corrupted in a similar fashion to that described in the linked StackOverflow question.

RonnyPfannschmidt commented 8 years ago

That sounds like a potential decode/re-encode issue. More context is needed to determine the location of the cause.
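To illustrate the kind of decode/re-encode problem meant here, the following sketch shows how binary data roughly doubles in size when it is mistakenly decoded as Latin-1 text and re-encoded as UTF-8. This is only one possible mechanism, not a confirmed cause in this thread, and the function name `reencode` is made up for the example.

```python
def reencode(data: bytes) -> bytes:
    """Simulate the mistake: treat bytes as Latin-1 text, then encode as UTF-8."""
    as_text = data.decode("latin-1")   # every byte maps to one code point
    return as_text.encode("utf-8")     # code points >= 0x80 become two bytes


# Bytes in the range 0x80-0xFF are common in compressed PDF streams.
original = bytes(range(128, 256))
corrupted = reencode(original)

print(len(original), len(corrupted))   # 128 256
print(corrupted == original)           # False
```

Pure ASCII bytes survive this round trip unchanged, which is why only the binary parts of a file get damaged.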

Gallaecio commented 8 years ago

I am debugging the issue, and so far data remains intact (as b'…', and content verified by writing it to a file with the debugger) at least until self.wfile.write(data) in werkzeug/serving.py:165.
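One way to make this kind of check repeatable is to compare the size and a digest of the payload at each stage, rather than inspecting the bytes by eye. The sketch below assumes both copies can be loaded into memory; the helper names `describe` and `same_payload` are hypothetical.

```python
import hashlib
from typing import Tuple


def describe(data: bytes) -> Tuple[int, str]:
    """Return the length and SHA-256 hex digest of a byte string."""
    return len(data), hashlib.sha256(data).hexdigest()


def same_payload(server_copy: bytes, client_copy: bytes) -> bool:
    """True when both copies have identical length and digest."""
    return describe(server_copy) == describe(client_copy)
```

For example, reading the file written on the server and the file downloaded by the client in 'rb' mode and passing both to `same_payload` would show whether the two actually differ.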

Gallaecio commented 8 years ago

> more context is needed to determine the location of the cause

Any hints on what additional data may prove helpful?

untitaker commented 8 years ago

On Tue, Oct 04, 2016 at 03:38:30AM -0700, Gallaecio wrote:

> If I just return the bytestring, it works for a “wget URL -O file.pdf” call, but since I am not setting the Content-Type header, it does not display correctly in a web browser.

Then set the content-type header. If there is documentation that is omitting that, it is a problem with the docs.
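A minimal sketch of returning the bytestring with an explicit content type, assuming Flask is installed. The payload `PDF_BYTES` and the view name `serve_pdf` are placeholders for illustration, not the original poster's code.

```python
from flask import Flask, Response

app = Flask(__name__)

# Placeholder payload; a real view would generate the PDF bytes here.
PDF_BYTES = b"%PDF-1.4\n% placeholder content, not a valid PDF\n"


@app.route("/")
def serve_pdf():
    # mimetype sets the Content-Type header, so a browser knows
    # to treat the body as a PDF instead of guessing.
    return Response(PDF_BYTES, mimetype="application/pdf")
```

Passing `mimetype="application/pdf"` to `send_file` should have the same effect on the header when a file-like object is preferred.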

> I don’t think I am writing to the same file twice.

You have a hardcoded temporary path in your example. You are writing to the same file.

> Since returning the bytestring “works” and I am using a REST client

REST-clients are irrelevant to this issue. The fact that using the bytestring works is an indication that it has something to do with the file.

> The contents of the PDF are not just twice the size, they are corrupted in a similar fashion to that described in the linked StackOverflow question.

There is also an answer to the question, which indicates that this is not an issue with Flask at all.


Gallaecio commented 8 years ago

While trying to write a self-contained example, I noticed that I could not start a server on port 5000 even after killing the existing one. After I killed the Python process itself and restarted the server, the make_response approach started working.
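A port that stays busy after the server was supposedly stopped suggests that some process is still listening on it, possibly one running older code. The thread does not confirm that this was the cause. A quick check, sketched with the standard library (the helper name `port_in_use` is hypothetical):

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0
```

Calling `port_in_use(5000)` before starting the development server would reveal a leftover listener on Flask's default port.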

I have no idea what the issue was. The write call you see in the original post was not the problem: it only writes the contents of pdf (correctly) on the server side. I added it just to check that the contents of pdf were not corrupt at that point. The results were the same without it (there was no write call when the problem first appeared).

Sorry for the trouble, and thanks for the help.