psf / requests

A simple, yet elegant, HTTP library.
https://requests.readthedocs.io/en/latest/
Apache License 2.0

PDF download is distorted using requests #6754

Closed. Puris2 closed this issue 4 months ago.

Puris2 commented 4 months ago

Expected Result

The PDF should download without any distortions, similar to how it downloads when using Chrome directly.

Actual Result

When downloading the PDF from the specified URL using the requests library in Python, the resulting file is distorted. There are lines across the pages and some blank pages. However, downloading the same PDF directly from Chrome results in a perfect file.

Reproduction Steps

import requests

headers = {
    'Accept': 'application/pdf',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36',
}

params = {
    'doc': '3150239',
}

response = requests.get('https://adxservices.adx.ae/cdn/contentdownload.aspx', params=params, headers=headers)

# Check if the response is a PDF
if response.headers.get('Content-Type') == 'application/pdf':
    # Save the content as a PDF file
    with open('document.pdf', 'wb') as file:
        file.write(response.content)
    print("PDF downloaded successfully.")
else:
    print(f"Unexpected content type: {response.headers.get('Content-Type')}")
    with open('response.html', 'wb') as file:
        file.write(response.content)
    print("Response saved as HTML for inspection.")
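
One way to narrow down whether the bytes were damaged in transit or the file simply differs from what Chrome receives is to run a few basic sanity checks on the downloaded data. The sketch below is not part of the original report: `check_pdf_bytes` is a hypothetical helper, and the 1024-byte windows for the `%PDF-` header and `%%EOF` marker are an assumption based on common PDF reader behaviour. Passing these checks does not prove the file renders correctly; failing them suggests a truncated or non-PDF response.

```python
from typing import List, Optional


def check_pdf_bytes(data: bytes, expected_length: Optional[int] = None) -> List[str]:
    """Return a list of problems found; an empty list means the basic checks passed.

    Hypothetical helper for illustration, not part of requests.
    """
    problems = []

    # A PDF normally starts with a "%PDF-" header near the top of the file.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header")

    # A complete PDF normally ends with an "%%EOF" marker near the end.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF marker")

    # Optionally compare against a length the caller expects (for example a
    # Content-Length header, assuming the body was not content-encoded).
    if expected_length is not None and len(data) != expected_length:
        problems.append(
            f"length mismatch: got {len(data)}, expected {expected_length}"
        )

    return problems
```

With the reproduction script above, this could be called as `check_pdf_bytes(response.content)`. If the server sends a `Content-Length` header and does not compress the body, passing `int(response.headers['Content-Length'])` as `expected_length` adds a truncation check. For compressed responses the decoded length can legitimately differ from the header value.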

System Information

$ python -m requests.help
{
  "paste": "here"
}

sigmavirus24 commented 4 months ago

Hi there! Thanks for opening this issue. Unfortunately, it seems this is a request for help instead of a report of a defect in the project. Please use StackOverflow for general usage questions instead and only report defects here.