adieuadieu / serverless-chrome

🌐 Run headless Chrome/Chromium on AWS Lambda
MIT License
2.85k stars 281 forks

File doesn't download in headless chrome #352

Open rishabhsiitk opened 1 month ago

rishabhsiitk commented 1 month ago

I am using the Docker image public.ecr.aws/sam/build-python3.7 with the following configuration:

selenium==4.0.0a5
urllib3==1.26.6
https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip
https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip

Below is the code:

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

opts = Options()
opts.binary_location = '/opt/headless-chromium'
opts.add_argument("--headless")
opts.add_argument('--no-sandbox')
opts.add_argument("--incognito")
opts.add_argument("--disable-dev-shm-usage")
opts.add_argument("--single-process")
opts.add_argument('--start-maximized')
opts.add_argument('--start-fullscreen')
opts.add_argument("--window-size=1499x2200")
opts.add_experimental_option('prefs', {
    "download.default_directory": '/tmp',
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True
})

svc = Service(executable_path='/opt/chromedriver')
browser = Chrome(service=svc, options=opts)

# any file/pdf download link, giving a sample
browser.get("http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679")

The file is not being downloaded, neither to /tmp (passed as the default download directory) nor anywhere else. Can someone help with this issue?
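One way to confirm whether anything reaches the download directory is to snapshot it before navigation and poll it afterwards. The following is a minimal sketch, not part of the original report: the helper name `wait_for_new_files` and its parameters are hypothetical, and it assumes Chrome marks in-progress downloads with a `.crdownload` suffix.

```python
import os
import time


def wait_for_new_files(directory, before, timeout=30.0, poll_interval=0.5):
    """Return sorted names of finished files that appeared in `directory`.

    `before` is the set of names present before the download was triggered.
    Returns an empty list if nothing new has finished within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_names = set(os.listdir(directory)) - set(before)
        # Assumption: Chrome keeps partial downloads as *.crdownload files.
        in_progress = [n for n in new_names if n.endswith(".crdownload")]
        finished = sorted(n for n in new_names if not n.endswith(".crdownload"))
        if finished and not in_progress:
            return finished
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll_interval)
```

With the code above, this could be used by taking `before = set(os.listdir('/tmp'))` ahead of `browser.get(...)` and calling `wait_for_new_files('/tmp', before)` afterwards; an empty result would suggest the download never started rather than landed elsewhere.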

rishabhsiitk commented 1 month ago

@umihico @aleksandar-devedzic I see you guys active here. Can you please help?

umihico commented 1 month ago

@rishabhsiitk Can you download the file with this code using the local Chrome on your computer?

rishabhsiitk commented 1 month ago

@umihico

@rishabhsiitk Can you download the file with this code using the local Chrome on your computer?

Yes, I am able to download it locally, but there I am not using headless-chromium. I am using the local Chrome with the code below:

import os

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options

opts = Options()
opts.add_argument("--headless")
opts.add_argument("--incognito")
opts.add_experimental_option('prefs', {
    "download.default_directory": os.getcwd(),
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True
})
browser = Chrome(options=opts)

umihico commented 1 month ago

How about trying locally again WITH the same options? To debug this, the differences between the environments should be minimized.

Also, you can try my repository instead of this one. It may work if this headless Chrome is broken somehow. The version of headless Chrome here is really old by now.
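The suggestion to keep both environments as similar as possible can be sketched as a single settings builder used everywhere, with only the download directory varying. This is an illustrative sketch: the function names `chrome_settings` and `apply_settings` are hypothetical, the window-related flags from the thread are left out for brevity, and `options` is assumed to be a Selenium `Options` object.

```python
def chrome_settings(download_dir):
    """Return (arguments, prefs) to be shared by every environment."""
    arguments = [
        "--headless",
        "--no-sandbox",
        "--incognito",
        "--disable-dev-shm-usage",
        "--single-process",
    ]
    prefs = {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "plugins.always_open_pdf_externally": True,
    }
    return arguments, prefs


def apply_settings(options, download_dir):
    """Apply the shared settings to a Selenium Options-like object."""
    arguments, prefs = chrome_settings(download_dir)
    for argument in arguments:
        options.add_argument(argument)
    options.add_experimental_option("prefs", prefs)
    return options
```

Locally this would be called with `os.getcwd()` and on Lambda with `'/tmp'`, so that any remaining difference in behaviour comes from the browser and driver binaries rather than from the flags.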

https://github.com/umihico/docker-selenium-lambda/

rishabhsiitk commented 1 month ago

@umihico Locally, it works with the options below:

import os

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options

opts = Options()
opts.add_argument("--headless")
opts.add_argument('--no-sandbox')
opts.add_argument("--incognito")
opts.add_argument("--disable-dev-shm-usage")
opts.add_argument("--single-process")
opts.add_argument('--start-maximized')
opts.add_argument('--start-fullscreen')
opts.add_argument("--window-size=1499x2200")
opts.add_experimental_option('prefs', {
    "download.default_directory": os.getcwd(),
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True
})
browser = Chrome(options=opts)

rishabhsiitk commented 1 month ago

@umihico Below is my local configuration: python==3.9, selenium==4.20.0.

The Chrome browser is Version 125.0.6422.141 (Official Build) (arm64).
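Given the large version gap between the local Chrome and the headless Chromium build used on Lambda, one possible explanation, not confirmed in this thread, is that older headless builds block downloads unless they are enabled through the DevTools command `Page.setDownloadBehavior`. The sketch below shows a commonly used workaround. It assumes the ChromeDriver in use exposes the `/session/$sessionId/chromium/send_command` endpoint and relies on Selenium's private `command_executor._commands` mapping; the helper names are hypothetical.

```python
def download_behavior_command(download_dir):
    """Build the DevTools payload that allows downloads into `download_dir`."""
    return {
        "cmd": "Page.setDownloadBehavior",
        "params": {"behavior": "allow", "downloadPath": download_dir},
    }


def allow_headless_downloads(driver, download_dir):
    """Ask the browser behind `driver` to permit downloads.

    Assumption: the driver is a Selenium Chrome WebDriver whose ChromeDriver
    supports the ChromeDriver-specific send_command endpoint.
    """
    # Register the endpoint under a local command name (private Selenium API).
    driver.command_executor._commands["send_command"] = (
        "POST",
        "/session/$sessionId/chromium/send_command",
    )
    return driver.execute("send_command", download_behavior_command(download_dir))
```

If this applies, `allow_headless_downloads(browser, '/tmp')` would be called once after creating the browser and before `browser.get(...)`. Whether it resolves the issue with this particular Chromium and ChromeDriver pair is untested here.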