seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License

Adding the ability to overwrite the downloaded_files folder? #2891

Closed landonb629 closed 3 months ago

landonb629 commented 3 months ago

I have SeleniumBase working in an AWS Lambda function deployed as a container. Lambda functions have only one writable directory, "/tmp", unless another service is added as a file system. Right now I keep getting errors because SeleniumBase tries to put files in "downloaded_files". Would it be an issue to add an option for setting the downloads directory?

Example of the proposed option:

with SB(download_dir="/tmp/example"):

mdmintz commented 3 months ago

Hello. That folder needs to be created in the location where the script is invoked. If the script lacks permission to create a folder there, you would run into the same issue even if you changed the name of the folder that gets created. I'd set things up so that the script has permission to create folders and files within the folder where the script is invoked.


Also, duplicate of https://github.com/seleniumbase/SeleniumBase/issues/2479#issuecomment-1936854189


SeleniumBase creates the downloaded_files folder in the working directory where pytest is invoked. Any click-initiated downloads will go there. It's also used for any special files needed. It also holds lock files to prevent issues with multi-threading. The folder resets at the start of every new pytest run so that past test runs don't interfere with new ones. The folder is hard-coded there to prevent issues. And since the folder is reset at the start of new pytest runs, you wouldn't want to use any other existing folders for it.
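Since the folder is created relative to the working directory, one possible approach for a read-only environment is to switch the working directory to a writable location before the session starts. The following is only a sketch under that assumption, not a confirmed SeleniumBase recipe: the `working_directory` helper and the `sb_workdir` folder name are hypothetical, and the behavior on AWS Lambda has not been verified here.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the process's working directory to `path`."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    try:
        yield os.getcwd()
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


start_dir = os.getcwd()

# Hypothetical writable location; on AWS Lambda this would sit under "/tmp".
writable_dir = os.path.join(tempfile.gettempdir(), "sb_workdir")

with working_directory(writable_dir) as cwd:
    # Anything that resolves "downloaded_files" relative to the working
    # directory would now resolve it under `writable_dir`.
    expected_downloads = os.path.join(cwd, "downloaded_files")
    # A SeleniumBase session would be started here (assumption, not shown).
```

The context manager restores the previous working directory on exit, so other code that depends on relative paths is not affected after the block finishes.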

There are lots of built-in test methods that are specially made for that folder, such as:

self.get_downloads_folder()

self.get_browser_downloads_folder()

self.get_path_of_downloaded_file(file)

self.is_downloaded_file_present(file)

self.delete_downloaded_file_if_present(file)
# Duplicates: self.delete_downloaded_file(file)

self.assert_downloaded_file(file)

self.get_downloaded_files(regex=None, browser=False)

self.get_data_from_downloaded_file(file, timeout=None, browser=False)

self.assert_data_in_downloaded_file(data, file, timeout=None, browser=False)

If you need files in a different folder, use Python's os, sys, and shutil libraries to copy the files you need into that folder after they have been downloaded to the downloaded_files folder.
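A minimal sketch of that copy step, using only the standard library. The `copy_downloads` helper, its `suffix` parameter, and the file names are hypothetical and exist only for illustration. In a real test, the source folder would presumably come from `self.get_downloads_folder()` rather than the throwaway folder built here.

```python
import os
import shutil
import tempfile


def copy_downloads(src_dir, dest_dir, suffix=""):
    """Copy regular files from src_dir into dest_dir; return the copied names."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        # Skip subfolders, and skip files that don't match the optional suffix.
        if not os.path.isfile(src_path) or not name.endswith(suffix):
            continue
        shutil.copy2(src_path, os.path.join(dest_dir, name))
        copied.append(name)
    return copied


# Demo with throwaway folders standing in for the real ones.
base = tempfile.mkdtemp()
downloads = os.path.join(base, "downloaded_files")
os.makedirs(downloads)
with open(os.path.join(downloads, "report.csv"), "w") as f:
    f.write("a,b\n1,2\n")
with open(os.path.join(downloads, "notes.lock"), "w") as f:
    f.write("")

result = copy_downloads(downloads, os.path.join(base, "kept"), suffix=".csv")
print(result)
```

Filtering by suffix keeps lock files and other helper files that live in the downloads folder out of the destination.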

There's also a pytest command-line option to archive existing downloads instead of deleting them: https://github.com/seleniumbase/SeleniumBase/blob/570910c6f6e079cddd00c23e826e202c6489f39e/README.md?plain=1#L673

(A copy of those downloads will go into a folder in the archived_files/ folder at the end of the pytest run.)