wkentaro / gdown

Google Drive Public File Downloader when Curl/Wget Fails
MIT License
4.21k stars 348 forks

Unsanitized filenames #340

Open OmarAHex opened 6 months ago

OmarAHex commented 6 months ago

Provide environment information

Python 3.11.5 on Windows

What OS are you using?

Windows 11

Describe the Bug

When downloading a folder, if any filename in the folder contains an asterisk, gdown attempts to create a temp file without first sanitizing its base name to remove invalid characters such as \/:?<>|. This is the error message: OSError: [Errno 22] Invalid argument: 'C:\Users\....filename.pdfo9gwxp5ltmp'

I'm pointing at the temp file because that's where the error pops up. I'm not sure whether the final file would have the same problem.
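
For illustration, here is a minimal sketch of how an unsanitized base name can end up in a temp file name. It assumes the temp file name is built by using the target's base name as a prefix, which is only inferred from the error message above. `try_create_temp` is a hypothetical helper, not gdown code.

```python
import os
import tempfile


def try_create_temp(base_name, directory):
    """Create a temp file whose name starts with base_name.

    Returns the path on success, or None if the OS rejects the name.
    """
    try:
        fd, path = tempfile.mkstemp(suffix="tmp", prefix=base_name, dir=directory)
    except OSError as exc:
        # On Windows, a character such as * in the prefix makes the name invalid.
        print(f"failed: {exc}")
        return None
    os.close(fd)
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical Drive file name containing an asterisk.
    print(try_create_temp("file*name.pdf", tmp_dir))
```

On Windows the call is expected to fail with an OSError like the one reported. On Linux and macOS an asterisk is a legal filename character, so the same call succeeds there.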

Expected Behavior

No response

To Reproduce

No response

OmarAHex commented 6 months ago

Another note: Google Drive allows slashes in file names, so by the time the download function receives the output path, it is impossible to tell whether a slash is part of a file name or represents a folder. At least some part of the sanitization would therefore need to happen in the download_folder file. (The error raised here comes from trying to write to a nonexistent folder.)
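
A small sketch of the ambiguity, using a hypothetical Drive file name and a hypothetical output directory. `posixpath` is used so the result is the same on every platform.

```python
import posixpath

drive_name = "report 1/2.pdf"  # hypothetical Drive file name containing a slash
output = posixpath.join("downloads", drive_name)

# Once joined, the slash inside the name looks exactly like a directory separator:
# the path now points into a folder "report 1" that was never created.
print(posixpath.split(output))  # ('downloads/report 1', '2.pdf')
```

This is why the name has to be cleaned before it is joined onto the output directory. After the join, the original file name can no longer be recovered from the path.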

OmarAHex commented 5 months ago

Also, line 185 in download_folder.py does "file.name.replace(osp.sep, '')", but this is not sufficient on Windows. There, osp.sep is "\", but "/" is also a valid separator, which gdown ignores. I think this entire issue could be resolved by replacing that line of code with a proper sanitization function.
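
A minimal sketch of what such a sanitization function could look like. `sanitize_filename` and the choice of "_" as the replacement are assumptions for illustration, not part of gdown. The sketch does not handle reserved Windows device names such as CON or NUL.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
# Both "/" and "\" are included, so the result is the same on every platform.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name, replacement="_"):
    """Return name with characters that are invalid on Windows replaced."""
    cleaned = _INVALID_CHARS.sub(replacement, name)
    # Windows also rejects names that end in a dot or a space.
    cleaned = cleaned.rstrip(". ")
    # Never return an empty name.
    return cleaned or replacement


print(sanitize_filename("a/b\\c*d?.pdf"))  # a_b_c_d_.pdf
```

Applying something like this to each Drive file name before it is joined onto the output directory would cover both the asterisk case and the slash case described above.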

wkentaro commented 4 months ago

@OmarAHex Can you give me an example so that I can reproduce?

OmarAHex commented 4 months ago

https://drive.google.com/drive/folders/12cQ4ltgbkBhltqylzSg7g0n7lcTV8zew