Serene-Arc / bulk-downloader-for-reddit

Downloads and archives content from reddit
https://pypi.org/project/bdfr
GNU General Public License v3.0

[BUG] OSError: File name too long #966

Closed: eagerm closed this issue 3 months ago

eagerm commented 3 months ago

Description

Some downloads have very long file names. When BDFR checks whether such a file has already been downloaded, the underlying stat() call fails with OSError (File name too long), and the exception is not caught.
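
The failure described above can be reproduced outside BDFR. The following is a minimal sketch, not BDFR code: `safe_exists` is a hypothetical helper that calls `os.stat()` directly and treats `ENAMETOOLONG` as "file not present". It assumes a file system with a per-name limit of about 255 bytes, which is typical on Linux.

```python
import errno
import os
import tempfile
from pathlib import Path


def safe_exists(path: Path) -> bool:
    """Return True if path exists; treat an over-long name as not present."""
    try:
        os.stat(path)
        return True
    except FileNotFoundError:
        return False
    except OSError as e:
        if e.errno == errno.ENAMETOOLONG:
            # A name longer than the file system allows cannot belong
            # to an existing file, so report it as missing.
            return False
        # Any other OSError (permissions, I/O errors) is still raised.
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # 300 characters plus an extension exceeds a 255-byte name limit.
        long_name = Path(tmp) / ("a" * 300 + ".jpg")
        print(safe_exists(long_name))
```

Catching only `ENAMETOOLONG` rather than every `OSError` keeps unrelated failures visible, which may be preferable to a blanket handler when diagnosing where the long name came from.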

Command

Environment (please complete the following information)

Logs

Not available

Patch

downloader.py: 112

            try:
                if destination.exists():
                    logger.debug(f"File {destination} from submission {submission.id} already exists, continuing")
                    continue
                elif not self.download_filter.check_resource(res):
                    logger.debug(f"Download filter removed {submission.id} file with URL {submission.url}")
                    continue
            except OSError as e:
                logger.error(f"Failed existence check {submission.id}: {e}")
                continue

Serene-Arc commented 3 months ago

Hi, is there a reason you haven't added logs?

eagerm commented 3 months ago

I didn't capture logs for the failing case.

Serene-Arc commented 3 months ago

There will always be logs. The BDFR writes them to a file in your configuration directory.

eagerm commented 3 months ago

I fixed the failure before I read the bug reporting process, which asks for logs. I did not go back, remove the fix, and rerun the failing case.

Serene-Arc commented 3 months ago

It's more complicated than that. We have code already that is supposed to detect the file system and then truncate the file names according to the system limits. If it's failing here, there's a deeper reason than your patch. If files are being written in a manner that causes them to later fail when being read, that's a problem.
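
For readers unfamiliar with the idea, truncating a name to a file system limit can be sketched as follows. This is an illustration only and not the BDFR implementation: `truncate_filename` and its `max_bytes` default of 255 are assumptions chosen for the example. The limit is counted in encoded bytes, not characters, so multi-byte characters need care at the cut point.

```python
def truncate_filename(stem: str, suffix: str, max_bytes: int = 255) -> str:
    """Shorten stem so that stem + suffix fits within max_bytes of UTF-8."""
    budget = max_bytes - len(suffix.encode("utf-8"))
    if budget <= 0:
        raise ValueError("suffix alone exceeds the byte limit")
    encoded = stem.encode("utf-8")[:budget]
    # errors="ignore" drops a multi-byte character that was cut in half
    # at the boundary instead of raising UnicodeDecodeError.
    return encoded.decode("utf-8", errors="ignore") + suffix


if __name__ == "__main__":
    name = truncate_filename("é" * 200, ".png")
    print(len(name.encode("utf-8")))
```

On POSIX systems the actual limit for a directory can be queried with `os.pathconf(directory, "PC_NAME_MAX")` instead of assuming 255. If a name built this way still fails at `stat()`, the path is probably being constructed or extended somewhere after truncation.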

Can you provide more details on your setup? Are you using any kind of network share? Windows or Linux?

eagerm commented 3 months ago

Fedora 40 Linux, local file system.

I'll back out my fix and see if I can reproduce the problem.

eagerm commented 3 months ago

log_output.txt