MalloyDelacroix / DownloaderForReddit

The Downloader for Reddit is a GUI application with some advanced features to extract and download submitted content from Reddit.
GNU General Public License v3.0

fix: get fewer broken downloads by not chunking #125

Closed: crccheck closed this 4 years ago

crccheck commented 4 years ago

Every guide ever says you should stream and download in chunks, but I found that the simplest way (reading the whole response at once, without chunking) ended up with fewer broken downloads.

zacker150 commented 4 years ago

You're supposed to download in chunks because files can be arbitrarily large. The naïve way loads the entire file into RAM. What happens if a user with 8 GB of RAM in their system attempts to download a 10 GB movie?
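For reference, here is a minimal sketch of the chunked approach, so the memory argument is concrete. It uses only the standard library (`urllib.request`) rather than whatever HTTP client the project actually uses, and the names `copy_in_chunks`, `download_to_file`, and the 64 KB default are made up for illustration.

```python
import urllib.request


def copy_in_chunks(source, destination, chunk_size=64 * 1024):
    """Copy a readable binary stream to a writable one, chunk_size bytes at a time.

    Only one chunk is held in memory at once, so peak memory use does not
    grow with the size of the file. Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


def download_to_file(url, path, chunk_size=64 * 1024):
    """Hypothetical helper: stream `url` to `path` without buffering the whole body."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as handle:
        return copy_in_chunks(response, handle, chunk_size)
```

The non-chunked equivalent would be a single `response.read()` followed by one write, which keeps the entire body in memory until it is written out.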

crccheck commented 4 years ago

Yup, chunking is best practice, but I've been testing this change enough that I thought it was worth sharing in case others are having lots of trouble with broken files too. I'll update the PR description to explain why this is just a WIP.

MalloyDelacroix commented 4 years ago

How often are you experiencing broken downloads because of this? I was not aware of this issue.

In my testing, larger chunk sizes led to much slower downloads (with 1 KB being the smallest chunk tested). Non-streamed downloads were by far the worst performers. If smaller chunks are causing downloads to break, maybe we can find a happy medium where files download correctly and download speed is maintained.

Of course, it's likely that this depends on the host machine and internet connection. So it may be better to make this user-adjustable, so users can fine-tune it for their machine.
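One way to look for that happy medium is to time the same copy loop at several chunk sizes. The sketch below is only an assumption about how such a comparison could be set up: it copies an in-memory payload, so it measures loop overhead rather than real network behavior, and `time_chunked_copy` and `compare_chunk_sizes` are hypothetical names, not functions from this project.

```python
import io
import time


def time_chunked_copy(payload, chunk_size):
    """Copy `payload` through an in-memory stream in chunks; return elapsed seconds."""
    source = io.BytesIO(payload)
    sink = io.BytesIO()
    start = time.perf_counter()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
    elapsed = time.perf_counter() - start
    # Sanity check: the copy must be byte-for-byte identical.
    if sink.getvalue() != payload:
        raise ValueError("copied data does not match the payload")
    return elapsed


def compare_chunk_sizes(payload, chunk_sizes):
    """Return a {chunk_size: seconds} mapping for each candidate size."""
    return {size: time_chunked_copy(payload, size) for size in chunk_sizes}
```

A real comparison would swap the in-memory source for an actual HTTP response and repeat each measurement several times, since results will vary with the machine and connection.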

zacker150 commented 4 years ago

Can you give me some example URLs to files that break when downloading via chunks?

crccheck commented 4 years ago

It's random. I see corruption for all sorts of sizes, even "small" 100 KB files. At first, I thought it was an incomplete download, but I compared Content-Length against the file size on disk and the sizes match.
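A matching size only rules out truncation; it says nothing about whether the bytes themselves are right. As a rough sketch of how the two checks differ, the snippet below pairs the size comparison with a SHA-256 digest, which could be compared between a chunked and a non-chunked download of the same URL. The hash comparison is a suggestion rather than something tried in this thread, and `size_matches` and `sha256_of_file` are hypothetical helper names.

```python
import hashlib
import os


def size_matches(path, content_length):
    """True if the file on disk has exactly the size the server advertised."""
    return os.path.getsize(path) == int(content_length)


def sha256_of_file(path, chunk_size=64 * 1024):
    """Hash a file incrementally so large files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Two files can pass `size_matches` against the same Content-Length and still produce different digests, which is the situation described above.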

crccheck commented 4 years ago

Will revisit after Database backend #126.