MalloyDelacroix / DownloaderForReddit

The Downloader for Reddit is a GUI application with some advanced features to extract and download submitted content from reddit.
GNU General Public License v3.0

Freeze while downloading #279

Open thany opened 2 years ago

thany commented 2 years ago

Describe the bug
After letting it download for a few hours, the UI freezes to the point that Windows adds "(Not Responding)" to the title bar and dims the main window. It really doesn't respond to anything, obviously.

However, the current download is still going. It's definitely downloading new files and adding them to disk. It's like only the GUI is frozen, and all the rest is fine. But it's also demanding 100% utilisation of one CPU core, so that's still not great.

I don't know exactly when this happened. I just found the program in this state after letting it do its thing for a couple of hours. So I think I'll let it carry on, as terminating the program will probably force every downloaded file to get downloaded again (it was like that in 2.x versions anyway - not sure if this has been fixed, or even if it can be fixed).

While typing this up, I also see the GUI "thawing" from time to time, only to freeze up almost immediately after. So it's not a permanent freeze. Something must be super busy. I wonder what part of the GUI needs that much processing power...

To illustrate: [screenshot attached]

I don't think memory usage like that is normal either, given that it's running on a database (and assuming the database isn't entirely loaded into memory)...

Environment Information

To Reproduce (optional)
Hard to say, as stated before. Let it download a long time.

For all but the most trivial of issues, please attach the latest log file.
Yeah, can I send it privately please, if necessary? There's a bit of naughtiness in there 😘

MalloyDelacroix commented 2 years ago

That kind of CPU and memory usage is definitely not normal. I can't imagine what would lead to that. The GUI runs on its own thread, so anything doing any substantial work should not affect it.
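For reference, the split is roughly the standard Qt worker-object pattern - a stripped-down sketch of the idea (illustrative only, not the actual DFR classes) looks like this:

# Minimal sketch of the GUI-thread / worker-thread split, assuming a PyQt5
# setup. Illustrative only - not the actual DownloaderForReddit classes.
import sys
from PyQt5.QtCore import QObject, QThread, pyqtSignal
from PyQt5.QtWidgets import QApplication, QLabel

class Worker(QObject):
    progress = pyqtSignal(str)          # reported back to the GUI thread
    finished = pyqtSignal()

    def run(self):
        for i in range(5):              # the heavy work lives here
            self.progress.emit(f"downloaded item {i}")
        self.finished.emit()

app = QApplication(sys.argv)
label = QLabel("waiting...")            # GUI objects stay on the main thread
label.show()

thread = QThread()
worker = Worker()
worker.moveToThread(thread)             # downloads run off the GUI thread
thread.started.connect(worker.run)
worker.progress.connect(label.setText)  # queued signal, repaints stay smooth
worker.finished.connect(thread.quit)
thread.start()

sys.exit(app.exec_())

So in theory a busy download loop should only ever starve the worker thread, never the event loop that repaints the window.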

You can email the log privately to downloaderforreddit@gmail.com. I'm the only one who can access it there. I doubt that whatever is making this happen will show up in the log, but it couldn't hurt.

Closing the app shouldn't have an effect on any downloaded files except for ones that are currently in the process of downloading. Everything is saved to the database immediately and very little is kept in memory without at least being backed up in the database.

thany commented 2 years ago

Come to think of it, maybe it has to do with the way Windows handles resolution/display changes. Since I'm accessing the program on a remote VM on the home server over RDP (so I can let it run and shut down my computer), the remote machine changes its resolution to fit the RDP session.

Could that be what's causing DFR to get confused in the GUI thread? Just speculating here, because I too wouldn't know how a GUI could work normally for hours and then just freeze up for no obvious reason, without anyone even interacting with it.

Also, logs have been sent. It even correctly rotated logs, so there's also a .log.1 file which probably just contains more of the same.

thany commented 2 years ago

Update: I've updated to 3.13.2 and this time I've left the RDP session connected.

It's not freezing yet, even after a few hours of purring along. This doesn't mean it's solved though - it might still need something to get it properly fixed.

Regardless of RDP sessions and changing display settings, there's still something going on that might be weird. While downloads are getting along perfectly fine, I see the CPU time spiking every ~5 seconds to somewhere around 15% for the duration of probably a second (it's hard to gauge short-lived spikes using Task Manager πŸ€·β€β™‚οΈ). This 15% is across all three CPU cores, so it's close to 50% on a single core. That seems a bit much for what it's doing.

It's also gradually building up memory usage. This might be indicative of a memory leak, or it might be by design (database cache). But it's at 1.2GB as I'm typing this, which does seem like a lot. Shortly after starting it was at 500MB, a little later 700MB, and now this. Memory is of course there to be used, but a buildup like that is unusual.

And I know it's an unfair comparison, but you can see Total Commander sitting there using almost nothing from the memory, and that one's been running for days, sometimes weeks on end, without ever closing it.

zacker150 commented 2 years ago

Are you by any chance downloading a long video? The only thing that I can think of which would use up that much CPU and memory is FFMPEG combining a large video.

What do you see when you click on the arrow next to DFR in task manager?

thany commented 2 years ago

Are you by any chance downloading a long video?

No, small videos and images were getting saved one after the other, so it likely wasn't doing a very large download.

What do you see when you click on the arrow next to DFR in task manager?

Nothing interesting. Just another entry "Downloader for Reddit". Just the one, no ffmpeg.

The fact that leaving the session connected hasn't seemingly triggered this problem really makes me suspect that a change in resolution or other display settings is causing the GUI to get blocked up.

thany commented 1 year ago

@MalloyDelacroix It's happening again on 3.14.1. No network I/O, no disk I/O, but it's blasting the CPU on all cores as hard as it can. It's not completely frozen as in "Not Responding" - the GUI is still responding as it should (so at least the program appears to be written well, using threads and whatnot).

This is an overview (from ProcessHacker) of the offending threads that are blocking up the CPU: [screenshot attached]

I hope this tells you more than it does me πŸ˜€

Here's another interesting tab: [screenshot attached]

Look at the number of I/O bytes read and compare it to the total time. That's just excessive. I'm not seeing the SSD going nuts, so I'm guessing it's some sort of internal I/O going on. Maybe memory access or something. It's still a lot. And 6.7 trillion CPU cycles, for a download program - am I reading that correctly? Wow πŸ˜€

The number of handles appears to be ever decreasing, which seems strange - what will happen when it hits 0, and why did it get so high in the first place?

Edit: I killed the program in hopes it wouldn't break anything. What else am I going to do from my end? But after starting it back up and starting a download, it goes right back to being stuck in the exact same way. It immediately hits a 700~900MB/s I/O rate to whatever device can handle that (the SSD is still basically idle), presumably until I kill it again.

thany commented 1 year ago

It unstuck itself after a good long while. I left it by itself, so I can't see how long it took. Too long, either way πŸ˜€

The questions that remain: how could this have happened, and what options do we have to prevent this?

thany commented 1 year ago

@MalloyDelacroix So I ran into this problem again. Any ideas by now, what could be causing this?

thany commented 1 year ago

I can have a look at the database view if that might tell us what the heck is making it so slow, but you'll then have to tell me exactly what to click on, because the database view really feels like a hastily implemented support tool :)

By the by, I just noticed this: [screenshot attached]

This tells me (correct me if I'm wrong) that it constantly opens and closes handles to the database file, and therefore constantly opens and closes the database - maybe even from multiple threads as well. This might explain why it's so slow, even though it isn't (or shouldn't be) deleting a massive amount of records.
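To make concrete what I mean (pure speculation on my end - I have no idea how DFR actually talks to the database), the difference between the two patterns would be roughly this:

# Hypothetical illustration only - not taken from DFR's code; the path and
# column names are guesses.
import sqlite3

DB_PATH = "dfr.db"  # made-up path

# Pattern A: a fresh connection (and file handle) for every single lookup.
def post_exists_per_call(post_id):
    conn = sqlite3.connect(DB_PATH)      # new handle opened...
    try:
        row = conn.execute(
            "SELECT 1 FROM post WHERE id = ? LIMIT 1", (post_id,)
        ).fetchone()
        return row is not None
    finally:
        conn.close()                     # ...and closed again right away

# Pattern B: one connection kept alive for the whole download session.
SHARED_CONN = sqlite3.connect(DB_PATH)

def post_exists_shared(post_id):
    row = SHARED_CONN.execute(
        "SELECT 1 FROM post WHERE id = ? LIMIT 1", (post_id,)
    ).fetchone()
    return row is not None

Pattern A would explain the handle churn in the screenshot; Pattern B (or a connection pool) would avoid it.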

thany commented 1 year ago

One more update: it's been sitting there downloading nothing for the last 12 hours, literally!

All it's been doing is using up CPU cycles, about a GB of memory, and tens of MB/s of reads on my C-drive. Downloads are supposed to be written to my H-drive - only the database and the program are on the C-drive.

So I decided to clear out the content and post tables using a SQLite tool. Things are much faster now. Seems like this program needs a function to clean up its own database... SQLite databases shouldn't be allowed to grow forever; it's not a database library suited for that. Proper database servers handle that better, but of course that doesn't fit in a standalone program, so... I dunno what the best option is in this case.

Maybe you could start by putting some indices on the tables, on the fields you need to access. Perhaps you were looping through my 800,000+ posts table "by hand" for each iteration of the download process, or something. Indices help. Maybe then it can be allowed to grow a bit bigger before it grinds to a halt once more.
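To sketch what I mean (the column names below are guesses from skimming the schema in my SQLite tool, so treat this as an illustration rather than a patch):

# Rough illustration only - "dfr.db" and the column names are guesses,
# not DFR's real schema.
import sqlite3

conn = sqlite3.connect("dfr.db")

# Index the columns the download loop has to filter on, e.g. the foreign
# keys that tie content and posts back to their parent objects.
conn.executescript("""
    CREATE INDEX IF NOT EXISTS idx_content_post ON content(post_id);  -- guessed column
    CREATE INDEX IF NOT EXISTS idx_post_author  ON post(author_id);   -- guessed column
""")

# A built-in cleanup routine could then periodically prune old rows and
# reclaim the space, so the file never grows without bound.
conn.execute("DELETE FROM content")
conn.execute("DELETE FROM post")
conn.commit()
conn.execute("VACUUM")   # actually shrinks the database file after the deletes
conn.close()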

MalloyDelacroix commented 1 year ago

I would not have thought that database access was the issue slowing it down this much. It sounds like I need to do a deep dive into making the database operate more efficiently and make a database cleanup module when I have the time to do so again.

Thanks for the testing and information. This will give me a solid direction to head in.

thany commented 1 year ago

No problem. Feel free to provide a test version if you're comfortable doing so, because I've kept my "slow database" around for future testing.

thany commented 1 year ago

@MalloyDelacroix Maybe this will help you along. These are the queries I need to execute to get the program "going" again:

delete from content;
delete from post;
delete from subreddit where id in (select id from reddit_object where new=1);
delete from user where id in (select id from reddit_object where new=1);
delete from reddit_object where new=1;

Now, I'm not 100% sure about the new=1 - this was seemingly the way to select whether an object is in one of the users or subreddits lists that is visible to the end user. I couldn't find any other reliable way to determine this.

So essentially this deletes anything that doesn't need to be kept around purely for leeching.

Another thing that was really interesting: when I reverse the first two queries, it takes absolutely forever. I let it sit for half an hour or so and then killed my SQLite tool. The above order deletes the same records in under a second. I don't know why this is, but hopefully it gives you some insight into why DFR is so slow - perhaps it tries to do things in a way that is massively slow in SQLite, where doing the same thing in a different order might be a lot snappier.

One other thing that stands out, possibly completely unrelated to the issue at hand, now that it's not constantly blasting the CPU at full horsepower anymore: it sometimes sits completely idle for seconds on end. No CPU, disk I/O, or network I/O at all. Just waiting, I guess, but waiting for what? Please note that I'm on 1Gbps fiber, so I hope it's not waiting for anything remote.

thany commented 1 year ago

Not only is this still happening, I've noticed a different kind of freeze.

I'm also seeing the program just waiting around. No notable CPU utilisation, no significant I/O activity, and nothing on the network. What's it doing? It appears to be waiting for something. But even "debug" logging does not reveal what it's actually currently doing.

And then when I stop the download, the whole program closes. Is that a crash? I don't see any errors... I wish this program was a "set and forget" kind of deal, but you really have to hold its hand. Stopping it and restarting it, cleaning out the database, over and over and over. And then when I start it right back up, it's purring along like a kitten as if nothing ever went south.

I don't get it. Something must be wrong buried in there somewhere.

MalloyDelacroix commented 1 year ago

I assume the app's database operations might be a weak point. Judging by the database file sizes some users have reported, I believe I way underestimated the number of downloads that users would be performing and the amount of data that would be stored in the database. I did not prioritize database efficiency enough because I didn't think it would ever be a problem. I was wrong.

I hope to fix this in the future when I have time to do a massive update.

There aren't any spots in the app that should hang for a significant amount of time. The longest downtime you should ever see is if you are extracting a large amount of content from reddit. Individual extracts are not reported to users, so this can look like the app is frozen. But this should not last as long as you are reporting.

If you have millions of content urls stored in your database, you may see better results if you disable the duplicate check. I suspect this query may be responsible for some delays in very large databases.
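To give an idea of why that check gets expensive (simplified - this is not the app's literal query or schema): it boils down to an existence lookup against every URL already stored, so on a table with millions of rows each new candidate download means a full scan unless that column is indexed.

# Simplified illustration of the duplicate-check idea - not the app's actual
# query; the url column name is simplified for illustration.
import sqlite3

conn = sqlite3.connect("dfr.db")  # placeholder path

def already_downloaded(url):
    # Without an index on content.url this is a full table scan for every
    # candidate download the extractor finds.
    row = conn.execute(
        "SELECT 1 FROM content WHERE url = ? LIMIT 1", (url,)
    ).fetchone()
    return row is not None

# A single index turns each check into a quick B-tree lookup instead.
conn.execute("CREATE INDEX IF NOT EXISTS idx_content_url ON content(url)")
conn.commit()

Until something like that is in place, turning the check off sidesteps the cost entirely.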

thany commented 1 year ago

I did not prioritize database efficiency enough because I didn't think it would ever be a problem. I was wrong.

I wouldn't say that. SQLite is a solid library, just not built to handle very large numbers of records as well as a true RDBMS does. Maybe a different system would've been better, maybe SQLite could've been set up better. But I'm assuming at the time you worked with the information you had, and I trust you made the best choice based on that. Things can change, and perhaps now a different setup makes more sense. Don't blame yourself, is what I'm saying.

I hope to fix this in the future when I have time to do a massive update.

That's okay. This is a hobby project after all, and I'm not demanding anything, so do take your time. I can cope in the meantime.

If I knew python I would've looked into it as well, but I'm totally clueless in python πŸ€·πŸ»β€β™‚οΈ - I'm okay with Node.js, but that ain't helping you, is it πŸ˜€ However, if you need me to try something out, or need to know anything, feel free to ask away.

you may see better results if you disable the duplicate check

Disabling the duplicate check is a viable thing to try. I've disabled it right away. Let's see how that goes.