arvidn / libtorrent

an efficient feature complete C++ bittorrent implementation
http://libtorrent.org

Memory leaks in 1.1 #456

Closed ruslanfedoseenko closed 8 years ago

ruslanfedoseenko commented 8 years ago

I tried to use the 1.1 branch in my torrent client and found that memory leaks exist (roughly 1–2 MB/s during an active download; after the download finishes, about 200–600 kB/s). I'm on Windows, Visual Studio 2013, boost 1.58. I compiled libtorrent with: bjam boost-link=static dht=on encryption=on mutable-torrents=on crypto=openssl link=static runtime-link=shared variant=release character-set=unicode debug-symbols=on i2p=on include="Z:\OpenSSL\include" library-path="Z:\OpenSSL\lib\VC\static" install. To detect memory leaks I use Visual Leak Detector: https://vld.codeplex.com/

Here is the results http://62.75.159.32/memleak.txt

arvidn commented 8 years ago

could you define exactly what you mean by "leak"? from what I can tell of your log, you don't destruct the session, so everything in the session appears to be considered a "leak". do you not have a reference to the session at the time of this snapshot? I'm trying to reproduce having anything leaked in client_test (pretty plain run of it, downloading some torrents). Can you reproduce these leaks in client_test? (also, collating leaks by stack trace would be a more user-friendly format to analyze I think)

ruslanfedoseenko commented 8 years ago

I've tested client_test. What I've seen: during the download, memory constantly grows (after downloading for 4 minutes, memory usage is 70 MB, with a 78 MB private working set). After the download finished, some memory was freed, down to 45 MB (54 MB private working set). After that, client_test starts seeding the torrent and memory grows constantly by 200–400 kB; in 10 minutes it increased from 45 to 97 MB (105 MB private working set). I ran it with this command: client_test -z -N -h -H -M -l 2000 -S 1000 -T 1000 -c 1000 test.torrent. OS is Win10. The total file size downloaded by the torrent is 1.3 GB. I didn't run it under the memleak profiling tool.

Here is a screenshot taken just before I shut down client_test.

arvidn commented 8 years ago

there's a good chance the memory usage you see is windows' disk cache. With that in mind, I think it makes sense to think about what the actual problem is. Do you have some other process running on your system that gets evicted in order for windows to cache disk I/O? If not, you may want to experiment to see whether windows would actually prioritize that memory usage over some other process'.

One way to find out would be to disable the actual writing to disk in libtorrent. If you pass in -0 (that's a zero, not a capital O) libtorrent will pretend to read and write data from and to disk, but it will really just discard it and produce garbage. For testing downloading that's fine; for testing seeding it's a bit harder, you would need a client that can disable hash checking (client_test can do that with -z).

arvidn commented 8 years ago

could you please run that client_test experiment under the visual studio memory profiler (with debug symbols and ideally a debug build)? It should pinpoint where most of the memory is allocated. We would expect about 16 MB to be the disk cache, the profiler should tell you where the other memory comes from.

arvidn commented 8 years ago

I've tried to reproduce this on windows with a similar setup (as far as I know, at least). This is with a 32-bit msvc-14 build on windows 7. I will try a 64-bit build as well to see if it makes a difference. So far I have not been able to reproduce it (most of my tests have been on OSX so far, but I've moved over to windows since it may be OS specific). Any other tips on configurations or options I should try are welcome.

(screenshot: libtorrent-1.1 leak attempt)

ghost commented 8 years ago

Have you been able to reproduce it with my steps? This build: http://builds.shiki.hu/temp/qbittorrent_3.4.0alpha_libtorrent_1.1_git_%20c1f7446c26881.7z

(screenshots: capture4, capture5)

But... maybe the problem is related to the qBittorrent implementation, not to libtorrent itself.

arvidn commented 8 years ago

I believe the logic for calculating the disk cache size for "auto" mode may have changed from 1.0.x and 1.1. Is it possible that the fact that you set it to 0 (auto) just makes the cache size a lot larger than you expect? does qbt tell you how big the cache is? ruslanfedoseenko reproduced it in client_test though, which makes me think it's libtorrent itself

ghost commented 8 years ago

Already thought about that before ;)

I also tried 1500 MB. Same problem: mem usage goes over 1.8 GB and it crashes.

arvidn commented 8 years ago

I ran the same experiment with a 64-bit build and saw no significant heap size increase. It is possible there's some specific setting that causes it. I'll run another test with the client_test arguments ruslanfedoseenko suggested.

arvidn commented 8 years ago

@ruslanfedoseenko @Isabelxxx which compiler did you compile libtorrent with? visual studio 2015 community is a free download and includes the heap profiler. It's quite finicky to get working; I only got the one that starts automatically when debugging to work (you have to enable it at start though, and explicitly take a snapshot once the cache is full, and then another snapshot when there's significantly more memory usage).

ruslanfedoseenko commented 8 years ago

I use Visual Studio 2013 and tried different versions of Boost (1.60, 1.58). In my client I set cache_size to 32 MB (the actual value in the settings is 2048, since the setting is expressed in 16 KiB blocks).

ruslanfedoseenko commented 8 years ago

My suspicion is that you may have circular references in the code. What I mean: shared_ptr A; shared_ptr B; A->someField = B; B->someField = A. As far as I know, in this case boost::shared_ptr cannot free the memory. I'll try later with VS 2015.

ghost commented 8 years ago

https://qbforums.shiki.hu/index.php/topic,4113.msg21284.html#msg21284

ghost commented 8 years ago

Maybe that helps. It seems the AUTO setting in qbit leads to unlimited usage in the latest builds using 1.1.

That was not the behavior in previous builds; auto was never unlimited, only a (big) conservative usage of RAM, but never leading to a crash; there was a (hard) limit. Maybe you changed something there? Or does qbit need some new tweaks to set the config right with 1.1?

That's why it appears "only" in qbit: it's reproducible with every new build using 1.1. As you suggested, something has changed in the cache settings. Auto was not working that way before, and it needs to be addressed since qbit uses auto by default.

arvidn commented 8 years ago

@Isabelxxx I see. I'll take a look at that. however, ruslan reproduced this with client_test, where it clearly says it has only allocated 16 MiB, so there may be something else too. I get the impression you see the cache grow indefinitely then? (or until the 32 bit address space is exhausted)

ghost commented 8 years ago

Yes, the cache grows indefinitely. Not until the 32-bit limit; it takes as much as the torrent sizes, with no limit.

The "strange" thing to me is rechecking is also using the write cache. I mean, setting to auto leads to ram usage = torrent size being rechecked. If I set it manually to lets say 100 Mb then it uses that memory for rechecking. I don't see why write cache is related to check something already written at all.

You also have to consider the case where you set the cache to, for example, 500 MB and are downloading or rechecking 5 torrents bigger than 500 MB at the same time. That could lead to a crash at some point, since the cache size seems to be not global but per torrent. So it's additive and without 32-bit limits.

None of these strange config behaviors existed in previous builds: the cache did not behave that way, it was not used for rechecking, and auto definitely had a limit.

PS: you can see the connection between rechecking and the write cache in the post where I set the cache to 700 MB. https://qbforums.shiki.hu/index.php/topic,4113.msg21284.html#msg21284

See the images.

arvidn commented 8 years ago

the disk cache is used to cache both reads and writes. If you'd like it to only cache writes you can disable the read cache (that's a separate option). If your upload rate is I/O bound, that will negatively impact your upload performance though. when checking a torrent, the blocks pulled into the cache are specifically marked as volatile, making them lower priority and the first ones to be replaced.

The cache size is definitely a global setting. There's only one cache shared by all torrents. If you're seeing that limit be exceeded, that's a bug. I'd be interested in knowing how you make that happen.

Leaving pieces read by a recheck in the cache is a feature; it makes the data available for high-performance seeding immediately afterward. The alternative is to throw it away and potentially re-read it once you start seeding. It doesn't cost you anything, assuming your cache size limit works and you set it deliberately; not using that RAM at all would be a waste of resources. If you don't like having pieces lying around in the cache, you can set an expiration time after which they're evicted. However, the blocks are still allocated in a pool allocator, which means the memory probably won't be returned to the runtime anyway. Even if it were returned to the runtime, the runtime would likely not return it to the OS (by unmapping that space). (The best option there may be something like madvise(MADV_DONTNEED).)
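The madvise(MADV_DONTNEED) idea can be sketched like this (POSIX/Linux-specific; this is an illustration of the system call, not libtorrent's allocator). On Linux, anonymous pages hit by MADV_DONTNEED are discarded and read back as zero, which is what lets the OS reclaim physical pages without the allocator unmapping anything:

```cpp
#include <cstring>
#include <sys/mman.h>

// Dirty an anonymous page, then hint the kernel to reclaim its physical
// memory without unmapping the address range. Returns true if the page
// reads back zero-filled afterwards (Linux behavior for anonymous maps).
bool dontneed_demo()
{
    const size_t len = 4096;
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;

    std::memset(p, 0xAB, len);                    // dirty the page
    if (madvise(p, len, MADV_DONTNEED) != 0) {    // drop the backing pages
        munmap(p, len);
        return false;
    }
    // the address range is still valid; the next access faults in a
    // fresh zero-filled page instead of the old contents
    bool zeroed = static_cast<unsigned char*>(p)[0] == 0;
    munmap(p, len);
    return zeroed;
}
```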

I don't see any images in that forum post.

Just to be clear, btw: the intended behavior for disk cache "auto" is to look at the amount of physical RAM in your machine and set the effective cache size to an eighth of that. On a 32-bit system, it's capped at 3/4 of 2 GB.

You can see the actual logic here: https://github.com/arvidn/libtorrent/blob/master/src/disk_buffer_pool.cpp#L479

Actually, looking at this, are you on a 32 bit system by any chance? it looks like there may be an overflow in the 32 bit calculation.

arvidn commented 8 years ago

I've found a few places that break support for the auto disk cache size, fixed in this patch: https://github.com/arvidn/libtorrent/pull/523

I believe there are still remaining issues that I'm looking into.

ghost commented 8 years ago

I'm on a 64-bit system with 12 GB of RAM.

And the qbit build is 32-bit, of course.

There is no option for a read cache in the program, just "Write cache size" (it's named that way, so I could not have guessed it's a global cache size) and enable/disable OS cache.

ghost commented 8 years ago

The images are at bottom, see attachments.

arvidn commented 8 years ago

I pushed another update to #523 . I'll keep disk cache related work in this branch. I have not been able to reproduce the ever-growing disk cache problem, but I did find a few edge cases that could potentially cause the limit to be exceeded (but not indefinitely).

ghost commented 8 years ago

Ok arvidn, let me help a bit more. What do you need me to do and install to check what happens with the cache? Lately I have been using this build: https://github.com/arvidn/libtorrent/pull/508#issuecomment-192535833

But I can compile a new one if you need it, just tell me what to do. I have the old Visual Studio 2008 Professional I used in the past; or should I use Visual Studio 2015?

arvidn commented 8 years ago

visual studio 2008 should be sufficient (that's 9.0 right?). I regularly build with msvc-9.0.

I would really appreciate it if you could build libtorrent from #523 and use that for your qbt. I'm not all that familiar with the build process of qbt, or whether it includes building libtorrent. To just build libtorrent you can use bjam; I would recommend downloading boost 1.60, building bjam, and configuring boost-build.

the specifics are done in the appveyor.yml script in the libtorrent root. (appveyor hosts come with boost, but building bjam, configuring it and building libtorrent is done by the appveyor script)

ghost commented 8 years ago

According to this, yes, I have to compile libtorrent first: https://github.com/qbittorrent/qBittorrent/wiki/Compiling-with-MSVC-2013-%28static-linkage%29

But it's a bit outdated judging by some comments in the forum, so it could take some time to get everything right... it's been years since I used MSVC, sorry!

Chocobo1 commented 8 years ago

@Isabelxxx If you're compiling qbt, better use MSVC2015.

arvidn commented 8 years ago

as the disk-cache-1.1 branch has landed in RC_1_1 now, I consider this solved. Please let me know if an issue like this re-appears.

arvidn commented 8 years ago

@Isabelxxx You may want to check out https://github.com/arvidn/libtorrent/pull/529. It's a patch to maintain a smaller disk cache footprint for checking torrents by actively evicting volatile pieces while checking.

ghost commented 8 years ago

It has been a bit difficult to get the build, but I can confirm the patch works with the latest alpha qbit release source code and your commit.

Set the cache to auto again, tried to recheck a 2 GB torrent, and now there's no crazy memory usage.

ghost commented 8 years ago

I still have to test memory usage during torrent downloads, but rechecking is now fixed.

EDIT: Memory usage is no longer growing indefinitely when downloading multiple torrents or a big torrent. RAM maxes out at ~600 MB for me after some time, independently of the torrent size, not like before when it grew proportionally to the download size. As you said, 3/4 of 2 GB for the 32-bit build.

All memory problems reported seem to be fixed now.

(screenshot: capture3)

ghost commented 8 years ago

Btw, I have a question... I see RAM increase while downloading until it reaches a certain limit in auto mode; that's ok. When the download finishes, that RAM is not freed, even if I stop sharing the torrent. I can even delete it: neither the memory commit size nor the private working set decreases. Does that mean the cache size is increased dynamically according to how much is needed, but is not resized down when no longer in use? I suppose you free the cache contents for new files (disk cache expiry interval), but the total allocated cache stays approximately fixed once it reaches the limit, even if it's not being used at all (?)