Lwxiang opened this issue 1 year ago
Do you know which object is leaking?
I haven't found the leaking object. Besides torrent_file(), calling handle.status() from a read_piece_alert handler also causes leaking.
@Lwxiang How do you interpret "memory leak"?
> I haven't found the leaking object. Besides torrent_file(), calling handle.status() from a read_piece_alert handler also causes leaking.

So each call to torrent_file() within the loop allocates some additional memory and never frees it? So if you (for example) purposely skip this call at some iterations, does it allocate less memory?
@glassez
> So each call to torrent_file() within the loop allocates some additional memory and never frees it?
yes
> So if you (for example) purposely skip this call at some iterations, does it allocate less memory?

With the test code above, if I add a 50GB file, which has 12800 pieces at 4096KB per piece, then calling torrent_file() about 2000 times (imprecisely) brings the memory usage to about 20.4GB RES (after all pieces are finished and there are no read alerts any more, and the memory is never released), while calling torrent_file() about 8000 times brings the memory usage to 37.1GB RES.

The call counts are not very precise, but you can see that the more torrent_file() calls I make, the more memory remains after all actions are done.
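As a rough sanity check on those figures (this is just arithmetic on the numbers quoted above, not a measurement):

$$
12800 \times 4096\,\mathrm{KiB} = 50\,\mathrm{GiB}, \qquad
\frac{20.4\,\mathrm{GiB}}{4\,\mathrm{MiB}} \approx 5200\ \text{pieces}, \qquad
\frac{37.1\,\mathrm{GiB}}{4\,\mathrm{MiB}} \approx 9500\ \text{pieces}
$$

So the leftover RES corresponds to thousands of piece-sized (4 MiB) buffers, which is relevant to the read_piece_alert discussion further down.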
> you can see that the more torrent_file() calls I make, the more memory remains after all actions are done.

It looks strange, considering that torrent_handle::torrent_file() isn't supposed to make a copy of the data, but only returns a shared pointer to the existing one.
> With the test code above, if I add a 50GB file, which has 12800 pieces at 4096KB per piece, then calling torrent_file() about 2000 times (imprecisely) brings the memory usage to about 20.4GB RES

Could you provide the modified code (with the limit on calls to torrent_file()) that you used to test it?
> Could you provide the modified code (with the limit on calls to torrent_file()) that you used to test it?

@glassez the limit can be changed via max_count:
#include <atomic>
#include <cstdio>
#include <functional>
#include <future>
#include <iostream>
#include <memory>
#include <mutex>
#include <vector>

#include <unistd.h> // sleep()/usleep()

// libtorrent
#include "libtorrent/alert_types.hpp"
#include "libtorrent/create_torrent.hpp"
#include "libtorrent/session.hpp"
#include "libtorrent/settings_pack.hpp"
#include "libtorrent/socket.hpp"
#include "libtorrent/torrent_flags.hpp"
#include "libtorrent/torrent_handle.hpp"
#include "libtorrent/torrent_info.hpp"
#include "libtorrent/torrent_status.hpp"

const lt::alert_category_t alertMask = lt::alert_category::status |
    lt::alert_category::file_progress |
    lt::alert_category::piece_progress |
    lt::alert_category::incoming_request |
    lt::alert_category::performance_warning |
    lt::alert_category::tracker |
    lt::alert_category::connect |
    lt::alert_category::peer;

int count = 0;
int max_count = 2000; // limit on the number of torrent_file() calls
std::mutex mutex;

void alert_handler(lt::session* s)
{
    while (true)
    {
        std::vector<lt::alert*> alerts;
        s->pop_alerts(&alerts);
        for (const auto alert : alerts)
        {
            switch (alert->type())
            {
            case lt::add_torrent_alert::alert_type:
            {
                std::cout << "torrent added" << std::endl;
                break;
            }
            case lt::read_piece_alert::alert_type:
            {
                mutex.lock();
                int current = ++count;
                mutex.unlock();
                if (current >= max_count)
                {
                    break;
                }
                // the call under test
                lt::alert_cast<lt::read_piece_alert>(alert)->handle.torrent_file();
                break;
            }
            case lt::piece_finished_alert::alert_type:
            {
                // request the piece data back as soon as the piece finishes
                lt::piece_finished_alert* a = lt::alert_cast<lt::piece_finished_alert>(alert);
                a->handle.read_piece(a->piece_index);
                break;
            }
            default:
                break;
            }
        }
        usleep(100000);
    }
}

int main()
{
    lt::settings_pack sp;
    sp.set_str(lt::settings_pack::listen_interfaces, "127.0.0.1:60020");
    sp.set_int(lt::settings_pack::alert_mask, alertMask);
    sp.set_int(lt::settings_pack::aio_threads, 16);
    sp.set_int(lt::settings_pack::alert_queue_size, 10000);
    auto s = lt::session(sp);
    sleep(1);

    auto f = std::async(std::launch::async, std::bind(&alert_handler, &s));
    sleep(1);

    lt::file_storage fs;
    lt::add_files(fs, "./testfile");
    lt::create_torrent torrent(fs);
    lt::set_piece_hashes(torrent, lt::parent_path("."));
    std::vector<char> tmpBuf;
    lt::bencode(std::back_inserter(tmpBuf), torrent.generate());

    lt::add_torrent_params p;
    p.ti = std::make_shared<lt::torrent_info>(tmpBuf.data(), tmpBuf.size());
    p.save_path = ".";
    p.flags = lt::torrent_flags::update_subscribe |
        lt::torrent_flags::need_save_resume;
    auto handle = s.add_torrent(p);
    sleep(1);

    // wait for the user to end
    char a;
    int ret = std::scanf("%c\n", &a);
    (void)ret; // ignore
    return 0;
}
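One detail about the repro worth noting: the handler thread sleeps 100 ms between pop_alerts() calls, so alerts (and any piece buffers they carry) can pile up during that window. A minimal alternative loop using libtorrent's session::wait_for_alert() is sketched below; this is only an illustration of draining alerts promptly, not a claim that it changes the reported behaviour:

```cpp
// Sketch: block until alerts arrive instead of polling every 100 ms.
// wait_for_alert() returns once at least one alert is queued (or the timeout
// expires); the following pop_alerts() then takes the whole batch.
void alert_handler_wait(lt::session* s)
{
    while (true)
    {
        s->wait_for_alert(lt::milliseconds(500)); // returns nullptr on timeout
        std::vector<lt::alert*> alerts;
        s->pop_alerts(&alerts);
        for (lt::alert* a : alerts)
        {
            // handle alerts as before; keep this loop cheap so the next
            // pop_alerts() can release this batch (and its piece buffers)
            (void)a;
        }
    }
}
```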
I tested this a few more times and found that the amount of memory leaked does not necessarily seem to be positively correlated with the number of calls. Maybe there is something wrong in my test code. But it is true that there is some memory that cannot be released after calling.
Maybe it's read_piece() that causes the problem?..
I strongly suspect that the memory being used is because of the (unbounded) number of calls to read_piece(). That call will cause libtorrent to allocate the size of one piece on the heap, copy that piece data into it and pass it back to you in an alert. Since you don't limit the number of simultaneous read_piece_alerts in the queue (i.e. it's unbounded) it's all a race to destruct those alert objects and free the piece memory. The faster you can free the alerts, the fewer of them will exist simultaneously.

i.e. you could most likely put any code there (e.g. sleeping for 1ms) and you'll see the memory usage increase.
> Since you don't limit the number of simultaneous read_piece_alerts in the queue (i.e. it's unbounded) it's all a race to destruct those alert objects and free the piece memory. The faster you can free the alerts, the fewer of them will exist simultaneously.

But doesn't @Lwxiang claim that the allocated memory remains unreleased even after all read_piece_alerts are processed (and supposed to be freed)? Or did I misinterpret it?
> But doesn't @Lwxiang claim that the allocated memory remains unreleased even after all read_piece_alerts are processed (and supposed to be freed)? Or did I misinterpret it?

Yes. If the memory were just allocated and then freed later, that would be fine. But after all read_piece_alerts are processed, the memory remains and is never freed.

This is not the memory usage during processing; it is measured after all read_piece_alerts are processed, and after waiting about half an hour.

I am pretty sure that the memory leaks, but I don't know why.
can you build with address sanitizer?
> can you build with address sanitizer?

I have tried; no leak error shows.
> i.e. you could most likely put any code there (e.g. sleeping for 1ms) and you'll see the memory usage increase.

YES, I have tried as you said.

The point is not torrent_file() or status(); the point is the processing time. The longer the processing takes, the more memory exists simultaneously. So just calling usleep(10000) there can also reproduce the case.

So maybe the unbounded number of read_piece_alerts being sent into the queue prevents the old alerts from being released?

If I call torrent_file() or other high-cost procedures in the read_piece_alert handler, the current generation of alerts lives longer (before being cleaned up) than before. That in turn means the next generation contains more read alerts (the alerts arrive at the same rate), so the next generation stays alive even longer, as shown in the following picture:

The memory usage climbs higher and higher during processing because the alerts live longer and longer. But notice the last generation: it only takes 2 seconds to process 5k, because the file checking finished and the torrent status changed to seeding.
So the question becomes the following:
- why does file checking slow down the processing speed of torrent_file() and other procedures?
libtorrent doesn't use a mutex-based approach to obtain data from the working thread; instead it queues "requests" to be invoked in that thread, so you need to wait until all currently queued jobs are done.
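To illustrate the pattern described above with a generic sketch (this is not libtorrent's actual implementation, just the general "post a request to the worker thread and wait" technique): the caller's latency is everything already sitting in the queue, which is why blocking torrent_handle calls feel slow while the session thread is busy checking files.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>

std::mutex m;
std::condition_variable cv;
std::queue<std::function<void()>> jobs;

// Runs on the worker thread: executes queued jobs in order.
void worker_loop()
{
    for (;;)
    {
        std::function<void()> job;
        {
            std::unique_lock<std::mutex> l(m);
            cv.wait(l, [] { return !jobs.empty(); });
            job = std::move(jobs.front());
            jobs.pop();
        }
        job(); // may be a long-running job, e.g. hashing a piece during a recheck
    }
}

// What a blocking "query" looks like from the calling thread.
int sync_query()
{
    std::promise<int> result;
    auto fut = result.get_future();
    {
        std::lock_guard<std::mutex> l(m);
        jobs.push([&result] { result.set_value(42); }); // computed on the worker thread
    }
    cv.notify_one();
    return fut.get(); // waits behind all previously queued jobs
}
```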
For now, I just forbid read-piece actions while the torrent is in the checking-files state, to avoid the problem.
@Lwxiang it seems to me that the problem with unbounded memory usage is fundamentally caused by you not having a limit on the number of pieces you keep in RAM at a time.
Is there a reason you cannot have a cap on that? It doesn't seem very robust to assume that you can handle alerts at a high enough rate to stay within some reasonable memory usage. It might work under the right circumstances on your computer, but perhaps not on others' computers.
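A minimal sketch of such a cap, applied to the test program above (the names outstanding_reads, max_outstanding, pending_pieces and the two helper functions are made up for this example): only issue read_piece() when fewer than N pieces are in flight, and decrement the counter when the corresponding read_piece_alert is handled. This approximately bounds the number of piece-sized buffers alive at any moment to N.

```cpp
#include <atomic>
#include <deque>
#include <mutex>

// hypothetical bookkeeping for this sketch
std::atomic<int> outstanding_reads{0};
constexpr int max_outstanding = 16;           // at most ~16 piece buffers in flight
std::deque<lt::piece_index_t> pending_pieces; // pieces we still want to read
std::mutex pending_mutex;

// called from the piece_finished_alert handler
void maybe_request_piece(lt::torrent_handle& h, lt::piece_index_t idx)
{
    std::lock_guard<std::mutex> l(pending_mutex);
    if (outstanding_reads.load() < max_outstanding)
    {
        ++outstanding_reads;
        h.read_piece(idx);
    }
    else
    {
        pending_pieces.push_back(idx); // defer until a buffer is released
    }
}

// called from the read_piece_alert handler, after consuming the buffer
void on_read_piece_done(lt::torrent_handle& h)
{
    --outstanding_reads;
    std::lock_guard<std::mutex> l(pending_mutex);
    if (!pending_pieces.empty())
    {
        lt::piece_index_t next = pending_pieces.front();
        pending_pieces.pop_front();
        ++outstanding_reads;
        h.read_piece(next);
    }
}
```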
I see: the more pieces I keep in RAM at a time, the more memory is used at that time, that is right. What I don't understand is why, after all read_piece_alerts are cleaned up, the memory usage is still high and never comes down (I don't keep any data from the piece buffers).

What I expect to see is the memory usage going up, maybe very high, but finally falling back down after all actions are done.

But what I actually see is that, if I call read_piece() many times during the checking-files state, the memory looks like the following, even after I remove the torrent.
Understanding that requires first understanding what you mean when you say "memory usage". There are at least 3 levels involved:

1. libtorrent calling malloc() and free()
2. the run-time library implementing malloc() and free() (and deciding when to request or return address space)
3. the kernel mapping physical pages into the process (resident pages)
I expect that you look at resident pages, which is in the realm of (3) whereas libtorrent is in (1). Perhaps the run-time library is predicting that the process may have another spike in memory allocations any moment, and defers returning parts of its address space to the kernel. Or perhaps it ended up being fragmented making it difficult for the runtime library to return it.
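One cheap way to test that hypothesis on glibc (this is a generic glibc facility, nothing libtorrent-specific): once the torrent is done and all alerts have been drained, ask the allocator to hand freed memory back to the kernel and watch whether RES drops. If it does, the memory was free()d at level (1) but retained at level (2) rather than leaked.

```cpp
// Diagnostic sketch (glibc only): compare RES before and after this call.
#include <malloc.h>

void trim_heap_for_diagnosis()
{
    // malloc_trim(0) returns 1 if memory was actually released to the system
    int released = malloc_trim(0);
    (void)released;
}
```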
In order to know whose behavior you need to understand, it would make sense to also profile the memory usage at the malloc() and free() level, with a heap profiler for instance.
Note that the alerts from the last pop_alerts are not freed until you call pop_alerts() again.
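To make that concrete with the read_piece_alert fields available in 1.2 (buffer, size, piece, error): copy out whatever you need inside the loop and let the alert go; the piece-sized buffer is only released when the alert batch is destroyed by the next pop_alerts() call, so anything slow done per alert keeps every buffer in the current batch alive. A sketch of how the handler case could look:

```cpp
// Sketch: keep only what you need from a read_piece_alert.
case lt::read_piece_alert::alert_type:
{
    auto* ra = lt::alert_cast<lt::read_piece_alert>(alert);
    if (ra->error) break;

    // copy the piece data out; ra->buffer (a shared buffer of ra->size bytes)
    // stays alive until the next pop_alerts() destroys this alert batch
    std::vector<char> piece_data(ra->buffer.get(), ra->buffer.get() + ra->size);
    // ... hand piece_data off to another thread/queue and do the heavy work there
    break;
}
```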
I tried to look into this problem again recently.

I found that, if I do lots of read_piece() during file checking, the size of the read_lru1 list in block_cache gets pretty high, and it is not released after all actions are done.

I think maybe maybe_free_piece skips some evict actions. Look at the function below: we skip the evict if pe->marked_for_eviction is false. But I think the piece should still be evicted with allow_ghost? The call below uses marked_for_eviction to decide the parameter of evict_piece. Maybe we should delete the line || !pe->marked_for_eviction?
bool block_cache::maybe_free_piece(cached_piece_entry* pe)
{
    if (!pe->ok_to_evict()
        || !pe->marked_for_eviction
        || !pe->jobs.empty())
        return false;

    DLOG(stderr, "[%p] block_cache maybe_free_piece "
        "piece: %d refcount: %d marked_for_eviction: %d\n"
        , static_cast<void*>(this)
        , int(pe->piece), int(pe->refcount), int(pe->marked_for_eviction));

    tailqueue<disk_io_job> jobs;
    bool removed = evict_piece(pe, jobs
        , pe->marked_for_deletion ? disallow_ghost : allow_ghost);
    TORRENT_UNUSED(removed); // suppress warning
    TORRENT_PIECE_ASSERT(removed, pe);
    TORRENT_PIECE_ASSERT(jobs.empty(), pe);
    return true;
}
by the way, is there any block cache in version 2.0.x? I can not find it.
> I found that, if I do lots of read_piece() during file checking, the size of the read_lru1 list in block_cache gets pretty high, and it is not released after all actions are done.

It's only released if the cache size is near its max size. Is "pretty high" greater than your cache size?

It seems pretty obvious that the memory is used up by read_piece_alerts, holding a buffer for a whole piece. Can you confirm that that's not the case?

> by the way, is there any block cache in version 2.0.x? I can not find it.

No, there's not. It uses the page cache (via memory mapped files) instead.
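For reference, in 1.2.x the block cache size is controlled by settings_pack::cache_size, expressed in 16 KiB blocks, so "pretty high" can be compared against that value. A minimal example (the value 4096 here is arbitrary, about 64 MiB):

```cpp
lt::settings_pack sp;
// cache_size is expressed in 16 KiB blocks: 4096 blocks = 64 MiB of block cache
sp.set_int(lt::settings_pack::cache_size, 4096);
// (this setting no longer applies in 2.0.x, which relies on the OS page cache instead)
```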
Two parts: the read_piece_alerts hold some memory, and the block_cache holds another part.

According to the code, it seems only the ghost lists have a max_size; the lru1 list does not. The || !pe->marked_for_eviction condition means the evict_piece call below it will never be made with allow_ghost.

I monitored the cache list size during the test. By default the list size is 32, but the lru1 size increases to thousands and is not released after all actions are done.
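For anyone else wanting to reproduce that measurement without patching libtorrent: the session-stats counters expose the block-cache gauges, assuming I'm remembering the metric names right ("disk.arc_mru_size" should correspond to the lru1/MRU list and "disk.read_cache_blocks" to the total read cache; treat the names as an assumption and check them against the stats docs):

```cpp
// Sketch: periodically dump block-cache gauges via session stats.
// needs #include "libtorrent/session_stats.hpp" for find_metric_idx
int const mru_idx = lt::find_metric_idx("disk.arc_mru_size");
int const read_blocks_idx = lt::find_metric_idx("disk.read_cache_blocks");

s.post_session_stats(); // results arrive as a session_stats_alert

// in the alert loop:
//   if (auto* st = lt::alert_cast<lt::session_stats_alert>(alert))
//       std::cout << "arc_mru_size: " << st->counters()[mru_idx]
//                 << " read_cache_blocks: " << st->counters()[read_blocks_idx] << "\n";
```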
> I monitored the cache list size during the test. By default the list size is 32, but the lru1 size increases to thousands and is not released after all actions are done.
This looks like this: https://github.com/arvidn/libtorrent/commit/e5fe95f92273bd775961e2f1f3b4ec45f8012d9e
Please test.
libtorrent version (or branch): 1.2.18
platform/architecture: x86_64 centos7.2
compiler and compiler version: gcc 7.3.1
I add a big file into the lt session, and the file checking lasts for a while. During the file checking, if I call torrent_info() from the read_piece_alert's handle, there is some memory leak.

The case can easily be reproduced with the following code:
This code just creates a torrent from ./testfile and adds it into the session. I create a big file as testfile first (e.g. a 50GB file), then just run the test code, and the memory leak occurs. But after file checking, the memory does not leak any more when I keep calling read_piece().