snowyu / libtorrent

Automatically exported from code.google.com/p/libtorrent

rb_libtorrent asserts after 2-5 hours of running deluged #110

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Running deluge for 2-5 hours

What version of the product are you using? On what operating system?

Running an up-to-date (as of a few days ago) Gentoo system x86. Deluge version 
1.3.0_rc2. Router DIR-655.

Originally I was getting a segfault with rb_libtorrent 0.15.1. Then I tried 
0.15.2, same thing. Then I tried 0.15.2 with debug info.

Here's a trace of the assert I get when running with debug info:

version: 0.15.2.0
$Rev: 4758 $
file: 'bt_peer_connection.cpp'
line: 1603
function: bool libtorrent::bt_peer_connection::dispatch_message(int)
expression: false
stack:
1: assert_fail(char const*, int, char const*, char const*)
2: libtorrent::bt_peer_connection::dispatch_message(int)
3: libtorrent::bt_peer_connection::on_receive(boost::system::error_code const&, unsigned int)
4: libtorrent::peer_connection::on_receive_data_nolock(boost::system::error_code const&, unsigned int)
5: libtorrent::peer_connection::on_receive_data(boost::system::error_code const&, unsigned int)
6: boost::asio::detail::handler_queue::handler_wrapper<boost::asio::detail::binder2<libtorrent::peer_connection::allocating_handler<boost::_bi::bind_t<void, boost::_mfi::mf2<void, libtorrent::peer_connection, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<boost::intrusive_ptr<libtorrent::peer_connection> >, boost::arg<1>, boost::arg<2> > >, 256u>, boost::system::error_code, unsigned int> >::do_call(boost::asio::detail::handler_queue::handler*)
7: boost::asio::detail::task_io_service<boost::asio::detail::epoll_reactor<false> >::run(boost::system::error_code&)
8: libtorrent::aux::session_impl::operator()()
9: boost::detail::thread_data<boost::reference_wrapper<libtorrent::aux::session_impl> >::run()
10: thread_proxy
11:
12: clone

Started happening when I got my new router, the DIR-655. Perhaps it's something 
to do with my new network hardware/configuration?

If you need more info just ask.

Patrik

Original issue reported on code.google.com by Gornic...@gmail.com on 12 Sep 2010 at 3:50

GoogleCodeExporter commented 9 years ago
There are numerous asserts in that function. Could you run in gdb, and once it 
hits this assert type this on the gdb prompt:

(gdb) bt

It will print a similar stack trace, but with line numbers.
To run deluge in gdb, type:

gdb --args python deluge

(and possibly some arguments after that)

Original comment by arvid.no...@gmail.com on 14 Sep 2010 at 5:11

GoogleCodeExporter commented 9 years ago
Actually, on a second look at the 0.15.2 branch, the line number for that 
assert suggests that a peer is sending an unknown message ID. 
This probably means that the BitTorrent encryption failed in some way: either 
the other peer or libtorrent has a bug in its encryption. In release builds, 
this does not trigger a crash; it just disconnects the peer.

To find the crash that occurs in release builds, simply comment this assert out 
and keep running in debug mode.
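
To illustrate the difference described above between debug and release builds, here is a minimal C++ sketch. It is not libtorrent's actual code: `dispatch_message_sketch`, `dispatch_result`, and the `SKETCH_DEBUG_ASSERTS` macro are hypothetical names, and only the ten core BitTorrent message IDs (0 to 9) are treated as known, whereas a real client also handles extension messages.

```cpp
#include <cassert>

// Hypothetical outcome of handling one incoming message.
enum class dispatch_result { handled, disconnect_peer };

// Core BitTorrent wire messages use IDs 0 (choke) through 9 (port).
// Simplification: extension messages are ignored in this sketch.
constexpr int num_core_messages = 10;

inline bool is_known_message(int id)
{
    return id >= 0 && id < num_core_messages;
}

// Sketch of a dispatcher: an unknown ID aborts in "debug" builds
// (when SKETCH_DEBUG_ASSERTS is defined) and otherwise just asks the
// caller to drop the peer.
inline dispatch_result dispatch_message_sketch(int id)
{
    if (!is_known_message(id))
    {
#ifdef SKETCH_DEBUG_ASSERTS
        assert(false && "peer sent an unknown message ID");
#endif
        return dispatch_result::disconnect_peer;
    }
    // A real implementation would call the handler for this ID here.
    return dispatch_result::handled;
}
```

With the macro undefined, `dispatch_message_sketch(200)` returns `dispatch_result::disconnect_peer` instead of aborting, which matches the release-build behaviour described in this comment. A garbled ID of this kind is what one would expect if a stream were decrypted with the wrong state.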

Original comment by arvid.no...@gmail.com on 22 Sep 2010 at 5:16

GoogleCodeExporter commented 9 years ago
OK, sorry for the late reply; things have been quite busy.

After I got that assert, I started it again and got the following different 
assert. Not sure if this one will be related to the crash either ...

version: 0.15.2.0
$Rev: 4758 $
file: 'peer_connection.cpp'
line: 4881
function: void libtorrent::peer_connection::check_invariant() const
expression: picker_count == count
stack:
1: assert_fail(char const*, int, char const*, char const*)
2: libtorrent::peer_connection::check_invariant() const
3: libtorrent::invariant_checker_impl<libtorrent::peer_connection> libtorrent::make_invariant_checker<libtorrent::peer_connection>(libtorrent::peer_connection const&)
4: libtorrent::peer_connection::on_receive_data_nolock(boost::system::error_code const&, unsigned int)
5: libtorrent::peer_connection::on_receive_data(boost::system::error_code const&, unsigned int)
6: boost::asio::detail::handler_queue::handler_wrapper<boost::asio::detail::binder2<libtorrent::peer_connection::allocating_handler<boost::_bi::bind_t<void, boost::_mfi::mf2<void, libtorrent::peer_connection, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<boost::intrusive_ptr<libtorrent::peer_connection> >, boost::arg<1>, boost::arg<2> > >, 256u>, boost::system::error_code, unsigned int> >::do_call(boost::asio::detail::handler_queue::handler*)
7: boost::asio::detail::task_io_service<boost::asio::detail::epoll_reactor<false> >::run(boost::system::error_code&)
8: libtorrent::aux::session_impl::operator()()
9: boost::detail::thread_data<boost::reference_wrapper<libtorrent::aux::session_impl> >::run()
10: thread_proxy
11: 
12: clone
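
The trace above shows `check_invariant()` being reached through `make_invariant_checker` from within `on_receive_data_nolock`. A common way to build that kind of check is a scope guard that validates an object when a function is entered and again when it returns. The following is a rough sketch of the pattern only, not libtorrent's implementation: `connection_sketch`, `invariant_guard`, `on_receive_sketch`, and the meaning given to `picker_count` here are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a peer connection. It tracks requested blocks
// locally, while a separate counter plays the role of the picker's view.
struct connection_sketch
{
    std::vector<int> requested_blocks; // blocks this connection has requested
    std::size_t picker_count = 0;      // assumed: the picker's count for this peer

    // The two views must agree, mirroring the `picker_count == count` expression.
    bool invariant_holds() const
    {
        return picker_count == requested_blocks.size();
    }

    void check_invariant() const { assert(invariant_holds()); }

    void request_block(int block)
    {
        requested_blocks.push_back(block);
        ++picker_count;
    }
};

// Scope guard: checks the invariant when created and again when destroyed.
template <class T>
class invariant_guard
{
public:
    explicit invariant_guard(T const& obj) : m_obj(obj) { m_obj.check_invariant(); }
    ~invariant_guard() { m_obj.check_invariant(); }
    invariant_guard(invariant_guard const&) = delete;
    invariant_guard& operator=(invariant_guard const&) = delete;

private:
    T const& m_obj;
};

// Example handler: a bookkeeping bug inside this scope would trip the guard.
inline void on_receive_sketch(connection_sketch& c, int block)
{
    invariant_guard<connection_sketch> guard(c);
    c.request_block(block);
}
```

In a design like this, the assert fires in whichever guarded function first observes the two counters out of sync. That function is not necessarily where the bookkeeping went wrong, which may be part of why such failures are hard to track down.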

I then decided to compile it again without debug symbols to see if I could get 
any more information about the crash. When it was crashing on me, it had been 
started by Gentoo's rc system, so I had limited information about what was 
happening. So I ran it from the command line with some logging and left it on 
overnight. Nothing happened. I left it for a few more days and still nothing. 
So I ran it again using Gentoo's rc script and, again, nothing happened after a 
few days of running.

I haven't tried to explicitly reproduce it since then, but it hasn't crashed on 
me. The interesting thing is that, as far as I know, nothing changed with 
respect to my system or network.

I might try to recompile it and see if I can reproduce it at some point in the 
near future, but for the time being I cannot reproduce it.

Patrik

Original comment by Gornic...@gmail.com on 6 Oct 2010 at 1:33

GoogleCodeExporter commented 9 years ago
Hi.

That assert has been haunting me for a long time now, probably almost a year. 
This last weekend I finally figured it out (probably for the 3rd time or so, 
but I'm pretty sure I got it this time). So, I don't think this assert will be 
triggered in the 0.15.4 release anymore.

Thanks for all your testing!

Original comment by arvid.no...@gmail.com on 6 Oct 2010 at 7:29

GoogleCodeExporter commented 9 years ago
I have fixed both asserts mentioned in this ticket. If they are still 
triggered, please re-open it.

Original comment by arvid.no...@gmail.com on 18 Oct 2010 at 1:08