JenSte closed this issue 3 years ago
Note that similar stack traces were posted in #68, however the program posted here crashes directly on the first message that it tries to send, independently of any timing between messages.
I also ran the test cases that come with azmq, all of them passed.
Hi, I'm having a problem that looks very similar:
==3828== Invalid read of size 8
==3828== at 0x4C367EE: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3828== by 0x26B2A8: boost::asio::detail::buffer_copy_1(boost::asio::mutable_buffer const&, boost::asio::const_buffer const&) (buffer.hpp:2180)
==3828== by 0x272A9F: unsigned long boost::asio::detail::buffer_copy<boost::asio::mutable_buffer const*, boost::asio::const_buffer const*>(boost::asio::detail::one_buffer, boost::asio::detail::one_buffer, boost::asio::mutable_buffer const*, boost::asio::mutable_buffer const*, boost::asio::const_buffer const*, boost::asio::const_buffer const*) (buffer.hpp:2189)
==3828== by 0x2704AE: unsigned long boost::asio::buffer_copy<boost::asio::mutable_buffers_1, boost::asio::const_buffer>(boost::asio::mutable_buffers_1 const&, boost::asio::const_buffer const&) (buffer.hpp:2371)
==3828== by 0x26DE8D: azmq::message::message(boost::asio::const_buffer const&) (message.hpp:79)
==3828== by 0x27C643: boost::enable_if<boost::has_range_const_iterator<boost::asio::const_buffers_1>, unsigned long>::type azmq::detail::socket_ops::send<boost::asio::const_buffers_1>(boost::asio::const_buffers_1 const&, std::unique_ptr<void, azmq::detail::socket_ops::socket_close>&, int, boost::system::error_code&) (socket_ops.hpp:284)
==3828== by 0x27A979: azmq::detail::send_buffer_op_base<boost::asio::const_buffers_1>::do_perform(azmq::detail::reactor_op*, std::unique_ptr<void, azmq::detail::socket_ops::socket_close>&) (send_op.hpp:68)
==3828== by 0x26E501: azmq::detail::reactor_op::do_perform(std::unique_ptr<void, azmq::detail::socket_ops::socket_close>&) (reactor_op.hpp:29)
==3828== by 0x26E93E: azmq::detail::socket_service::per_descriptor_data::perform_ops(boost::intrusive::list<azmq::detail::reactor_op, boost::intrusive::member_hook<azmq::detail::reactor_op, boost::intrusive::list_member_hook<>, &azmq::detail::reactor_op::member_hook_> >&, boost::system::error_code&) (socket_service.hpp:130)
==3828== by 0x26F34E: azmq::detail::socket_service::reactor_handler::operator()(boost::system::error_code, unsigned long) const (socket_service.hpp:625)
==3828== by 0x280B7C: boost::asio::detail::binder2<azmq::detail::socket_service::reactor_handler, boost::system::error_code, unsigned long>::operator()() (bind_handler.hpp:164)
==3828== by 0x2800D0: void boost::asio::asio_handler_invoke<boost::asio::detail::binder2<azmq::detail::socket_service::reactor_handler, boost::system::error_code, unsigned long> >(boost::asio::detail::binder2<azmq::detail::socket_service::reactor_handler, boost::system::error_code, unsigned long>&, ...) (handler_invoke_hook.hpp:69)
I tried to debug it a bit, and apparently the problem is that azmq is only storing a reference to the boost::asio::const_buffer object in the internal send_buffer_op object in its operation queue. So if the original boost::asio::const_buffer object goes out of scope (such as when returning from start_sending() in the above test program with crash = true), then azmq has a dangling reference that it tries to access the next time it tries to perform that operation.
The symptoms seem to vary: the message::message() constructor is given a completely invalid boost::asio::const_buffer, and then anything can happen. Sometimes zmq_msg_init_size() fails with ENOMEM due to a ridiculously huge buffer size, and sometimes it crashes in boost::asio::buffer_copy(), because the source buffer is an invalid address or null pointer.
Looking at reactive_socket_send_op_base in boost::asio, it stores a copy of the ConstBufferSequence it's given. Seems to me like azmq should do the same. After all, the boost::asio docs state that the "buffers object may be copied as necessary", and only mention the caller retaining ownership of the "underlying memory blocks", not the buffer object. And interestingly enough, azmq's receive_buffer_op_base is already doing it.
When a boost::asio::buffer is passed to a socket's async_send(), a crash occurs somewhere deep down in asio/azmq when the azmq::message is constructed. No crash occurs if the azmq::message is constructed from the same underlying object and then passed to async_send().
Demonstration program:
As the crash only happens when the socket is connected, the following python script is started before the C++ program:
Then, when the C++ program is executed:
The same program, but compiled without address sanitizer. Instead the stack trace is generated with GDB:
When bool crash = false; is used in main(), the program works as expected and the python script fills the terminal with b'foo's.