zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

Thread Sanitizer reports data race for epoll_t in - Potential false positive? #4590

Closed Degoah closed 10 months ago

Degoah commented 10 months ago

Issue description

TSAN sporadically reports a data race in the ZeroMQ calls made through the AZMQ binding.

It seems to involve a location that is read by epoll_ctl (called from zmq::epoll_t::add_fd) after being written by socket (called from zmq::open_socket). Please see the output below for more details.

The question is whether this is a false positive reported by GCC's thread sanitizer or an actual data race in libzmq/azmq.

Environment

Minimal test code / Steps to reproduce the issue

TSAN does not report the data race on every run. It occurs sporadically, so the example has to be run several times before TSAN reports the issue.

#include <boost/asio.hpp>
#include <azmq/socket.hpp>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

int main()
{
    boost::asio::io_service io_service;

    azmq::pull_socket source_socket_in(io_service, true);
    azmq::push_socket source_socket_out(io_service, true);
    source_socket_in.connect("tcp://127.0.0.1:5551");
    source_socket_out.bind("tcp://127.0.0.1:5552");

    azmq::pull_socket sink_socket_in(io_service, true);
    azmq::push_socket sink_socket_out(io_service, true);
    sink_socket_in.connect("tcp://127.0.0.1:5552");
    sink_socket_out.bind("tcp://127.0.0.1:5551");

    // Publish messages
    std::vector<std::uint8_t> tx_data{1, 2};
    std::this_thread::sleep_for(std::chrono::milliseconds{150});

    source_socket_out.async_send(boost::asio::buffer(tx_data),
                                 [](const boost::system::error_code &error, size_t bytes_sent)
                                 {
                                     if (!error) {
                                         std::cout << "source_socket_out: number of bytes sent: " << bytes_sent << std::endl;
                                     } else {
                                         std::cout << "source_socket_out: " << error.message() << std::endl;
                                     }
                                 });

    auto t1 = std::thread([&sink_socket_in]
                          {
                              static std::vector<std::uint8_t> buffer(1500);
                              boost::system::error_code ec;
                              auto bytes_received = sink_socket_in.receive(boost::asio::buffer(buffer), 0, ec);

                              if (!ec) {
                                  std::cout << std::endl;
                                  std::cout << "sink_socket_in: number of bytes received: " << bytes_received << std::endl;
                                  std::vector<std::uint8_t> tx_buffer{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 18, 17};
                              } else {
                                  std::cout << "sink_socket_in: error: " << ec.message() << std::endl;
                              }
                          });

    io_service.run();
    t1.join();
    return 0;
}

What's the actual result? (include assertion message & call stack if applicable)

Data race (pid=29994)
Read of size 8 at 0x7ba000000100 by thread T2:
0x7f3e29854aa8 epoll_ctl 
0x7f3e2a434bbe zmq::epoll_t::add_fd 
Previous write of size 8 at 0x7ba000000100 by main thread:
0x7f3e29838681 socket 
0x7f3e2a4365d9 zmq::open_socket 
0x565315699a50 azmq::detail::socket_service::bind socket_service.hpp:366 
0x56531569bd05 azmq::socket::bind socket.hpp:156 
0x56531569bdda azmq::socket::bind socket.hpp:166 
0x565315686488 main main_mutex.cxx:68 
Location is file descriptor 16 created by main thread at:
0x7f3e29838681 socket 
0x7f3e2a4365d9 zmq::open_socket 
0x565315699a50 azmq::detail::socket_service::bind socket_service.hpp:366 
0x56531569bd05 azmq::socket::bind socket.hpp:156 
0x56531569bdda azmq::socket::bind socket.hpp:166 
0x565315686488 main main_mutex.cxx:68 
Thread T2 (tid=29997, running) created by main thread at:
0x7f3e2985f669 pthread_create 
0x7f3e2a4657f9 zmq::thread_t::start 
0x56531569902f azmq::detail::socket_service::per_descriptor_data::do_open socket_service.hpp:102 
0x565315699937 azmq::detail::socket_service::do_open socket_service.hpp:235 
0x56531569bbfd azmq::socket::socket socket.hpp:112 
0x5653156a2976 azmq::detail::specialized_socket::specialized_socket socket.hpp:699 
0x5653156862c0 main main_mutex.cxx:60 

What's the expected result?

No data race.

bluca commented 10 months ago

These are false positives. These sanitizers and Valgrind generally do not understand the very custom thread model we have here.
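
If the reports are accepted as false positives, one way to keep them out of test runs might be a TSAN suppressions file. This is a minimal sketch, not something proposed in this thread: the file name tsan.supp is hypothetical, and the entry simply matches the zmq::epoll_t::add_fd frame shown in the report above, which assumes the stack is symbolized.

```
# tsan.supp (hypothetical file name)
# Suppress data race reports whose stack contains this frame,
# taken from the TSAN report in this issue.
race:zmq::epoll_t::add_fd
```

The file would be passed at run time through the TSAN_OPTIONS environment variable, for example TSAN_OPTIONS="suppressions=tsan.supp" followed by the test binary. A suppression this narrow hides only reports involving that frame, so races in application code would still be reported.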