apache / datasketches-cpp

Core C++ Sketch Library
https://datasketches.apache.org
Apache License 2.0
225 stars 71 forks source link

make random utils thread safe #347

Closed FluorineDog closed 1 year ago

FluorineDog commented 1 year ago

when using kll_sketch in parallel, it will use the same random engine, making TSAN complain:

WARNING: ThreadSanitizer: data race (pid=1760545)
  Read of size 8 at 0x000032e79000 by thread T1097:
    #0 std::__1::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>::operator()() ~/ByConity/contrib/libcxx/include/random:2394:25 (clickhouse+0x1eb2bdb9) (BuildId: 76fc9d59c0dfa515)
    #1 std::__1::independent_bits_engine<std::__1::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>, 1ul, unsigned int>::__eval(std::__1::integral_constant<bool, true>) ~/ByConity/contrib/libcxx/include/random:3192:19 (clickhouse+0x24db3e8c) (BuildId: 76fc9d59c0dfa515)
    #2 std::__1::independent_bits_engine<std::__1::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>, 1ul, unsigned int>::operator()() ~/ByConity/contrib/libcxx/include/random:3112:38 (clickhouse+0x24db3e05) (BuildId: 76fc9d59c0dfa515)
    #3 void datasketches::kll_helper::randomly_halve_up<unsigned long>(unsigned long*, unsigned int, unsigned int) ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_helper_impl.hpp:118:27 (clickhouse+0x24dc73d0) (BuildId: 76fc9d59c0dfa515)
    #4 datasketches::kll_sketch<unsigned long, std::__1::less<unsigned long>, datasketches::serde<unsigned long, void>, std::__1::allocator<unsigned long>>::compress_while_updating() ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_sketch_impl.hpp:728:5 (clickhouse+0x24dc6659) (BuildId: 76fc9d59c0dfa515)
    #5 datasketches::kll_sketch<unsigned long, std::__1::less<unsigned long>, datasketches::serde<unsigned long, void>, std::__1::allocator<unsigned long>>::internal_update() ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_sketch_impl.hpp:200:24 (clickhouse+0x24dc6353) (BuildId: 76fc9d59c0dfa515)
    #6 void datasketches::kll_sketch<unsigned long, std::__1::less<unsigned long>, datasketches::serde<unsigned long, void>, std::__1::allocator<unsigned long>>::update<unsigned long&>(unsigned long&) ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_sketch_impl.hpp:183:26 (clickhouse+0x24dc6011) (BuildId: 76fc9d59c0dfa515)
    #7 DB::Statistics::StatsKllSketchImpl<unsigned long>::update(unsigned long) ~/ByConity/src/Statistics/StatsKllSketchImpl.h:134:18 (clickhouse+0x24dc5f66) (BuildId: 76fc9d59c0dfa515)

  Previous write of size 8 at 0x000032e79000 by thread T1127:
    #0 std::__1::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>::operator()() ~/ByConity/contrib/libcxx/include/random:2401:10 (clickhouse+0x1eb2bfc1) (BuildId: 76fc9d59c0dfa515)
    #1 std::__1::independent_bits_engine<std::__1::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>, 1ul, unsigned int>::__eval(std::__1::integral_constant<bool, true>) ~/ByConity/contrib/libcxx/include/random:3192:19 (clickhouse+0x24db3e8c) (BuildId: 76fc9d59c0dfa515)
    #2 std::__1::independent_bits_engine<std::__1::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>, 1ul, unsigned int>::operator()() ~/ByConity/contrib/libcxx/include/random:3112:38 (clickhouse+0x24db3e05) (BuildId: 76fc9d59c0dfa515)
    #3 void datasketches::kll_helper::randomly_halve_down<unsigned long>(unsigned long*, unsigned int, unsigned int) ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_helper_impl.hpp:102:27 (clickhouse+0x24dc7590) (BuildId: 76fc9d59c0dfa515)
    #4 datasketches::kll_sketch<unsigned long, std::__1::less<unsigned long>, datasketches::serde<unsigned long, void>, std::__1::allocator<unsigned long>>::compress_while_updating() ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_sketch_impl.hpp:730:5 (clickhouse+0x24dc6683) (BuildId: 76fc9d59c0dfa515)
    #5 datasketches::kll_sketch<unsigned long, std::__1::less<unsigned long>, datasketches::serde<unsigned long, void>, std::__1::allocator<unsigned long>>::internal_update() ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_sketch_impl.hpp:200:24 (clickhouse+0x24dc6353) (BuildId: 76fc9d59c0dfa515)
    #6 void datasketches::kll_sketch<unsigned long, std::__1::less<unsigned long>, datasketches::serde<unsigned long, void>, std::__1::allocator<unsigned long>>::update<unsigned long&>(unsigned long&) ~/ByConity/tsan/datasketches-prefix/include/DataSketches/kll_sketch_impl.hpp:183:26 (clickhouse+0x24dc6011) (BuildId: 76fc9d59c0dfa515)
    #7 DB::Statistics::StatsKllSketchImpl<unsigned long>::update(unsigned long) ~/ByConity/src/Statistics/StatsKllSketchImpl.h:134:18 (clickhouse+0x24dc5f66) (BuildId: 76fc9d59c0dfa515)

This patch will make the random_utils thread-safe.

FluorineDog commented 1 year ago

Sorry to be missing for a week, too busy to open github.com Code is updated, plz review and approve ci.

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 4404785640


Totals Coverage Status
Change from base Build 4369952580: 0.0%
Covered Lines: 4898
Relevant Lines: 5043

💛 - Coveralls