luceneplusplus / LucenePlusPlus

Lucene++ is an up to date C++ port of the popular Java Lucene library, a high-performance, full-featured text search engine.
luceneplusplus@googlegroups.com

Uninitialized AttributeFactory instance #181

Closed kmatheussen closed 2 years ago

kmatheussen commented 2 years ago

Hi, thanks for the work on lucene++!

It's working great, except for a problem I have with something that looks like an uninitialized AttributeFactory variable in an AttributeSource instance.

Unfortunately it's not reliably reproducible. It happens maybe one time in 100; I'm not sure.

It seems to crash at the same place every time.

It happens when I'm calling 'Lucene::newLucene()', and I call this function from many threads at the same time.

Here's the final place it's crashing: [image]

Here's the place right before the final crash: [image] (factory->pn.ptr==NULL. factory->px looks like a real pointer, but the memory it's pointing to looks uninitialized: all bytes are filled with 0xdd.)

And here's a screenshot of the backtrace (I'm not very familiar with Visual Studio, so I don't know how to get a proper text block): [image]

As a workaround I'm going to try adding a check so that lucene won't call factory->createInstance() if factory is uninitialized (plus an assertion of course). But is there anything else I can do to track down what goes wrong?

Thanks for your work.
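
A side note on the 0xdd bytes mentioned above, offered as a hint and not a diagnosis: assuming the screenshots come from a build that uses the MSVC debug heap, that heap fills freed blocks with 0xDD and freshly allocated, not yet written blocks with 0xCD (with 0xFD guard bytes around each allocation). Under that assumption the pattern would point to an object that had already been freed, not one that was never initialized. The helper below is a hypothetical illustration of reading those fill bytes; `describeFill` is not part of Lucene++ or the CRT.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps a byte seen in a debugger memory view to the
// meaning it has under the MSVC debug heap (an assumption about the build).
std::string describeFill(std::uint8_t byte) {
    switch (byte) {
        case 0xCD: return "allocated but never written";
        case 0xDD: return "freed";
        case 0xFD: return "guard bytes around an allocation";
        default:   return "no known debug fill pattern";
    }
}
```

If that reading applies here, the thing to look for is who released the object while another thread still held a pointer to it.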

Mythicsoft commented 2 years ago

Good catch. We've seen this as well with NumericField, but only ever in Release builds.

kmatheussen commented 2 years ago

I've made this change to Lucene now, and recompiled:

kjetil@ubuntu:/mnt/kjetil/LucenePlusPlus-rel_3.0.8/include/lucene++$ diff -u ~/LucenePlusPlus/include/lucene++/AttributeSource.h AttributeSource.h
--- /home/kjetil/LucenePlusPlus/include/lucene++/AttributeSource.h  2021-10-28 02:27:57.000000000 -0700
+++ AttributeSource.h   2021-11-26 05:42:48.000000000 -0800
@@ -72,11 +72,15 @@
         String className(ATTR::_getClassName());
         boost::shared_ptr<ATTR> attrImpl(boost::dynamic_pointer_cast<ATTR>(getAttribute(className)));
         if (!attrImpl) {
-            attrImpl = boost::dynamic_pointer_cast<ATTR>(factory->createInstance<ATTR>(className));
-            if (!attrImpl) {
-                boost::throw_exception(IllegalArgumentException(L"Could not instantiate implementing class for " + className));
-            }
-            addAttribute(className, attrImpl);
+           if (factory.get() == NULL){
+               boost::throw_exception(SomethingIsSeriouslyWrongException(L"Could not instantiate implementing class for " + className + L" (factory not initialized)"));
+           } else {
+               attrImpl = boost::dynamic_pointer_cast<ATTR>(factory->createInstance<ATTR>(className));
+               if (!attrImpl) {
+                   boost::throw_exception(IllegalArgumentException(L"Could not instantiate implementing class for " + className));
+               }
+               addAttribute(className, attrImpl);
+           }
         }
         return attrImpl;
     }

kmatheussen commented 2 years ago

Added call to abort() in _DEBUG mode as well:

--- /home/kjetil/LucenePlusPlus/include/lucene++/AttributeSource.h  2021-10-28 02:27:57.000000000 -0700
+++ AttributeSource.h   2021-11-26 05:52:57.000000000 -0800
@@ -72,11 +72,18 @@
         String className(ATTR::_getClassName());
         boost::shared_ptr<ATTR> attrImpl(boost::dynamic_pointer_cast<ATTR>(getAttribute(className)));
         if (!attrImpl) {
-            attrImpl = boost::dynamic_pointer_cast<ATTR>(factory->createInstance<ATTR>(className));
-            if (!attrImpl) {
-                boost::throw_exception(IllegalArgumentException(L"Could not instantiate implementing class for " + className));
-            }
-            addAttribute(className, attrImpl);
+           if (factory.get() == NULL){
+#if _DEBUG
+               abort();
+#endif
+               boost::throw_exception(SomethingIsSeriouslyWrongException(L"Could not instantiate implementing class for " + className + L" (factory not initialized)"));
+           } else {
+               attrImpl = boost::dynamic_pointer_cast<ATTR>(factory->createInstance<ATTR>(className));
+               if (!attrImpl) {
+                   boost::throw_exception(IllegalArgumentException(L"Could not instantiate implementing class for " + className));
+               }
+               addAttribute(className, attrImpl);
+           }
         }
         return attrImpl;
     }

Mythicsoft commented 2 years ago

Could the reason be that the AttributeSource constructor calls DEFAULT_ATTRIBUTE_FACTORY(), which lazily creates the factory in a local static variable without any mutex synchronization? If multiple threads tried to initialize the factory at the same time, you could end up with some weird data.
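
A minimal sketch of the pattern described above, using hypothetical stand-in names (`Factory`, `racyDefaultFactory`, `safeDefaultFactory`) and not the actual Lucene++ source. The first function shows an unsynchronized check-then-assign on a local static; the second shows one possible fix, relying on the C++11 guarantee that a function-local static is initialized exactly once even under concurrent calls. Older compilers, or builds that disable thread-safe statics, would need a mutex or `std::call_once` instead.

```cpp
#include <memory>

// Hypothetical stand-in for Lucene::AttributeFactory; not the real class.
struct Factory {
    int id = 42;
};

// Racy pattern: the empty static pointer is set up safely, but the
// check-then-assign below is unsynchronized. Two threads can both see an
// empty pointer and both assign, while a third copies a half-written value.
std::shared_ptr<Factory> racyDefaultFactory() {
    static std::shared_ptr<Factory> instance;
    if (!instance) {
        instance = std::make_shared<Factory>();
    }
    return instance;
}

// One possible fix: let the static's own initializer build the object, so
// the language runtime serializes the one-time construction.
std::shared_ptr<Factory> safeDefaultFactory() {
    static const std::shared_ptr<Factory> instance = std::make_shared<Factory>();
    return instance;
}
```

Called from a single thread both functions behave the same; the difference only shows up when the first calls arrive concurrently.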

kmatheussen commented 2 years ago

Yes, you seem to be correct about that. Good catch!

kmatheussen commented 2 years ago

I've added "run lucene test programs under tsan" to my TODO list. Maybe there are more bugs like this. tsan should have spotted that bug immediately.
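
As a rough illustration of what such a check could look like, the sketch below calls a lazily initialized getter from several threads at once. Built with GCC or Clang using `-fsanitize=thread -g`, a harness of this shape is the kind of program in which ThreadSanitizer can report an unsynchronized initialization. All names (`Factory`, `defaultFactory`, `hammer`) are hypothetical, and the getter here is guarded by `std::call_once`, so this particular version is expected to be race-free.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

struct Factory {};  // hypothetical stand-in, not Lucene::AttributeFactory

// Lazy initialization guarded by std::call_once: exactly one thread runs
// the lambda, and every other caller waits until it has finished.
std::shared_ptr<Factory> defaultFactory() {
    static std::once_flag once;
    static std::shared_ptr<Factory> instance;
    std::call_once(once, [] { instance = std::make_shared<Factory>(); });
    return instance;
}

// Call the getter from `threadCount` threads at once and report whether
// every thread observed the same non-null object.
bool hammer(std::size_t threadCount) {
    std::vector<const Factory*> seen(threadCount, nullptr);
    std::vector<std::thread> threads;
    threads.reserve(threadCount);
    for (std::size_t i = 0; i < threadCount; ++i) {
        // Each thread writes only its own slot, so `seen` needs no lock.
        threads.emplace_back([&seen, i] { seen[i] = defaultFactory().get(); });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    for (const Factory* p : seen) {
        if (p == nullptr || p != seen[0]) {
            return false;
        }
    }
    return true;
}
```

Swapping the guarded getter for an unguarded check-then-assign is a quick way to confirm that the sanitizer build actually reports the race.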

kmatheussen commented 2 years ago

tsan flags this problem, but nothing else as far as I can see:

kjetil@ubuntu:~/LucenePlusPlus/src/demo$ indexfiles/indexfiles . /tmp/luceneindex/
Indexing to directory: /tmp/luceneindex/...
Adding [1]: main.cpp
Adding [2]: Makefile
Adding [3]: CMakeLists.txt
==================
WARNING: ThreadSanitizer: data race (pid=18722)
  Read of size 8 at 0x7f9c4f299910 by thread T2:
    #0 Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:25 (liblucene++.so.0+0x8e42e4)
    #1 Lucene::AttributeSource::AttributeSource() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:34 (liblucene++.so.0+0x8e892b)
    #2 Lucene::TokenStream::TokenStream() /home/kjetil/LucenePlusPlus/src/core/analysis/TokenStream.cpp:12 (liblucene++.so.0+0x25bb1e)
    #3 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:26 (liblucene++.so.0+0x22cb40)
    #4 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #5 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #6 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #7 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #8 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #9 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #10 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #11 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #12 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #13 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #14 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #15 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #16 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #17 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #18 <null> <null> (libstdc++.so.6+0xd6de3)

  Previous write of size 8 at 0x7f9c4f299910 by thread T1:
    #0 std::enable_if<std::__and_<std::__not_<std::__is_tuple_like<Lucene::AttributeFactory*> >, std::is_move_constructible<Lucene::AttributeFactory*>, std::is_move_assignable<Lucene::AttributeFactory*> >::value, void>::type std::swap<Lucene::AttributeFactory*>(Lucene::AttributeFactory*&, Lucene::AttributeFactory*&) /usr/include/c++/9/bits/move.h:195 (liblucene++.so.0+0x8e4429)
    #1 boost::shared_ptr<Lucene::AttributeFactory>::swap(boost::shared_ptr<Lucene::AttributeFactory>&) /usr/include/boost/smart_ptr/shared_ptr.hpp:766 (liblucene++.so.0+0x8e4429)
    #2 boost::shared_ptr<Lucene::AttributeFactory>& boost::shared_ptr<Lucene::AttributeFactory>::operator=<Lucene::DefaultAttributeFactory>(boost::shared_ptr<Lucene::DefaultAttributeFactory>&&) /usr/include/boost/smart_ptr/shared_ptr.hpp:667 (liblucene++.so.0+0x8e4429)
    #3 Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:26 (liblucene++.so.0+0x8e4429)
    #4 Lucene::AttributeSource::AttributeSource() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:34 (liblucene++.so.0+0x8e892b)
    #5 Lucene::TokenStream::TokenStream() /home/kjetil/LucenePlusPlus/src/core/analysis/TokenStream.cpp:12 (liblucene++.so.0+0x25bb1e)
    #6 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:26 (liblucene++.so.0+0x22cb40)
    #7 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #8 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #9 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #10 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #11 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #12 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #13 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #14 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #15 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #16 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #17 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #18 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #19 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #20 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #21 <null> <null> (libstdc++.so.6+0xd6de3)

  Location is global 'Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY()::_DEFAULT_ATTRIBUTE_FACTORY' of size 16 at 0x7f9c4f299910 (liblucene++.so.0+0x000000b2d910)

  Thread T2 (tid=18725, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

  Thread T1 (tid=18724, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

SUMMARY: ThreadSanitizer: data race /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:25 in Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY()
==================
==================
WARNING: ThreadSanitizer: data race (pid=18722)
  Read of size 8 at 0x7f9c4f299918 by thread T2:
    #0 boost::detail::shared_count::shared_count(boost::detail::shared_count const&) /usr/include/boost/smart_ptr/detail/shared_count.hpp:433 (liblucene++.so.0+0x8e430c)
    #1 boost::shared_ptr<Lucene::AttributeFactory>::shared_ptr(boost::shared_ptr<Lucene::AttributeFactory> const&) /usr/include/boost/smart_ptr/shared_ptr.hpp:422 (liblucene++.so.0+0x8e430c)
    #2 Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:29 (liblucene++.so.0+0x8e430c)
    #3 Lucene::AttributeSource::AttributeSource() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:34 (liblucene++.so.0+0x8e892b)
    #4 Lucene::TokenStream::TokenStream() /home/kjetil/LucenePlusPlus/src/core/analysis/TokenStream.cpp:12 (liblucene++.so.0+0x25bb1e)
    #5 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:26 (liblucene++.so.0+0x22cb40)
    #6 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #7 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #8 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #9 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #10 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #11 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #12 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #13 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #14 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #15 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #16 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #17 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #18 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #19 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #20 <null> <null> (libstdc++.so.6+0xd6de3)

  Previous write of size 8 at 0x7f9c4f299918 by thread T1:
    [failed to restore the stack]

  Location is global 'Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY()::_DEFAULT_ATTRIBUTE_FACTORY' of size 16 at 0x7f9c4f299910 (liblucene++.so.0+0x000000b2d918)

  Thread T2 (tid=18725, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

  Thread T1 (tid=18724, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

SUMMARY: ThreadSanitizer: data race /usr/include/boost/smart_ptr/detail/shared_count.hpp:433 in boost::detail::shared_count::shared_count(boost::detail::shared_count const&)
==================
==================
WARNING: ThreadSanitizer: data race (pid=18722)
  Atomic write of size 4 at 0x7b1800003068 by thread T2:
    #0 __tsan_atomic32_fetch_add ../../../../src/libsanitizer/tsan/tsan_interface_atomic.cpp:615 (libtsan.so.0+0x7f0a9)
    #1 std::__atomic_base<int>::fetch_add(int, std::memory_order) /usr/include/c++/9/bits/atomic_base.h:541 (liblucene++.so.0+0x8e4337)
    #2 boost::detail::atomic_increment(std::atomic<int>*) /usr/include/boost/smart_ptr/detail/sp_counted_base_std_atomic.hpp:32 (liblucene++.so.0+0x8e4337)
    #3 boost::detail::sp_counted_base::add_ref_copy() /usr/include/boost/smart_ptr/detail/sp_counted_base_std_atomic.hpp:100 (liblucene++.so.0+0x8e4337)
    #4 boost::detail::shared_count::shared_count(boost::detail::shared_count const&) /usr/include/boost/smart_ptr/detail/shared_count.hpp:438 (liblucene++.so.0+0x8e4337)
    #5 boost::shared_ptr<Lucene::AttributeFactory>::shared_ptr(boost::shared_ptr<Lucene::AttributeFactory> const&) /usr/include/boost/smart_ptr/shared_ptr.hpp:422 (liblucene++.so.0+0x8e4337)
    #6 Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:29 (liblucene++.so.0+0x8e4337)
    #7 Lucene::AttributeSource::AttributeSource() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:34 (liblucene++.so.0+0x8e892b)
    #8 Lucene::TokenStream::TokenStream() /home/kjetil/LucenePlusPlus/src/core/analysis/TokenStream.cpp:12 (liblucene++.so.0+0x25bb1e)
    #9 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:26 (liblucene++.so.0+0x22cb40)
    #10 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #11 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #12 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #13 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #14 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #15 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #16 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #17 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #18 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #19 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #20 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #21 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #22 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #23 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #24 <null> <null> (libstdc++.so.6+0xd6de3)

  Previous write of size 4 at 0x7b1800003068 by thread T1:
    [failed to restore the stack]

  Location is heap block of size 88 at 0x7b1800003060 allocated by thread T1:
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 boost::detail::shared_count::shared_count<Lucene::DefaultAttributeFactory*, boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> >(Lucene::DefaultAttributeFactory*, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> >) /usr/include/boost/smart_ptr/detail/shared_count.hpp:214 (liblucene++.so.0+0x8ece5b)
    #2 boost::shared_ptr<Lucene::DefaultAttributeFactory>::shared_ptr<Lucene::DefaultAttributeFactory, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> > >(Lucene::DefaultAttributeFactory*, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> >) /usr/include/boost/smart_ptr/shared_ptr.hpp:388 (liblucene++.so.0+0x8ece5b)
    #3 boost::detail::sp_if_not_array<Lucene::DefaultAttributeFactory>::type boost::make_shared<Lucene::DefaultAttributeFactory>() /usr/include/boost/smart_ptr/make_shared_object.hpp:250 (liblucene++.so.0+0x8ece5b)
    #4 boost::shared_ptr<Lucene::DefaultAttributeFactory> Lucene::newLucene<Lucene::DefaultAttributeFactory>() /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:110 (liblucene++.so.0+0x8e43bb)
    #5 Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:26 (liblucene++.so.0+0x8e43bb)
    #6 Lucene::AttributeSource::AttributeSource() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:34 (liblucene++.so.0+0x8e892b)
    #7 Lucene::TokenStream::TokenStream() /home/kjetil/LucenePlusPlus/src/core/analysis/TokenStream.cpp:12 (liblucene++.so.0+0x25bb1e)
    #8 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:26 (liblucene++.so.0+0x22cb40)
    #9 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #10 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #11 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #12 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #13 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #14 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #15 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #16 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #17 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #18 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #19 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #20 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #21 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #22 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #23 <null> <null> (libstdc++.so.6+0xd6de3)

  Thread T2 (tid=18725, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

  Thread T1 (tid=18724, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

SUMMARY: ThreadSanitizer: data race /usr/include/c++/9/bits/atomic_base.h:541 in std::__atomic_base<int>::fetch_add(int, std::memory_order)
==================
==================
WARNING: ThreadSanitizer: data race (pid=18722)
  Read of size 8 at 0x7b1800003080 by thread T2:
    #0 boost::shared_ptr<Lucene::Attribute> Lucene::AttributeFactory::createInstance<Lucene::TermAttribute>(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/include/lucene++/AttributeSource.h:29 (liblucene++.so.0+0x200c43)
    #1 boost::shared_ptr<Lucene::TermAttribute> Lucene::AttributeSource::addAttribute<Lucene::TermAttribute>() /home/kjetil/LucenePlusPlus/include/lucene++/AttributeSource.h:75 (liblucene++.so.0+0x200c43)
    #2 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:29 (liblucene++.so.0+0x22cc5c)
    #3 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #4 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #5 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #6 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #7 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #8 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #9 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #10 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #11 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #12 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #13 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #14 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #15 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #16 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #17 <null> <null> (libstdc++.so.6+0xd6de3)

  Previous write of size 8 at 0x7b1800003080 by thread T1:
    [failed to restore the stack]

  Location is heap block of size 88 at 0x7b1800003060 allocated by thread T1:
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 boost::detail::shared_count::shared_count<Lucene::DefaultAttributeFactory*, boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> >(Lucene::DefaultAttributeFactory*, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> >) /usr/include/boost/smart_ptr/detail/shared_count.hpp:214 (liblucene++.so.0+0x8ece5b)
    #2 boost::shared_ptr<Lucene::DefaultAttributeFactory>::shared_ptr<Lucene::DefaultAttributeFactory, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> > >(Lucene::DefaultAttributeFactory*, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<Lucene::DefaultAttributeFactory> >) /usr/include/boost/smart_ptr/shared_ptr.hpp:388 (liblucene++.so.0+0x8ece5b)
    #3 boost::detail::sp_if_not_array<Lucene::DefaultAttributeFactory>::type boost::make_shared<Lucene::DefaultAttributeFactory>() /usr/include/boost/smart_ptr/make_shared_object.hpp:250 (liblucene++.so.0+0x8ece5b)
    #4 boost::shared_ptr<Lucene::DefaultAttributeFactory> Lucene::newLucene<Lucene::DefaultAttributeFactory>() /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:110 (liblucene++.so.0+0x8e43bb)
    #5 Lucene::AttributeFactory::DEFAULT_ATTRIBUTE_FACTORY() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:26 (liblucene++.so.0+0x8e43bb)
    #6 Lucene::AttributeSource::AttributeSource() /home/kjetil/LucenePlusPlus/src/core/util/AttributeSource.cpp:34 (liblucene++.so.0+0x8e892b)
    #7 Lucene::TokenStream::TokenStream() /home/kjetil/LucenePlusPlus/src/core/analysis/TokenStream.cpp:12 (liblucene++.so.0+0x25bb1e)
    #8 Lucene::NumericTokenStream::NumericTokenStream(int) /home/kjetil/LucenePlusPlus/src/core/analysis/NumericTokenStream.cpp:26 (liblucene++.so.0+0x22cb40)
    #9 boost::detail::sp_if_not_array<Lucene::NumericTokenStream>::type boost::make_shared<Lucene::NumericTokenStream, int const&>(int const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (liblucene++.so.0+0x2e2e3d)
    #10 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newInstance<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x2e1a14)
    #11 boost::shared_ptr<Lucene::NumericTokenStream> Lucene::newLucene<Lucene::NumericTokenStream, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:118 (liblucene++.so.0+0x2e1a14)
    #12 Lucene::NumericField::NumericField(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, Lucene::AbstractField::Store, bool) /home/kjetil/LucenePlusPlus/src/core/document/NumericField.cpp:25 (liblucene++.so.0+0x2e1a14)
    #13 boost::detail::sp_if_not_array<Lucene::NumericField>::type boost::make_shared<Lucene::NumericField, wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /usr/include/boost/smart_ptr/make_shared_object.hpp:256 (indexfiles+0x1037f)
    #14 boost::shared_ptr<Lucene::NumericField> Lucene::newInstance<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:47 (indexfiles+0x833e)
    #15 boost::shared_ptr<Lucene::NumericField> Lucene::newLucene<Lucene::NumericField, wchar_t [16], Lucene::AbstractField::Store, Lucene::AbstractField::Index>(wchar_t const (&) [16], Lucene::AbstractField::Store const&, Lucene::AbstractField::Index const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:132 (indexfiles+0x833e)
    #16 fileDocument(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:39 (indexfiles+0x833e)
    #17 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:70 (indexfiles+0x8f29)
    #18 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x8f29)
    #19 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x8f29)
    #20 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x8f29)
    #21 operator() /usr/include/c++/9/thread:251 (indexfiles+0x8f29)
    #22 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x8f29)
    #23 <null> <null> (libstdc++.so.6+0xd6de3)

  Thread T2 (tid=18725, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

  Thread T1 (tid=18724, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 indexDocs(boost::shared_ptr<Lucene::IndexWriter> const&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:64 (indexfiles+0x94e6)
    #3 main /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:112 (indexfiles+0x7320)

SUMMARY: ThreadSanitizer: data race /home/kjetil/LucenePlusPlus/include/lucene++/AttributeSource.h:29 in boost::shared_ptr<Lucene::Attribute> Lucene::AttributeFactory::createInstance<Lucene::TermAttribute>(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&)
==================
Adding [4]: searchfiles
Adding [5]: searchfiles.vcxproj.filters
Adding [6]: searchfiles.vcproj
Adding [7]: searchfiles.vcxproj
Adding [8]: cmake_install.cmake
Adding [9]: progress.marks
kmatheussen commented 2 years ago

Patch to provoke the tsan hit:

diff --git a/src/demo/indexfiles/main.cpp b/src/demo/indexfiles/main.cpp
index e6911f4..9221cc1 100644
--- a/src/demo/indexfiles/main.cpp
+++ b/src/demo/indexfiles/main.cpp
@@ -34,6 +34,14 @@ DocumentPtr fileDocument(const String& docFile) {
     doc->add(newLucene<Field>(L"modified", DateTools::timeToString(FileUtils::fileModified(docFile), DateTools::RESOLUTION_MINUTE),
                               Field::STORE_YES, Field::INDEX_NOT_ANALYZED));

+    Lucene::NumericFieldPtr field = Lucene::newLucene<Lucene::NumericField>(L"modified_number",
+                                                                           Lucene::Field::STORE_YES,
+                                                                           Lucene::Field::INDEX_NOT_ANALYZED);
+
+    field->setLongValue(FileUtils::fileModified(docFile));
+    
+    doc->add(field);
+
     // Add the contents of the file to a field named "contents".  Specify a Reader, so that the text of the file is
     // tokenized and indexed, but not stored.  Note that FileReader expects the file to be in the system's default
     // encoding.  If that's not the case searching for special characters will fail.
@@ -48,19 +56,27 @@ void indexDocs(const IndexWriterPtr& writer, const String& sourceDir) {
         return;
     }

+    std::vector<std::thread> threads;
+    
     for (HashSet<String>::iterator dirFile = dirList.begin(); dirFile != dirList.end(); ++dirFile) {
-        String docFile(FileUtils::joinPath(sourceDir, *dirFile));
-        if (FileUtils::isDirectory(docFile)) {
-            indexDocs(writer, docFile);
-        } else {
-            std::wcout << L"Adding [" << ++docNumber << L"]: " << *dirFile << L"\n";
-
-            try {
-                writer->addDocument(fileDocument(docFile));
-            } catch (FileNotFoundException&) {
-            }
-        }
+           String docFile(FileUtils::joinPath(sourceDir, *dirFile));
+           if (FileUtils::isDirectory(docFile)) {
+                   indexDocs(writer, docFile);
+           } else {
+                   std::wcout << L"Adding [" << ++docNumber << L"]: " << *dirFile << L"\n";
+                   
+                   threads.push_back(std::thread([docFile, writer](){
+                                                         try {
+                                                                 writer->addDocument(fileDocument(docFile));
+                                                         } catch (FileNotFoundException&) {
+                                                         }
+                                                 }));
+           }
     }
+           
+
+    for(auto &thread : threads)
+           thread.join();
 }
kmatheussen commented 2 years ago

I've patched up the initial tsan hit with a mutex now.

But after that I get a huge number of other tsan hits instead, and I don't understand them. Here's the output:

Thread T3:

WARNING: ThreadSanitizer: data race (pid=30910)
  Read of size 8 at 0x7f1421058610 by thread T3:
    #0 Lucene::Array<wchar_t>::get() const /home/kjetil/LucenePlusPlus/include/lucene++/Array.h:84 (liblucene++.so.0+0x1e6b6c)
    #1 Lucene::StandardTokenizerImpl::ZZ_CMAP() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardTokenizerImpl.cpp:233 (liblucene++.so.0+0x24b98d)
    #2 Lucene::StandardTokenizerImpl::getNextToken() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardTokenizerImpl.cpp:431 (liblucene++.so.0+0x24cce2)
(...)

Thread T2:

  Previous write of size 8 at 0x7f1421058610 by thread T2:
    #0 Lucene::Array<wchar_t>::operator=(Lucene::Array<wchar_t>&&) /home/kjetil/LucenePlusPlus/include/lucene++/Array.h:47 (liblucene++.so.0+0x1e5ba3)
    #1 Lucene::StandardTokenizerImpl::ZZ_CMAP_INIT() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardTokenizerImpl.cpp:207 (liblucene++.so.0+0x24b86f)
    #2 decltype (((forward<void (*)()>)({parm#1}))()) boost::detail::invoke<void (*)()>(void (*&&)()) /usr/include/boost/thread/detail/invoke.hpp:134 (liblucene++.so.0+0x24d991)
    #3 void boost::call_once<void (&)()>(boost::once_flag&, void (&)()) /usr/include/boost/thread/pthread/once_atomic.hpp:127 (liblucene++.so.0+0x24d4b2)
    #4 Lucene::StandardTokenizerImpl::ZZ_CMAP() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardTokenizerImpl.cpp:232 (liblucene++.so.0+0x24b981)
    #5 Lucene::StandardTokenizerImpl::getNextToken() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardTokenizerImpl.cpp:431 (liblucene++.so.0+0x24cce2
(...)

The reason I don't understand this hit is that the code in question looks like this:

const wchar_t* StandardTokenizerImpl::ZZ_CMAP() {
    static boost::once_flag once = BOOST_ONCE_INIT;
    boost::call_once(once, ZZ_CMAP_INIT);
    return _ZZ_CMAP.get();
}

Isn't boost::call_once a blocking operation? I.e. shouldn't all subsequent threads wait until the first thread is finished calling ZZ_CMAP_INIT?
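The behaviour being asked about can be sketched with `std::call_once`, used here in place of `boost::call_once` so the example needs only the standard library. `initTable`, `getTable`, and `allThreadsSeeSameTable` are hypothetical names made up for illustration, not Lucene++ code. In this sketch the initializer runs once, and every caller returns only after it has completed:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the table that ZZ_CMAP_INIT builds.
std::vector<wchar_t> table;
int initCalls = 0; // modified only inside the once-callback

void initTable() {
    ++initCalls;
    table.assign(256, L'x');
}

// Same shape as ZZ_CMAP(): initialize once, then read.
const wchar_t* getTable() {
    static std::once_flag once;
    std::call_once(once, initTable); // late callers block until initTable returns
    assert(!table.empty());
    return table.data();
}

// Calls getTable() from several threads and reports whether all of them
// observed the same, already built table.
bool allThreadsSeeSameTable(std::size_t threadCount) {
    std::vector<const wchar_t*> seen(threadCount, nullptr);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < threadCount; ++i) {
        // Each thread writes only to its own slot of the vector.
        threads.emplace_back([&seen, i] { seen[i] = getTable(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    for (const wchar_t* p : seen) {
        if (p == nullptr || p != seen[0]) {
            return false;
        }
    }
    return true;
}
```

If a sanitizer still reports a race on code of this shape, one thing worth checking is whether the library that implements the once-primitive was itself built with instrumentation.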

Mythicsoft commented 2 years ago

I agree, it should be thread safe. From the Boost docs:

"Calls to call_once on the same once_flag object are serialized."

kmatheussen commented 2 years ago

I didn't see any .cpp boost files in the backtrace though, but I guess the reason is that my boost is not compiled with tsan. I'm compiling up boost with tsan now.

kmatheussen commented 2 years ago

Yes, compiling libboost with tsan fixed it. Afterwards there was only one tsan hit, in this function:

const Collection<String> StandardTokenizer::TOKEN_TYPES() {
    static Collection<String> _TOKEN_TYPES;
    if (!_TOKEN_TYPES) {
        _TOKEN_TYPES = newCollection<String>(
                           L"<ALPHANUM>",
                           L"<APOSTROPHE>",
                           L"<ACRONYM>",
                           L"<COMPANY>",
                           L"<EMAIL>",
                           L"<HOST>",
                           L"<NUM>",
                           L"<CJ>",
                           L"<ACRONYM_DEP>"
                       );
    }
    return _TOKEN_TYPES;
}

After temporarily patching that function up with a mutex, I don't get any other tsan hits.
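A temporary mutex patch of this kind might look roughly like the following. This is only a sketch: `TokenTypes` and `tokenTypes` are hypothetical stand-ins for `Collection<String>` and `TOKEN_TYPES()`, the list is shortened, and the actual patch is not shown in this thread.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for Lucene::Collection<String>: a shared handle
// that is null until the list has been built.
using TokenTypes = std::shared_ptr<std::vector<std::wstring>>;

TokenTypes tokenTypes() {
    static std::mutex guard;
    static TokenTypes types;

    // The lock covers both the null check and the assignment, so no thread
    // can read the handle while another thread is still assigning it.
    std::lock_guard<std::mutex> lock(guard);
    if (!types) {
        types = std::make_shared<std::vector<std::wstring>>(
            std::vector<std::wstring>{L"<ALPHANUM>", L"<APOSTROPHE>", L"<ACRONYM>"});
    }
    assert(types);
    return types; // the copy is made while the lock is still held
}
```

The cost of this approach is that every call takes the lock, including the calls made long after initialization has finished.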

kmatheussen commented 2 years ago

After tweaking the indexer demo program a little bit, I managed to provoke another tsan hit:

Thread T2:

WARNING: ThreadSanitizer: data race (pid=85741)
  Write of size 8 at 0x7b100000cdc0 by thread T2:
    #0 operator delete(void*) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:126 (libtsan.so.0+0x8b2c8)
    #1 std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::_M_assign(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) <null> (libstdc++.so.6+0x15aa36)
    #2 Lucene::StandardFilter::incrementToken() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardFilter.cpp:48 (liblucene++.so.0+0x23bfa1)
    #3 Lucene::LowerCaseFilter::incrementToken() /home/kjetil/LucenePlusPlus/src/core/analysis/LowerCaseFilter.cpp:22 (liblucene++.so.0+0x1fbef3)
    #4 Lucene::StopFilter::incrementToken() /home/kjetil/LucenePlusPlus/src/core/analysis/StopFilter.cpp:39 (liblucene++.so.0+0x21bc14)
    #5 Lucene::DocInverterPerField::processFields(Lucene::Collection<boost::shared_ptr<Lucene::Fieldable> >, int) /home/kjetil/LucenePlusPlus/src/core/index/DocInverterPerField.cpp:125 (liblucene++.so.0+0x3734de)
    #6 Lucene::DocFieldProcessorPerThread::processDocument() /home/kjetil/LucenePlusPlus/src/core/index/DocFieldProcessorPerThread.cpp:219 (liblucene++.so.0+0x355636)
    #7 Lucene::DocumentsWriter::updateDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&, boost::shared_ptr<Lucene::Term> const&) /home/kjetil/LucenePlusPlus/src/core/index/DocumentsWriter.cpp:668 (liblucene++.so.0+0x37f95f)
    #8 Lucene::DocumentsWriter::addDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&) /home/kjetil/LucenePlusPlus/src/core/index/DocumentsWriter.cpp:647 (liblucene++.so.0+0x37f717)
    #9 Lucene::IndexWriter::addDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&) /home/kjetil/LucenePlusPlus/src/core/index/IndexWriter.cpp:912 (liblucene++.so.0+0x47a025)
    #10 Lucene::IndexWriter::addDocument(boost::shared_ptr<Lucene::Document> const&) /home/kjetil/LucenePlusPlus/src/core/index/IndexWriter.cpp:902 (liblucene++.so.0+0x479f30)
    #11 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:72 (indexfiles+0x80e0)
    #12 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x97bd)
    #13 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x9710)
    #14 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x9660)
    #15 operator() /usr/include/c++/9/thread:251 (indexfiles+0x95f1)
    #16 _M_run /usr/include/c++/9/thread:195 (ind

Thread T6:

  Previous write of size 8 at 0x7b100000cdc0 by thread T6:
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::_M_assign(std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) <null> (libstdc++.so.6+0x15aa26)
    #2 Lucene::StandardFilter::incrementToken() /home/kjetil/LucenePlusPlus/src/core/analysis/standard/StandardFilter.cpp:48 (liblucene++.so.0+0x23bfa1)
    #3 Lucene::LowerCaseFilter::incrementToken() /home/kjetil/LucenePlusPlus/src/core/analysis/LowerCaseFilter.cpp:22 (liblucene++.so.0+0x1fbef3)
    #4 Lucene::StopFilter::incrementToken() /home/kjetil/LucenePlusPlus/src/core/analysis/StopFilter.cpp:39 (liblucene++.so.0+0x21bc14)
    #5 Lucene::DocInverterPerField::processFields(Lucene::Collection<boost::shared_ptr<Lucene::Fieldable> >, int) /home/kjetil/LucenePlusPlus/src/core/index/DocInverterPerField.cpp:125 (liblucene++.so.0+0x3734de)
    #6 Lucene::DocFieldProcessorPerThread::processDocument() /home/kjetil/LucenePlusPlus/src/core/index/DocFieldProcessorPerThread.cpp:219 (liblucene++.so.0+0x355636)
    #7 Lucene::DocumentsWriter::updateDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&, boost::shared_ptr<Lucene::Term> const&) /home/kjetil/LucenePlusPlus/src/core/index/DocumentsWriter.cpp:668 (liblucene++.so.0+0x37f95f)
    #8 Lucene::DocumentsWriter::addDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&) /home/kjetil/LucenePlusPlus/src/core/index/DocumentsWriter.cpp:647 (liblucene++.so.0+0x37f717)
    #9 Lucene::IndexWriter::addDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&) /home/kjetil/LucenePlusPlus/src/core/index/IndexWriter.cpp:912 (liblucene++.so.0+0x47a025)
    #10 Lucene::IndexWriter::addDocument(boost::shared_ptr<Lucene::Document> const&) /home/kjetil/LucenePlusPlus/src/core/index/IndexWriter.cpp:902 (liblucene++.so.0+0x479f30)
    #11 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:72 (indexfiles+0x80e0)
    #12 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x97bd)
    #13 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x9710)
    #14 _M_invoke<0> /usr/include/c++/9/thread:244 (indexfiles+0x9660)
    #15 operator() /usr/include/c++/9/thread:251 (indexfiles+0x95f1)
    #16 _M_run /usr/include/c++/9/thread:195 (indexfiles+0x9598)
    #17 <null> <null> (libstdc++.so.6+0xd6de3)

Additional information:

  Location is heap block of size 56 at 0x7b100000cdc0 allocated by thread T1:
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 boost::detail::shared_count::shared_count<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > >*, boost::detail::sp_ms_deleter<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > > >(std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > >*, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > > >) /home/kjetil/site/include/boost/smart_ptr/detail/shared_count.hpp:219 (liblucene++.so.0+0x5af635)
    #2 boost::shared_ptr<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > >::shared_ptr<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > >, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > > > >(std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > >*, boost::detail::sp_inplace_tag<boost::detail::sp_ms_deleter<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > > >) /home/kjetil/site/include/boost/smart_ptr/shared_ptr.hpp:384 (liblucene++.so.0+0x5add68)
    #3 boost::detail::sp_if_not_array<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > >::type boost::make_shared<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > >, int const&>(int const&) /home/kjetil/site/include/boost/smart_ptr/make_shared_object.hpp:250 (liblucene++.so.0+0x5ac55a)
    #4 boost::shared_ptr<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > > > Lucene::newInstance<std::vector<boost::shared_ptr<Lucene::RawPostingList>, std::allocator<boost::shared_ptr<Lucene::RawPostingList> > >, int>(int const&) /home/kjetil/LucenePlusPlus/include/lucene++/LuceneFactory.h:29 (liblucene++.so.0+0x5ab6c7)
    #5 Lucene::Collection<boost::shared_ptr<Lucene::RawPostingList> >::newInstance(int) /home/kjetil/LucenePlusPlus/include/lucene++/Collection.h:35 (liblucene++.so.0+0x5aa18b)
    #6 Lucene::TermsHashPerField::rehashPostings(int) /home/kjetil/LucenePlusPlus/src/core/index/TermsHashPerField.cpp:456 (liblucene++.so.0+0x5b9726)
    #7 Lucene::TermsHashPerField::add() /home/kjetil/LucenePlusPlus/src/core/index/TermsHashPerField.cpp:379 (liblucene++.so.0+0x5b8bc9)
    #8 Lucene::DocInverterPerField::processFields(Lucene::Collection<boost::shared_ptr<Lucene::Fieldable> >, int) /home/kjetil/LucenePlusPlus/src/core/index/DocInverterPerField.cpp:157 (liblucene++.so.0+0x373744)
    #9 Lucene::DocFieldProcessorPerThread::processDocument() /home/kjetil/LucenePlusPlus/src/core/index/DocFieldProcessorPerThread.cpp:219 (liblucene++.so.0+0x355636)
    #10 Lucene::DocumentsWriter::updateDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&, boost::shared_ptr<Lucene::Term> const&) /home/kjetil/LucenePlusPlus/src/core/index/DocumentsWriter.cpp:668 (liblucene++.so.0+0x37f95f)
    #11 Lucene::DocumentsWriter::addDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&) /home/kjetil/LucenePlusPlus/src/core/index/DocumentsWriter.cpp:647 (liblucene++.so.0+0x37f717)
    #12 Lucene::IndexWriter::addDocument(boost::shared_ptr<Lucene::Document> const&, boost::shared_ptr<Lucene::Analyzer> const&) /home/kjetil/LucenePlusPlus/src/core/index/IndexWriter.cpp:912 (liblucene++.so.0+0x47a025)
    #13 Lucene::IndexWriter::addDocument(boost::shared_ptr<Lucene::Document> const&) /home/kjetil/LucenePlusPlus/src/core/index/IndexWriter.cpp:902 (liblucene++.so.0+0x479f30)
    #14 operator() /home/kjetil/LucenePlusPlus/src/demo/indexfiles/main.cpp:72 (indexfiles+0x80e0)
    #15 __invoke_impl<void, indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:60 (indexfiles+0x97bd)
    #16 __invoke<indexDocs(const IndexWriterPtr&, const String&)::<lambda()> > /usr/include/c++/9/bits/invoke.h:95 (indexfiles+0x9710)
    #17 _M_invoke<0> /usr/include/c++/9/thread

This one I don't understand: the code is not immediately clear, and it doesn't always happen either. But adding a mutex at least seems to stop the tsan hits.

kmatheussen commented 2 years ago

(I added a mutex to the top of StandardFilter::incrementToken.)

kmatheussen commented 2 years ago

Sorry, I should have looked at the code 1 minute more. The race condition happens in these two functions, which are not thread safe:

const String& StandardFilter::APOSTROPHE_TYPE() {
    static String _APOSTROPHE_TYPE;
    if (_APOSTROPHE_TYPE.empty()) {
        _APOSTROPHE_TYPE = StandardTokenizer::TOKEN_TYPES()[StandardTokenizer::APOSTROPHE];
    }
    return _APOSTROPHE_TYPE;
}

const String& StandardFilter::ACRONYM_TYPE() {
    static String _ACRONYM_TYPE;
    if (_ACRONYM_TYPE.empty()) {
        _ACRONYM_TYPE = StandardTokenizer::TOKEN_TYPES()[StandardTokenizer::ACRONYM];
    }
    return _ACRONYM_TYPE;
}
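One possible way to avoid the check-then-assign pattern is to initialize the function-local static at its declaration, which C++11 guarantees happens exactly once even under concurrent calls. The sketch below assumes plain `std::wstring` and `std::vector` in place of Lucene's `String` and `Collection`, and all names in it are hypothetical. It is not necessarily what the eventual fix does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for StandardTokenizer::TOKEN_TYPES() and its indices.
enum TokenIndex { ALPHANUM = 0, APOSTROPHE = 1, ACRONYM = 2 };

const std::vector<std::wstring>& tokenTypeNames() {
    static const std::vector<std::wstring> names{
        L"<ALPHANUM>", L"<APOSTROPHE>", L"<ACRONYM>"};
    return names;
}

const std::wstring& apostropheType() {
    // Initialized at the declaration: the language runtime serializes the
    // first call, so there is no separate empty() check to race on.
    static const std::wstring type = tokenTypeNames()[APOSTROPHE];
    assert(!type.empty());
    return type;
}

const std::wstring& acronymType() {
    static const std::wstring type = tokenTypeNames()[ACRONYM];
    assert(!type.empty());
    return type;
}
```

Because the statics are `const` after construction, later calls only read them, so no lock is needed on the fast path. This relies on the compiler implementing thread-safe static initialization, which some toolchains allow to be switched off.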
kmatheussen commented 2 years ago

This looks like another race condition:

SimilarityPtr Similarity::getDefault() {
    static SimilarityPtr defaultImpl;
    if (!defaultImpl) {
        defaultImpl = newLucene<DefaultSimilarity>();
        CycleCheck::addStatic(defaultImpl);
    }
    return defaultImpl;
}
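When the initialization takes more than one step, as here with the extra registration call, the steps can be wrapped in a lambda that initializes the static. The sketch below uses `std::shared_ptr` and a hypothetical `registeredStatics()` list as a stand-in for `CycleCheck::addStatic`. The class definitions are minimal placeholders, not the Lucene++ ones.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Minimal placeholder types for illustration only.
struct Similarity {
    virtual ~Similarity() = default;
};
struct DefaultSimilarity : Similarity {};
using SimilarityPtr = std::shared_ptr<Similarity>;

// Hypothetical stand-in for CycleCheck::addStatic: remembers registered
// objects. A registry shared by many initializers would need its own lock.
std::vector<SimilarityPtr>& registeredStatics() {
    static std::vector<SimilarityPtr> statics;
    return statics;
}

SimilarityPtr getDefaultSimilarity() {
    // The lambda runs once; construction and registration both finish
    // before any other thread can read defaultImpl.
    static const SimilarityPtr defaultImpl = [] {
        SimilarityPtr impl = std::make_shared<DefaultSimilarity>();
        registeredStatics().push_back(impl);
        return impl;
    }();
    assert(defaultImpl);
    return defaultImpl;
}
```

Compared with the original shape, no thread can see a non-null pointer whose registration has not happened yet.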
Mythicsoft commented 2 years ago

Nice work - thanks for looking into this.

kmatheussen commented 2 years ago

I've gone through all static variables now, and fixed them for multithreaded access: lucene.patch.txt. I'll make a pull request tomorrow.

kmatheussen commented 2 years ago

Fixed in https://github.com/luceneplusplus/LucenePlusPlus/pull/183

dimgel commented 7 months ago

Can you please release 3.0.9 with this fix?

alanw commented 6 months ago

I've released 3.0.9 now, Dmitry.

dimgel commented 6 months ago

Thank you! :)