efficient / libcuckoo

A high-performance, concurrent hash table

giant executables when using libcuckoo #46

Closed (ldalessa closed 8 years ago)

ldalessa commented 8 years ago

We've been using libcuckoo happily for a while now but we've noticed recently that our builds are quite large (unit test executables > 20MB, builds > 1.4GB), and we're getting a huge number of libcuckoo and city_hash symbols. I don't think that we're doing anything unusual with libcuckoo.

Is this expected or is this a surprise? Is there a workaround that you could suggest? I can provide any debugging information necessary.

manugoyal commented 8 years ago

Huh, yeah that does sound weird. I haven't heard of that before. Yeah any additional information about what all these symbols are would be helpful.

Another thing that could be helpful is to figure out if there is a commit that is causing this issue. Can you perform a git-bisect of some sort on the libcuckoo repository and see if there is a specific commit that introduces this behavior?

ldalessa commented 8 years ago

So I'm not sure how helpful this will be, but the output from nm tests/unit/bcast | grep cuckoo > cuckoo.txt is attached for both the debug ("-O0 -g") and release ("-O3") builds. We regress both builds because we like to be able to see the stack trace from the Jenkins debug build when things go south.

It's possible that this is just the natural consequence of using C++?

It's a bit hard for me to bisect because of the way we're integrated but I can try.

cuckoo_O0_g.txt cuckoo_O3.txt
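A plain nm listing shows which symbols exist but not how much space they take. As a rough sketch, the following Python script sums symbol sizes from nm output and reports the share whose names contain a given substring. It assumes the listing was produced with GNU nm's -S (print size) and -C (demangle) options, which the command above did not use, and that each sized line has the form "address size type name" with hexadecimal fields. The function names and the sample listing are hypothetical and are not taken from the attached files.

```python
import re

# Assumed GNU nm -S -C line format: "<address> <size> <type> <name>".
# Sizes are printed as 8 or 16 hex digits; the type is a single character.
SIZED_LINE = re.compile(r"^[0-9a-fA-F]+\s+([0-9a-fA-F]{8,})\s+(\S)\s+(.+)$")


def symbol_sizes(nm_output):
    """Yield (size_in_bytes, name) for every line that carries a size field."""
    for line in nm_output.splitlines():
        match = SIZED_LINE.match(line)
        if match is None:
            continue  # undefined symbols and unsized symbols are skipped
        size_hex, _kind, name = match.groups()
        yield int(size_hex, 16), name


def share_of(nm_output, needle):
    """Return (bytes in symbols containing needle, bytes in all sized symbols)."""
    matched = total = 0
    for size, name in symbol_sizes(nm_output):
        total += size
        if needle in name:
            matched += size
    return matched, total


# Hypothetical sample listing, used only to exercise the functions.
SAMPLE = """\
0000000000401000 0000000000000040 T main
0000000000401040 0000000000000100 W cuckoohash_map<int, int>::insert(int const&, int const&)
0000000000401140 0000000000000080 W cuckoohash_map<int, int>::find(int const&) const
                 U malloc
"""

matched, total = share_of(SAMPLE, "cuckoo")
print(f"{matched} of {total} bytes ({100 * matched / total:.1f}%) match 'cuckoo'")
```

Symbol sizes cover code and data only, so in a -g build the debug sections would have to be measured separately (for example with the size or readelf tools).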

manugoyal commented 8 years ago

Hmm looking through the symbols, they look like what I'd expect to see (mostly templated function definitions). I'm comparing the symbols from your O3 file and the symbol printout from the examples/hellohash, and they seem quite similar. I've attached my symbol printout (after running c++filt on it).

There were a couple symbols in your file I noticed that didn't show up in hellohash:

std::__shared_ptr<std::thread::_Impl<std::_Bind_simple<cuckoohash_map<void const*, unsigned long, CityHasher<void const*>, std::equal_to<void const*>, std::allocator<std::pair<void const* const, unsigned long> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> >, (__gnu_cxx::_Lock_policy)2>::_Deleter<std::allocator<std::thread::_Impl<std::_Bind_simple<cuckoohash_map<void const*, unsigned long, CityHasher<void const*>, std::equal_to<void const*>, std::allocator<std::pair<void const* const, unsigned long> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> > > >::operator()(std::thread::_Impl<std::_Bind_simple<cuckoohash_map<void const*, unsigned long, CityHasher<void const*>, std::equal_to<void const*>, std::allocator<std::pair<void const* const, unsigned long> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> >*) (.isra.153.constprop.158)

and

std::_Sp_counted_deleter<std::thread::_Impl<std::_Bind_simple<cuckoohash_map<unsigned long, int, CityHasher<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> >*, std::__shared_ptr<std::thread::_Impl<std::_Bind_simple<cuckoohash_map<unsigned long, int, CityHasher<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> >, (__gnu_cxx::_Lock_policy)2>::_Deleter<std::allocator<std::thread::_Impl<std::_Bind_simple<cuckoohash_map<unsigned long, int, CityHasher<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> > > >, std::allocator<std::thread::_Impl<std::_Bind_simple<cuckoohash_map<unsigned long, int, CityHasher<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int> >, 4ul>::cuckoo_fast_double(unsigned long)::'lambda0'(unsigned long, unsigned long) (unsigned long, unsigned long)> > >, (__gnu_cxx::_Lock_policy)2>::_M_destroy()

The second type (std::_Sp_counted_deleter) occurred a bunch of times.

On my machine, the hellohash object file is 152KB and the executable is about 6KB. Is hellohash a very different size on your machine?

hellohash_symbols.txt
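The two symbols quoted above come from two different cuckoohash_map instantiations (void const* keys with unsigned long values, and unsigned long keys with int values), and each instantiation brings its own copy of the templated functions. A minimal sketch for counting how many demangled symbols belong to each instantiation is shown below. It assumes the names have already been demangled (for example with c++filt) and uses a simple bracket matcher. The helper names and the sample symbols are hypothetical.

```python
from collections import Counter


def instantiation(name, template="cuckoohash_map<"):
    """Return the first 'cuckoohash_map<...>' in a demangled name, or None.

    Walks the angle brackets to find the matching '>'. Names that contain
    operator<, operator> or '->' would confuse this simple matcher.
    """
    start = name.find(template)
    if start < 0:
        return None
    depth = 0
    # Begin at the '<' that ends the template prefix.
    for i in range(start + len(template) - 1, len(name)):
        if name[i] == "<":
            depth += 1
        elif name[i] == ">":
            depth -= 1
            if depth == 0:
                return name[start:i + 1]
    return None  # unbalanced brackets


def count_instantiations(demangled_names):
    """Count symbols per distinct cuckoohash_map instantiation."""
    counts = Counter()
    for name in demangled_names:
        key = instantiation(name)
        if key is not None:
            counts[key] += 1
    return counts


# Hypothetical, shortened symbol names used only for illustration.
NAMES = [
    "cuckoohash_map<int, int, CityHasher<int> >::insert(int const&, int const&)",
    "cuckoohash_map<int, int, CityHasher<int> >::find(int const&) const",
    "std::_Sp_counted_deleter<cuckoohash_map<long, int, CityHasher<long> >"
    "::cuckoo_fast_double(unsigned long)::'lambda0'>::_M_destroy()",
    "main",
]

for key, count in count_instantiations(NAMES).most_common():
    print(count, key)
```

A long tail of instantiations with similar symbol counts would suggest that the symbol volume scales with the number of distinct key/value type combinations in use, rather than with anything unusual in a single table.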

ldalessa commented 8 years ago

Are you looking at the libtool wrapper? I see the following with the "release" (O3) build.

ldalessa@cutter:~/libcuckoo/examples$ ls -all .libs/
total 356
drwxr-xr-x 2 ldalessa research   4096 Nov 11 12:04 .
drwxr-xr-x 4 ldalessa research   4096 Nov 11 12:04 ..
-rwxr-xr-x 1 ldalessa research  94213 Nov 11 12:04 count_freq
-rwxr-xr-x 1 ldalessa research  70731 Nov 11 12:04 hellohash
-rwxr-xr-x 1 ldalessa research 181596 Nov 11 12:04 nested_table

and with -O0 -g (our Jenkins issue) hellohash is 1.28 MB instead of about 70 KB.

These numbers are okay I guess. I looked back in time a bit and it looks like the libcuckoo fraction of the binary size has been constant for quite some time, so I'll look elsewhere for a potential reason for any increase we're seeing.

Thanks, Luke