rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com

Optimize memory allocation on NUMA systems (regression) #2274

Open danielmewes opened 10 years ago

danielmewes commented 10 years ago

We should use libnuma [1] on Linux to allocate pages on the NUMA node of the cache that is going to use them. That will both improve performance and avoid problems like running out of usable RAM on one node too early (see [2]).

Note that we are already allocating buffers on the right thread in 1.12. However, #2130 makes it difficult to maintain this property.

A work-around for the allocation problems is to launch rethinkdb with numactl --interleave=all. However that won't help with the (potential) performance regression, and is annoying.

[1] http://linux.die.net/man/3/numa
[2] http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

coffeemug commented 10 years ago

Would you avoid shipping #2130 until this is done?

danielmewes commented 10 years ago

@coffeemug Not for the performance aspect of it, but probably for the running out of memory aspect of it.

We could do something simple at first, such as internally setting the allocator to interleaved mode on startup. We would still have to link with libnuma on Linux. @atnnn: How difficult is it to add libnuma as a dependency?
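As a rough sketch of what "setting the allocator to interleaved mode on startup" could look like: libnuma's C API exposes numa_set_interleave_mask, which the snippet below calls through ctypes. The ctypes wrapper, library name, and graceful fallback are illustrative assumptions for this issue, not RethinkDB code; a real build would link libnuma and call the function directly from C++.

```python
import ctypes

def set_interleave_all():
    """Best-effort startup hook (hypothetical): ask libnuma to interleave
    all future page allocations across every NUMA node."""
    try:
        libnuma = ctypes.CDLL("libnuma.so.1")
    except OSError:
        return False  # libnuma not installed; keep the default local policy
    if libnuma.numa_available() < 0:
        return False  # kernel or hardware has no NUMA support
    # numa_all_nodes_ptr is an exported global (struct bitmask *) in libnuma v2.
    all_nodes = ctypes.c_void_p.in_dll(libnuma, "numa_all_nodes_ptr")
    libnuma.numa_set_interleave_mask(all_nodes)
    return True

print(set_interleave_all())
```

On a non-NUMA box (or without libnuma) this is a no-op that returns False, which is the behaviour we would want from an opportunistic startup hook.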

danielmewes commented 10 years ago

Actually, looking at the code again, it might be possible to maintain the local allocation even with the #2130 changes without any additional libraries. We should do that, preferably.

coffeemug commented 10 years ago

Yeah, that would be ideal.

AtnNn commented 10 years ago

Adding libnuma as a dependency would be easy. Versions of libnuma on the distributions we support range from 2.0.3 to 2.0.7. I don't think OS X has any support for it.

srh commented 10 years ago

There are multiple serializer threads, aren't there? Also, the problem already exists: the balancer could end up giving most of the cache's memory to tables that reside on a certain node. We also already have this problem with read-ahead buffers. So addressing this goes beyond deciding which thread buffers get allocated on.

And if we do want to worry about memory usage, the first thing to worry about is whether we should switch to jemalloc. Based on internet posts that might be out of date, it seems like we should.

danielmewes commented 10 years ago

There's only one serializer thread per table, but 8 cache threads. It is unlikely that just one of the caches of a single table gets most of the memory assigned by the cache balancer (barring a DoS or a highly uneven key access distribution).

Generally, allocating memory on the cache's thread is good because that's where it is accessed from most often. So it will be faster than simply switching the allocation strategy to "interleaved" (where memory will be evenly allocated from all NUMA nodes round robin or something similar). The advantage of using the interleaved strategy is that we won't have problems with corner cases such as unevenly distributed cache sizes.
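As a sketch of the "allocate on the cache's thread" direction: libnuma provides numa_alloc_local, which allocates memory on the NUMA node of the calling thread. The ctypes wrapper and its fallback below are assumptions for illustration only; they are not how the RethinkDB allocator is wired up.

```python
import ctypes

def local_numa_alloc_works(size):
    """Hypothetical probe: try to allocate `size` bytes on the calling
    thread's NUMA node via numa_alloc_local, then free it again."""
    try:
        libnuma = ctypes.CDLL("libnuma.so.1")
    except OSError:
        return False  # no libnuma: plain malloc would be used instead
    if libnuma.numa_available() < 0:
        return False  # no NUMA support on this machine
    libnuma.numa_alloc_local.restype = ctypes.c_void_p
    libnuma.numa_alloc_local.argtypes = [ctypes.c_size_t]
    buf = libnuma.numa_alloc_local(size)
    if not buf:
        return False  # allocation failed (e.g. local node out of memory)
    libnuma.numa_free(ctypes.c_void_p(buf), ctypes.c_size_t(size))
    return True

print(local_numa_alloc_works(4096))
```

A cache thread allocating its own buffers this way keeps them on its local node, which is exactly the property 1.12 gets for free from first-touch allocation on the right thread.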

@srh: Is jemalloc supposed to use less memory in general? Or does it actually do something about the problem of a single NUMA node running out of RAM while others have plenty free?

srh commented 10 years ago

You don't need just one of the caches of a single table to get most of the memory; all 8 of the caches can get memory assigned by the cache balancer, and they're probably all going to be on the same CPU.

@srh: Is jemalloc supposed to use less memory in general?

Yes, that's the difference. That info could be out of date. jemalloc's downside, relative to tcmalloc, is supposedly the performance you see when spawning threads on the fly, which we do not do.

danielmewes commented 10 years ago

I opened an issue about considering jemalloc: https://github.com/rethinkdb/rethinkdb/issues/2279.

Usually the 8 caches will (almost) all be on different CPUs. Otherwise there wouldn't be a point in having them in the first place.

srh commented 10 years ago

Otherwise there wouldn't be a point in having them in the first place.

CPUs != cores.

danielmewes commented 10 years ago

Oh I see. Yeah you are right, that might already be a problem.