pasky / pachi

A fairly strong Go/Baduk/Weiqi playing program
http://pachi.or.cz/
GNU General Public License v2.0
514 stars 117 forks source link

cache locality project coordination #30

Open sthalik opened 8 years ago

sthalik commented 8 years ago

Hey,

Responding to @pasky's earlier request for work regarding struct tree_node cache locality.

The issue is not completely trivial, particularly due to threading and the arbitrary order in which nodes are traversed. This is not merely refactoring.

My best idea is to allocate pages, or several, on a per-thread basis. Get rid of sibling pointer, next sibling can be calculated by parent plus pointer arithmetic of parent->data with regard to given struct tree_node's address. There probably needs to be stored the thread's id as well as a reference count in the page's end or beginning.

Some questions remain, most importantly - can multiple threads add subnodes to the same node?

Another idea is to realloc(3) node's data keeping excess free capacity. But threading again.

pasky commented 8 years ago

All subnodes are allocated in batch during the node expansion. Two threads can theoretically race to expand the same node, but right now that does no harm (except wasted CPU).

sthalik commented 8 years ago

@pasky there are two paths, one for fast_malloc after pruning and the second one.

The locality problem you presented before - can it be helped with using per-thread arenas? It's doable.