pmodels / armci-mpi

An implementation of ARMCI using MPI one-sided communication (RMA)
https://wiki.mpich.org/armci-mpi/index.php/Main_Page
Other
13 stars 6 forks source link

use faster GMR lookup #18

Open jeffhammond opened 5 years ago

jeffhammond commented 5 years ago

AVL tree will be faster than linked-list traversal for GMR lookup.

@jdinan noted:

Fortunately, ARMCI-MPI already has an AVL tree: https://github.com/pmodels/armci-mpi/blob/master/src/conflict_tree.c

Not sure why I didn't use this for GMR lookups. I suspect the reason is that when the number of memory regions is small, the performance improvement is negligible.

Obsolete comment by @jeffhammond:

https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/common/avl/avl.c exists but I do not know if CDDL is acceptable in ARMCI-MPI. If not, we'll have to implement from scratch.

Migrated from https://github.com/jeffhammond/armci-mpi/issues/25

jdinan commented 5 years ago

AVL tree will be faster than a linked list when the number of memory regions becomes large enough. This optimization was considered and not implemented in the early days of ARMCI-MPI because we didn't see cases where there would be enough performance benefit to justify the increase in complexity. The situation may be different now, but I wanted to make sure this rationale is captured so the current implementation is not considered to be a performance bug.

jeffhammond commented 5 years ago

Yes, I plan to support AVL as an option, in addition to forwards and backwards linear traversal. We can see which one wins for different workloads.

Slab suballocation will eliminate lookup overhead, which is also on my TODO list.