ooibc88 / gam

Globally Addressable Memory management (efficient distributed memory management via RDMA and caching)

Adding MFence to enforce SC consistency doesn't work as expected #9

Open charles-typ opened 3 years ago

charles-typ commented 3 years ago

Hi @ooibc88 @cac2003 @guowentian

I have been trying to understand the impact of stronger consistency guarantees on application performance in GAM. To this end, I tried to enforce sequential consistency (SC) by adding an MFence operation after each write (as suggested in Section 4 of the paper: “For example, sequential consistency can be easily achieved by inserting MFence following each Write operation.”). Below are details on the experimental setup, methodology, and results.

Experiment setup:

  1. Two servers, VM1 and VM2, each with 512 MB of local memory, all of it used as cache.
  2. One server, VM3, with all available DRAM (~10 GB) used as local memory and no cache.

Therefore, VM1 and VM2 fetch data from VM3 and keep it in their local caches.

Method:

I replayed several memory traces captured from different applications against GAM under the two scenarios listed below, and recorded the execution time for each. The memory footprint of the application (~1 GB) is larger than the local cache size (512 MB), so there are evictions as well as invalidations. Every memory access is 1 byte.

Scenario 1: Run an application with 10 threads on VM1, PSO consistency.
Scenario 2: Run an application with 10 threads on VM1, enforce SC with memory fences.
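
A minimal sketch of how the two scenarios differ, assuming a replay loop of roughly this shape. `MockStore`, `TraceOp`, and `replay` are hypothetical names used only for illustration; the mock counts calls and is not GAM's allocator API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a GAM-like allocator: it only counts calls,
// it does not move any data or talk to a remote node.
struct MockStore {
    std::size_t reads = 0;
    std::size_t writes = 0;
    std::size_t fences = 0;

    void Read(std::uint64_t /*addr*/, void* /*buf*/, std::size_t /*len*/) { ++reads; }
    void Write(std::uint64_t /*addr*/, const void* /*buf*/, std::size_t /*len*/) { ++writes; }
    void MFence() { ++fences; }
};

// One entry of a memory trace (assumed layout, not the real trace format).
struct TraceOp {
    bool is_write;
    std::uint64_t addr;
};

// Replays 1-byte accesses. With enforce_sc set, a fence follows every write
// (Scenario 2); without it, writes are issued back to back (Scenario 1).
inline void replay(MockStore& store, const std::vector<TraceOp>& trace, bool enforce_sc) {
    unsigned char byte = 0;
    for (const TraceOp& op : trace) {
        if (op.is_write) {
            store.Write(op.addr, &byte, 1);
            if (enforce_sc) {
                store.MFence();
            }
        } else {
            store.Read(op.addr, &byte, 1);
        }
    }
}
```

Counting fences next to writes in this way is one quick check that the SC build really issues one fence per write before comparing execution times.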

Result:

I expected Scenario 2 to be slower, since writes can no longer be asynchronous. However, Scenario 2 was actually 5%-10% faster than Scenario 1.
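
The expectation can be stated as a toy model. This is an assumption about how asynchronous writes and fences interact, not a description of GAM's internals, and `ToyWriteQueue` and `run_writes` are hypothetical names: a write stays in flight until a fence drains everything outstanding, so a fence after every write should leave no room for writes to overlap.

```cpp
#include <algorithm>
#include <cstddef>

// Toy model only: tracks how many writes are outstanding at once.
struct ToyWriteQueue {
    std::size_t in_flight = 0;      // writes issued but not yet drained
    std::size_t max_in_flight = 0;  // peak number of overlapping writes
    std::size_t stalls = 0;         // fences that had at least one write to wait for

    void write() {
        ++in_flight;
        max_in_flight = std::max(max_in_flight, in_flight);
    }

    void fence() {
        if (in_flight > 0) {
            ++stalls;
        }
        in_flight = 0;  // the fence returns only after all writes complete
    }
};

// Issues n writes, optionally fencing after each one.
inline ToyWriteQueue run_writes(std::size_t n, bool fence_after_each) {
    ToyWriteQueue q;
    for (std::size_t i = 0; i < n; ++i) {
        q.write();
        if (fence_after_each) {
            q.fence();
        }
    }
    return q;
}
```

In this model, n unfenced writes can all overlap, while the fenced run never has more than one write outstanding and stalls n times. A faster Scenario 2 contradicts the model, which is why the result is surprising.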

Questions:

  1. Is the MFence operation completely supported in the current code base?
  2. Are there any benchmarks that compare SC and PSO consistency in the repo?

Thank you for taking the time to read this issue. I would really appreciate any help!