CUBoulder-HPCPerfAnalysis / memory

Experiments with memory performance

MIT License


...added a script that runs various experiments and gathers data, plus an R script that graphs the results.

Gathered the results on a 4-socket high-memory (1TB RAM) machine.

| Username | Machinename | CPU name | CPU GHz | CPU Cores | CPU Cores used | L1 cache (MB) | L2 cache (MB) | L3 cache (MB) | Array Length (MB) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| dmitry | dav01 | Xeon(R) CPU E7- 4870 | 2.40 | 10 | Up to 32 | 0.64 | 2.56 | 30.72 | 76.3 |

I was surprised to see a spike for Block=1 and ThreadCount=32. I expected false sharing to show up there as low performance instead. If there is a bug in my code, I have not been able to find it. I would appreciate any advice on what to try next to investigate and troubleshoot this.


Changed stream.c to use the block cyclic algorithm for the dot product; ... #14
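For readers unfamiliar with the term, here is a minimal sketch of a block-cyclic dot product. It is not the code from this PR: the function names `dot_partial` and `dot_block_cyclic`, the parameter names, and the use of an OpenMP reduction are all assumptions made for illustration. Thread `tid` is assumed to own blocks `tid`, `tid + nthreads`, `tid + 2*nthreads`, and so on, each `block` elements long.

```c
#include <assert.h>
#include <stddef.h>

/* Partial dot product for one thread under a block-cyclic distribution.
   Hypothetical helper, not taken from stream.c. */
static double dot_partial(const double *a, const double *b, size_t n,
                          size_t block, int tid, int nthreads)
{
    assert(block > 0 && nthreads > 0 && tid >= 0 && tid < nthreads);

    double sum = 0.0;  /* private accumulator: no shared writes in the loop */
    size_t stride = (size_t)nthreads * block;

    /* Visit every block owned by this thread. */
    for (size_t start = (size_t)tid * block; start < n; start += stride) {
        size_t end = (n - start < block) ? n : start + block;
        for (size_t i = start; i < end; i++)
            sum += a[i] * b[i];
    }
    return sum;
}

/* Combine the per-thread partial sums. When compiled with OpenMP enabled
   (e.g. -fopenmp) the loop runs in parallel; otherwise the pragma is
   ignored and the loop runs serially with the same result. */
static double dot_block_cyclic(const double *a, const double *b, size_t n,
                               size_t block, int nthreads)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+:total) num_threads(nthreads)
    for (int t = 0; t < nthreads; t++)
        total += dot_partial(a, b, n, block, t, nthreads);
    return total;
}
```

In this sketch the input arrays are only read and each thread writes only to its private `sum`, so Block=1 interleaves reads of neighbouring elements rather than writes. False sharing needs at least one thread writing to a cache line that other threads also use, so a kernel shaped like this one would not be expected to show it in the main loop. Whether stream.c in this PR has the same structure is an assumption.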
