Perform a scaling study on Ocelote using the 'big' sinusoidal benchmarks.
Ocelote has 2 Xeon Westmere-EP X5650s, each of which (according to Intel's product sheets) has 6 cores with 2 hyper-threads per core, for a total of 12 cores and 24 hyper-threads.
Testing should illuminate differences across different partitionings of the CPUs (i.e., across chip boundaries and across hyper-threads).
A suggested set of thread counts: 1, 2, 4, 6, 8, 12, 16, 24. (@mstrout please comment)
Because time-steps do not run in parallel, the current suggestion is 1, maybe 2, time-steps.
We should conduct multiple trials, but because these tests can take over an hour on the high end, there will need to be few, if any, additional trials. (@mstrout please comment)
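To make the partitioning question concrete, here is a minimal sketch that labels each suggested thread count by the hardware boundary it would cross. It uses the topology quoted above (2 sockets, 6 cores each, 2 hyper-threads per core) and ASSUMES compact placement: one socket's physical cores are filled first, then the second socket, then hyper-threads. Actual placement depends on the scheduler and any pinning settings, so treat the labels as a planning aid only. The names `classify` and `PLAN` are hypothetical.

```python
# Topology taken from the issue text: 2 sockets x 6 cores x 2 hyper-threads.
SOCKETS = 2
CORES_PER_SOCKET = 6
THREADS_PER_CORE = 2

# Thread counts suggested in the issue.
THREAD_COUNTS = [1, 2, 4, 6, 8, 12, 16, 24]


def classify(n):
    """Label a thread count under the ASSUMED compact placement."""
    physical_cores = SOCKETS * CORES_PER_SOCKET
    hardware_threads = physical_cores * THREADS_PER_CORE
    if n < 1 or n > hardware_threads:
        raise ValueError(f"thread count {n} is outside 1..{hardware_threads}")
    if n <= CORES_PER_SOCKET:
        return "single socket, physical cores only"
    if n <= physical_cores:
        return "both sockets, physical cores only"
    return "both sockets, hyper-threads in use"


# Map each suggested thread count to the boundary it crosses.
PLAN = {n: classify(n) for n in THREAD_COUNTS}

if __name__ == "__main__":
    for n, label in PLAN.items():
        print(f"{n:>2} threads: {label}")
```

Under that assumption, 1 through 6 stay on one chip, 8 and 12 cross the chip boundary, and 16 and 24 require hyper-threads, so the suggested list covers each regime at least twice.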
Deliverables:
Data package:
Raw outputs from each run.
Collected data in json format:
Runtime, as measured by time or bin/time (either is fine)
Per-rank memory, as reported by the AMPS memory allocator
Note: a true OS view of memory occupation is challenging here; see issue #7
Information about the ParFlow software stack.
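One possible shape for the collected JSON data listed above is sketched below. This is not an agreed schema: every key name and the helper `make_record` are hypothetical, and the values are placeholders rather than measurements.

```python
import json


def make_record(threads, time_steps, runtime_seconds, per_rank_memory_bytes, stack):
    """Build one JSON-serializable record for a single benchmark run.

    All key names are hypothetical; the issue does not fix a schema.
    """
    return {
        "threads": threads,
        "time_steps": time_steps,
        "runtime_seconds": runtime_seconds,
        # One entry per rank, as reported by the memory allocator.
        "per_rank_memory_bytes": list(per_rank_memory_bytes),
        # Free-form description of the software stack (versions, compiler, ...).
        "software_stack": dict(stack),
    }


# Placeholder values only, not measured results.
example = make_record(
    threads=4,
    time_steps=1,
    runtime_seconds=0.0,
    per_rank_memory_bytes=[0],
    stack={"parflow": "unknown"},
)

# A list of records, one per run, serialized for the data package.
encoded = json.dumps([example], indent=2)

if __name__ == "__main__":
    print(encoded)
```

Keeping one record per run (rather than one per thread count) would let repeated trials sit side by side if time allows for them.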
Done when:
The data package is considered sufficient and is uploaded here, and probably somewhere else as well.
Blocked by #3, #1. For #4.