Open jchodera opened 2 years ago
The A100s are on Perlmutter. They're 40 GB, 1410 MHz versions.
Maybe we should capture the output of nvidia-smi -q?
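One way to capture this alongside a benchmark run is sketched below. It is only a sketch: the choice of fields is an assumption about what is worth recording, and the helper names (parse_query_output, capture_gpu_info) are made up for illustration. It uses nvidia-smi's --query-gpu CSV mode rather than the full -q dump, since that is easier to store per benchmark.

```python
import csv
import io
import shutil
import subprocess

# Fields understood by `nvidia-smi --query-gpu`. Which ones to record is an
# assumption; run `nvidia-smi --help-query-gpu` to see the full list.
FIELDS = [
    "name",
    "memory.total",
    "clocks.max.sm",
    "clocks.max.memory",
    "pcie.link.gen.max",
    "pcie.link.width.max",
]


def parse_query_output(text, fields=FIELDS):
    """Turn CSV output from --query-gpu into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(fields, row)) for row in rows if row]


def capture_gpu_info(fields=FIELDS):
    """Run nvidia-smi and return one dict per GPU, or [] if it is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_query_output(result.stdout, fields)
```

Calling capture_gpu_info() on the benchmark node and saving the result next to the timings would record the memory size, max clocks, and PCIe link for each GPU.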
The datasheet says there's a bunch of flavors of A100:
The only differences between them are the amount of memory (40 or 80 GB) and the form factor (PCIe or SXM). Neither of those should make any difference in speed.
Here's what nvidia-smi reports on the login node with the GPU idle.
Clocks
    Graphics : 210 MHz
    SM       : 210 MHz
    Memory   : 1215 MHz
    Video    : 585 MHz
Applications Clocks
    Graphics : 765 MHz
    Memory   : 1215 MHz
Default Applications Clocks
    Graphics : 765 MHz
    Memory   : 1215 MHz
Max Clocks
    Graphics : 1410 MHz
    SM       : 1410 MHz
    Memory   : 1215 MHz
    Video    : 1290 MHz
Max Customer Boost Clocks
    Graphics : 1410 MHz
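If we want to compare these numbers between machines programmatically, the clock portion of the output can be parsed with something like the sketch below. It assumes it is given only the clock sections shown above (headers on their own line, "name : value" lines beneath them), not the full nvidia-smi -q dump; the function names are hypothetical.

```python
def parse_clock_sections(text):
    """Group "name : value" lines under the most recent header line."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            if current is not None:
                sections[current][key.strip()] = value.strip()
        else:
            # A line without a colon starts a new section, e.g. "Max Clocks".
            current = line
            sections[current] = {}
    return sections


def mhz(value):
    """Convert a string such as "1410 MHz" to the integer 1410."""
    return int(value.split()[0])


# Values copied from the output above.
SAMPLE = """\
Applications Clocks
Graphics : 765 MHz
Memory : 1215 MHz
Max Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1215 MHz
Video : 1290 MHz
"""

clocks = parse_clock_sections(SAMPLE)
max_graphics = mhz(clocks["Max Clocks"]["Graphics"])           # 1410
app_graphics = mhz(clocks["Applications Clocks"]["Graphics"])  # 765
```

Two such dictionaries, one per machine, can then be compared key by key to spot differences like the memory clock.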
Comparing to what you posted in https://github.com/openmm/openmm-org/pull/86#issuecomment-1007171890, the max clock rates for graphics, SM, and video are the same, but the memory clock is slightly lower. Other factors that can affect performance are the type of bus (PCIe or NVLink, and the particular version of either one), the cooling system (which influences whether the GPU can actually sustain the maximum clock rate), the bus topology (mainly for multi-GPU benchmarks), and the CPU type (not a huge effect for GPU benchmarks, but it does make a difference).
There seems to be significant variation in the performance of different models/variants of the same GPU (e.g. the multiple variants of A100 available), so we should provide more details in our benchmarks about exactly which model(s) were used.