N8-CIR-Bede / documentation

Documentation for the N8CIR Bede Tier 2 HPC facility
https://bede-documentation.readthedocs.io/en/latest/

Resnet50 benchmark scripts no longer usable #72

Closed ptheywood closed 2 years ago

ptheywood commented 3 years ago

The ResNet50 benchmark job scripts are no longer usable on Bede, as /opt/software/apps/anaconda3 does not exist.

Additionally, with the move to RHEL8, where WMLCE is not supported (it is replaced by OpenCE), it is unclear whether ddlrun, and therefore bede-ddlrun, will be usable.

It may be worth re-benchmarking RESNET50 prior to the RHEL8 switch so we know the performance impact of WMLCE vs OpenCE?

I.e. run RESNET50 at a number of scales (1, 2, 4, 8, 12?, 16? GPUs; the current docs say there is no need to go larger) with:

The current RHEL8 testing partition only contains 2 nodes, so only up to 8 GPUs will currently be usable for RHEL8.
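
As a rough sketch of the sweep described above, the snippet below lists the proposed GPU counts, the number of nodes each would need, and whether it fits within a 2-node partition. The figure of 4 GPUs per node is an assumption inferred from "2 nodes ... up to 8 GPUs", and the names `GPUS_PER_NODE`, `SCALES` and `plan` are hypothetical, not part of any Bede tooling.

```python
import math

# Assumption: 4 GPUs per node, inferred from "2 nodes -> up to 8 GPUs".
GPUS_PER_NODE = 4
# GPU counts proposed in this issue (12 and 16 were marked as uncertain).
SCALES = [1, 2, 4, 8, 12, 16]


def plan(scales, gpus_per_node, max_nodes=None):
    """Return (gpus, nodes_needed, fits) for each requested GPU count."""
    rows = []
    for gpus in scales:
        # Round up: a partially used node still has to be allocated.
        nodes = math.ceil(gpus / gpus_per_node)
        fits = max_nodes is None or nodes <= max_nodes
        rows.append((gpus, nodes, fits))
    return rows


if __name__ == "__main__":
    # max_nodes=2 models the current RHEL8 testing partition.
    for gpus, nodes, fits in plan(SCALES, GPUS_PER_NODE, max_nodes=2):
        status = "ok" if fits else "too large for 2 nodes"
        print(f"{gpus:>2} GPUs -> {nodes} node(s): {status}")
```

Under these assumptions, the 12 and 16 GPU runs would only be possible outside the 2-node RHEL8 testing partition.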

This is closely related to #63

ptheywood commented 2 years ago

Not adding ResNet benchmark results to the WMLCE / OpenCE documentation:

It may be nice to add some general DL benchmarking to compare against x86+V100 systems to support encouraging users onto Bede, but that can become a future issue rather than blocking WMLCE/OpenCE clarification.