simonsobs-uk / data-centre

This tracks the issues in the baseline design of the SO:UK Data Centre at Blackett
https://souk-data-centre.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Basic documentation on architecture of Blackett #6

Closed ickc closed 6 months ago

ickc commented 1 year ago

Running multi-node applications at Blackett would benefit from some knowledge of Blackett's architecture. E.g.

  1. (compute capability) load-balancing w.r.t. heterogeneous nodes
  2. (network capability) how the nodes are connected and whether there are bottlenecks, e.g. communication between multiple nodes being choked at the link between 2 network switches
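To illustrate what point 1 means in practice, here is a minimal sketch of splitting a fixed number of tasks across heterogeneous nodes in proportion to their core counts. The function name `split_work` and the node names are hypothetical, and it assumes throughput scales with core count only (clock speed and memory are ignored), which is a simplification.

```python
def split_work(n_tasks, cores_per_node):
    """Split n_tasks across nodes in proportion to their core counts.

    cores_per_node maps a (hypothetical) node name to its total core count.
    Assumes each core does the same amount of work per unit time.
    """
    total_cores = sum(cores_per_node.values())
    # Ideal (fractional) share of tasks for each node.
    ideal = {node: n_tasks * c / total_cores for node, c in cores_per_node.items()}
    # Start from the rounded-down share.
    shares = {node: int(x) for node, x in ideal.items()}
    # Hand the leftover tasks to the nodes with the largest fractional remainder.
    leftover = n_tasks - sum(shares.values())
    by_remainder = sorted(ideal, key=lambda n: ideal[n] - shares[n], reverse=True)
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares


if __name__ == "__main__":
    # Two 8-core nodes and one 20-core node (illustrative numbers only).
    print(split_work(100, {"node-a": 8, "node-b": 8, "node-c": 20}))
```

Without the core counts per node, even a simple proportional split like this is not possible, which is why a hardware table would help.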

I think something like this would be an example to follow: Architecture - NERSC Documentation (though ours obviously doesn't need to be as detailed).

I think we can help with the documentation. If the information is passed to us, we can compile documentation like that. The information needed is probably something like:

  1. configuration of each node, e.g. N nodes have n-socket "MODEL" CPUs with X amount of RAM. A raw table of hardware information is probably enough for us to compile a summary table.
  2. network topology, e.g. how the nodes are connected, bandwidth, etc. This would shed some light on how many nodes an MPI application can be launched on before the network can no longer cope.
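For point 1, turning a raw per-node table into a summary could look roughly like the sketch below. The field names (`cpu`, `sockets`, `cores_per_socket`, `ram_gb`) and the sample rows are made up for illustration and are not real Blackett data; whatever format the raw information arrives in would need to be mapped onto something like this first.

```python
from collections import Counter


def summarise(nodes):
    """Group nodes with identical hardware and count them.

    nodes is a list of dicts with the (hypothetical) keys
    cpu, sockets, cores_per_socket and ram_gb.
    """
    counts = Counter(
        (n["cpu"], n["sockets"], n["cores_per_socket"], n["ram_gb"]) for n in nodes
    )
    return [
        {
            "nodes": count,
            "cpu": cpu,
            "sockets": sockets,
            "cores_per_socket": cores,
            "ram_gb": ram,
            # Total cores across all nodes of this hardware type.
            "total_cores": count * sockets * cores,
        }
        for (cpu, sockets, cores, ram), count in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Placeholder rows, not real hardware information.
    raw = [
        {"cpu": "MODEL-A", "sockets": 2, "cores_per_socket": 4, "ram_gb": 192},
        {"cpu": "MODEL-A", "sockets": 2, "cores_per_socket": 4, "ram_gb": 192},
        {"cpu": "MODEL-B", "sockets": 2, "cores_per_socket": 10, "ram_gb": 384},
    ]
    for row in summarise(raw):
        print(row)
```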

@rwf14f, what do you think about this? Thanks.

ickc commented 1 year ago

We'll come back to this issue only after we have moved beyond the testbed stage. Will mark as pending.

ickc commented 11 months ago

Our current testbed has access to:

| hostname | CPU | sockets | cores | exclusive to parallel universe |
| --- | --- | --- | --- | --- |
| wn1905340.in.tier2.hep.manchester.ac.uk | Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz | 2 | 4 | |
| wn5914090.in.tier2.hep.manchester.ac.uk | Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz | 2 | 4 | |
| wn5916090.in.tier2.hep.manchester.ac.uk | Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz | 2 | 4 | |
| wn5914340.in.tier2.hep.manchester.ac.uk | Intel(R) Xeon(R) Gold 5215L CPU @ 2.50GHz | 2 | 10 | True |
| wn5916340.in.tier2.hep.manchester.ac.uk | Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz | 2 | 4 | |
| wn5917090.in.tier2.hep.manchester.ac.uk | Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz | 2 | 4 | True |

ickc commented 8 months ago

More notes after today's meeting between SO:UK DC and Blackett people:

ickc commented 6 months ago

Closed via 2280193