rcgsheffield / sheffield_hpc

Docs for University of Sheffield HPC systems
https://docs.hpc.shef.ac.uk/

Filestore diagram out of date #2016

Open jkwmoore opened 3 months ago

jkwmoore commented 3 months ago

See: https://docs.hpc.shef.ac.uk/en/latest/hpc/filestore.html#choosing-the-correct-filestore

Needs updating to remove ShARC and add Stanage.

See also: https://github.com/rcgsheffield/sheffield_hpc/blob/5bfbf14c3c7dc96e8859e9a550d5a4a690cc8e5c/hpc/filestore.rst?plain=1#L39-L46

jkwmoore commented 2 months ago

I don't think that merge truly fixed the issues here as:

We may want to explicitly refer to these areas by their type, i.e. scratch is node-local storage, fastdata / parscratch is the cluster-wide Lustre area, and /home/$USER and /users/$USER are the cluster-wide user home areas.

Carldkennedy commented 2 months ago

Agreed, some more updates are needed in another PR. I thought it best to swap out ShARC for Stanage ASAP. The questions are slightly different:

Does your job read or write lots of small files?
Does your job read or write large or small files?

/fastdata relates to both clusters (we do refer to both as fastdata: https://docs.hpc.shef.ac.uk/en/latest/hpc/filestore.html#fastdata-areas).

Possible changes:
Change /home to $HOME, which is correct for both clusters.
Replace /scratch with Scratch, linking to https://docs.hpc.shef.ac.uk/en/latest/hpc/filestore.html#scratch-directories
Replace /fastdata with Fastdata, linking to https://docs.hpc.shef.ac.uk/en/latest/hpc/filestore.html#fastdata-areas

jkwmoore commented 2 months ago

> /fastdata relates to both clusters

We're kinda stuck in a catch-22 with /fastdata: if the slash is present, people think it is a path, not a "type" of store. If the slash is missing, it seems like a "type" rather than a path, despite that actual path to the area existing on Bessemer.

We've been doing a kinda poor job of discriminating on this and have been kludging thus far with /fastdata meaning both which is a little too vague for my liking.

Sounds like we want to do some fairly sizeable refactoring work for our file-stores page to be honest. I am fine with us referring to these areas by their conceptual types (e.g. "Scratch"), but I'd just want to ensure we still capture/communicate their actual back end filesystem types (Lustre / Weka / NFS etc...) where appropriate.

Power users may already be familiar with these file systems after all.

That's fine so long as we're sufficiently clear about the on-system path(s), the purpose of the area, the type of filesystem, and the kind of data it performs well for:

e.g. /home /users, cluster wide user home area, NFS/Weka, numerous small files
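
As a rough sketch of what such a per-area record could look like if we went the table route (not a proposal for the actual docs source, which is reStructuredText): the `FilestoreArea` class, its field names and the `render` helper are all made up for illustration. Only the home row is fully taken from the example above; anything this thread doesn't state is left as "TBC" rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilestoreArea:
    """One row of a hypothetical filestore summary table."""
    name: str        # conceptual type, e.g. "Home"
    paths: tuple     # on-system path(s)
    purpose: str     # what the area is for
    filesystem: str  # back end filesystem type
    suited_to: str   # kind of data it performs well for


# Values come from this thread only; "TBC" marks anything not stated here.
AREAS = [
    FilestoreArea("Home", ("/home", "/users"),
                  "cluster wide user home area", "NFS/Weka",
                  "numerous small files"),
    FilestoreArea("Fastdata", ("/fastdata (Bessemer)",),
                  "cluster wide area", "Lustre", "TBC"),
    FilestoreArea("Scratch", ("TBC",),
                  "node local storage", "TBC", "TBC"),
]


def to_rows(areas):
    """Return a header row followed by one tuple of strings per area."""
    rows = [("Area", "Path(s)", "Purpose", "Filesystem", "Suited to")]
    for a in areas:
        rows.append((a.name, ", ".join(a.paths), a.purpose,
                     a.filesystem, a.suited_to))
    return rows


def render(areas):
    """Render the rows as a plain-text table with aligned columns."""
    rows = to_rows(areas)
    widths = [max(len(r[i]) for r in rows) for i in range(len(rows[0]))]
    lines = []
    for r in rows:
        line = "  ".join(cell.ljust(w) for cell, w in zip(r, widths))
        lines.append(line.rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(AREAS))
```

Keeping the conceptual name and the path(s) as separate fields is the point here: it sidesteps the "/fastdata is both a type and a path" ambiguity, since each gets its own column.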

I accept that we may need to refactor a whole bunch of stuff but this problem will also (re)occur with Weka appearing too.

> The questions are slightly different

I see, I misread it. Mind you, that somewhat underlines the point that it is a bit clunky / not as clear as I think we want it to be.


Overall, we may want to think about bifurcating this diagram to be per cluster to simplify and possibly consider a different type of diagram or table entirely given that we have a bunch of annoying complexity as a result of: