Closed: tavareshugo closed this issue 3 years ago
Some yes/no questions:
… (zip, unzip and tar) on the login node? A: kind of??
… (zip --help or man zip) on the login node? (A: Yes)
For the last one, it is easier to understand with a concrete example: suppose a workflow has two steps. The first step uses 40G of memory and 1 CPU; the second step can be parallelized across 8 CPUs and needs only a few hundred MB of memory. What resources would you request for this?
Option 1: 40G and 8 CPU
Option 2: 42G and 8 CPU
Option 3: 40G and 1 CPU
Option 4: Others
The best answer is Others (option 4): it is best to split the workflow into two jobs/scripts. The first requests 40G (42G?) of memory and 1 CPU; the second requests 1G of memory and 8 CPUs.
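To make the trade-off concrete, here is a rough sketch in Python of what each choice reserves on the cluster. The runtimes (2 hours for step 1, 1 hour for step 2) are made-up assumptions purely for illustration, as are the names `Job` and `reserved`. Only the memory and CPU figures come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Job:
    mem_gb: int
    cpus: int
    hours: int  # hypothetical runtime, not a measured value


def reserved(jobs):
    """Return (CPU-hours, GB-hours) reserved across a list of jobs."""
    cpu_hours = sum(j.cpus * j.hours for j in jobs)
    gb_hours = sum(j.mem_gb * j.hours for j in jobs)
    return cpu_hours, gb_hours


# Assumed runtimes for the two steps (illustrative only).
STEP1_HOURS = 2
STEP2_HOURS = 1

# Option 1: one job holding 40G and 8 CPUs for the whole workflow.
single_job = [Job(40, 8, STEP1_HOURS + STEP2_HOURS)]

# Option 4: two jobs, each requesting only what its step needs.
split_jobs = [Job(40, 1, STEP1_HOURS), Job(1, 8, STEP2_HOURS)]

print(reserved(single_job))  # (24, 120)
print(reserved(split_jobs))  # (10, 81)
```

Under these assumed runtimes the single job reserves 24 CPU-hours and 120 GB-hours, while the split version reserves 10 CPU-hours and 81 GB-hours. The exact numbers change with the real runtimes, but the single job always holds 7 idle CPUs during step 1 and about 39G of idle memory during step 2.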
I'm not sure whether to ask for 42G or 40G, as I always give it a bit of buffer. I'm not sure whether that is the right thing to do, or whether this is the right place to talk about it.
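As a small illustration of the buffer idea, the sketch below adds a percentage on top of the measured peak and rounds up to a whole number of GB. The 5% default is only an assumption, chosen because it reproduces the 40G to 42G figure above. It is not a recommendation from any HPC documentation, and `request_with_buffer` is a hypothetical helper name.

```python
def request_with_buffer(peak_gb: int, buffer_percent: int = 5) -> int:
    """Return peak memory plus a percentage buffer, rounded up to whole GB."""
    scaled = peak_gb * (100 + buffer_percent)
    return -(-scaled // 100)  # integer ceiling division, avoids float rounding


print(request_with_buffer(40))      # 42
print(request_with_buffer(40, 10))  # 44
```

How large the buffer should be depends on how variable the job's memory use is from run to run, so the percentage is best treated as something to tune rather than a fixed rule.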
One of the most frequently asked questions is how much memory to ask for... I attempted an answer here: https://wiki.cam.ac.uk/plantsci-bioinfo/Condor_User_Guide#How_much_memory_I_should_request_for_my_job.3F
Q: If I accidentally delete a file from the cluster, can I recover it?
Option 1: Yes
Option 2: No
Option 3: It depends
The correct answer is option 3; the trainer can then elaborate on it. I guess the answer has something to do with how frequently backups are made?
This one is not that important, just for people who think this way :)
Q: If one of the hard disks on the cluster breaks and my data is on that disk, what will happen?
Option 1: I will lose my data
Option 2: I won't lose my data
Option 3: It depends
The correct answer is option 3; the trainer can then elaborate on it. I guess the answer depends on whether the cluster has a redundancy setup.
Q: If I accidentally delete a file from the cluster, can I recover it?
Based on what we are teaching, the answer would be "No". On the university HPC's working space (called "rds", which we're calling "scratch" on the course), I don't think you can recover a file if you delete it. But I like the question because it lets us discuss how different HPCs may have different kinds of setup.
Q: If one of the hard disks on the cluster breaks and my data is on that disk, what will happen?
I like this one as well; it lets us discuss the difference between redundancy in the storage and a true backup with snapshots that let you "travel back in time" :)
Use this issue to compile questions that could be used in an interactive quiz to discuss best uses of the HPC. These could be used with the first session (intro to HPC) or in the session when we talk about Cambridge HPC more specifically.