Questions about unimelb servers

wfmackey commented 4 years ago

@HughParsonage some questions about running the model on the unimelb servers:

1. What is the estimated time to run the model ~50,000 times?
2. How long does it take for you to set up/start a run?
3. Is there any sort of usage limit for us on the servers?
4. How long do we have access to the servers for?

Thanks!

HughParsonage commented 4 years ago

Unfortunately I introduced a performance regression for the last run (just done!), which I believe I have now fixed. The regression was a side effect of #45, where I made sure that changes to .first_day could invalidate the cache. The invalidation was too eager: it cleared the cache on every run, which quadrupled the runtime. (The model had to read in the australia data on each run instead of holding it in memory between runs.)

1. I'd say about 4000/hr, so roughly 13 hrs.
2. Every run has been bespoke so far, though I'm trying to carry over lessons from the previous run. The challenges are load balancing between the servers, synchronization, and maximum utilization of the resources. Getting started takes about 5 minutes, but that assumes I've correctly balanced the load from the outset and haven't made other noob errors.
3. Limited to 8 instances and 512 cores.
4. Till August.
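
The runtime estimate is simple arithmetic on the figures quoted in this thread (about 50,000 runs at about 4000 runs per hour). A short Python sketch follows; the function name `estimate_hours` and its `slowdown` parameter are hypothetical, and the fourfold slowdown is applied only to show what a regression of that size would do to the same job.

```python
import math


def estimate_hours(runs, runs_per_hour, slowdown=1.0):
    """Wall-clock hours for `runs` model runs at a given throughput.

    `slowdown` is an illustrative multiplier on runtime (1.0 = no regression).
    """
    return runs / runs_per_hour * slowdown


hours = estimate_hours(50_000, 4_000)
print(hours)                                      # 12.5
print(math.ceil(hours))                           # 13
print(estimate_hours(50_000, 4_000, slowdown=4))  # 50.0
```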

Both 3 and 4 are negotiable, but obviously we would need to make a case. I've been able to make minor adjustments along the way without any problem (e.g. I asked for 512 cores but only 4 instances; it was not possible to reach 512 cores with only 4 instances, so I asked for an increase to the instance limit).
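
As a rough illustration of the load-balancing problem, the Python sketch below splits a batch of runs across instances in proportion to their core counts. It is not how the runs are actually scheduled: the function `split_runs` is hypothetical, and the even layout of 64 cores on each of 8 instances is an assumption made only so the totals match the 8-instance, 512-core limit.

```python
def split_runs(total_runs, cores_per_instance):
    """Allocate runs to instances in proportion to their core counts."""
    total_cores = sum(cores_per_instance)
    # Integer share for each instance, rounded down.
    shares = [total_runs * c // total_cores for c in cores_per_instance]
    # Hand out the few runs lost to rounding, one per instance.
    remainder = total_runs - sum(shares)
    for i in range(remainder):
        shares[i] += 1
    return shares


# Assumed layout: 8 instances with 64 cores each (512 cores in total).
even = split_runs(50_000, [64] * 8)
print(even)        # [6250, 6250, 6250, 6250, 6250, 6250, 6250, 6250]

# An uneven, equally hypothetical layout that also sums to 512 cores.
uneven = split_runs(50_000, [96, 96, 64, 64, 64, 64, 32, 32])
print(sum(uneven)) # 50000
```

A proportional split like this only balances the load if every run takes about the same time per core; otherwise a shared work queue that instances pull from would be the more robust choice.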

wfmackey commented 4 years ago

Okay that's great. It would be possible to update the app's data relatively often (maybe once every two days?).

HughParsonage commented 4 years ago

Yep. I'll try to think of an automated way to provide grids.

grattan / covid19.model.sa2

Questions about unimelb servers #48