HPCE / hpce-2017-cw5


Lack of compilation/performance runs #47

Closed by m8pple 6 years ago

m8pple commented 6 years ago

I've been having great fun with the auto tests over the past few days: certain implementations seem (I think) to exhaust memory on the machine, which then freezes it. I'm not quite sure how this is happening, as there is an outer memory jail in place, and I haven't been able to diagnose it, or to strengthen the existing jail or add internal memory jails well enough to stop it happening. Another possibility is that the GPU driver is somehow taking out the machine. This happens a fair amount on consumer devices, but I haven't seen the AWS GPUs take out the machine before.
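For readers unfamiliar with the term, one common way to build a per-process "memory jail" on Linux is to cap the child's address space before it starts. The sketch below is an assumption about what such a jail could look like, not the one used by the auto tests; `run_jailed` and its parameters are hypothetical names. Note that a limit of this kind does not constrain GPU device memory, and GPU runtimes may reserve large virtual address ranges, so it may not help with (and can even interfere with) GPU code.

```python
import resource
import subprocess


def run_jailed(cmd, max_bytes, timeout=None):
    """Run cmd with its virtual address space capped at max_bytes (Unix only).

    Hypothetical helper for illustration; not the jail used by the test scripts.
    """

    def limit_address_space():
        # Runs in the child process after fork() and before exec().
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never try to raise the limit above an existing hard limit.
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    # Allocations beyond the cap fail inside the child instead of
    # pushing the whole machine into swap.
    return subprocess.run(cmd, preexec_fn=limit_address_space, timeout=timeout)
```

A call such as `run_jailed(["./some_implementation"], 4 << 30, timeout=600)` (path and numbers made up) would give the child at most 4 GiB of address space and ten minutes of wall-clock time.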

Anyway, I've fallen back on adding logic to skip any implementation for which a test run has already started. Every time the machine freezes, I restart the script and it carries on from just after the place it left off. This may take a while, as sometimes the machine gets into a mode where you can't stop it from the Amazon console (yay - watch the money burn on a machine you can't log into!), so you have to terminate the instance and then rebuild it.
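The skip-if-already-started idea can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual test script: `run_pending`, `state_dir`, and `run_one` are hypothetical names, and implementation names are assumed to be plain strings that are safe to use as file names. The key point is that the marker is written and flushed to disk before the run starts, so an implementation that freezes the machine is not retried after a reboot.

```python
import os
from pathlib import Path


def run_pending(implementations, state_dir, run_one):
    """Run each implementation at most once, even across script restarts.

    Returns the list of names that were run to completion in this invocation.
    """
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    completed = []
    for name in implementations:
        marker = state_dir / f"{name}.started"
        if marker.exists():
            # An earlier invocation already started this one (and may have
            # frozen the machine while doing so), so do not run it again.
            continue
        # Record the start BEFORE running, and force it to disk so the
        # marker is still there if the machine locks up mid-run.
        with open(marker, "w") as f:
            f.write("started\n")
            f.flush()
            os.fsync(f.fileno())
        run_one(name)
        completed.append(name)
    return completed
```

The trade-off is that an implementation interrupted for an unrelated reason is also skipped on the next pass; deleting its marker file would put it back in the queue.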