Closed by Munksgaard 3 years ago
Does this happen only if all datasets cause all tuning runs to fail, or also if just some of them do? In the former case I don't see how we can produce meaningful results at all. In the latter case, what should we do? Treat failing runs as having infinite runtime?
For context, what we currently do is this: for each combination of dataset and threshold, we first do a sampling run with the threshold set to infinity. If that sampling run fails, we return infinity as the upper bound of the threshold interval. This means that if all datasets cause the code to run out of memory when a specific threshold is set to infinity, the final threshold reported will be infinity. Instead, we should probably return 1 (the lowest valid threshold value), so that we choose the other code version in question. Hopefully that will not cause the code to run out of memory.
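To make the difference concrete, here is a minimal sketch of the failure case only. It is not the actual futhark autotune code: the function name, the use of None to mark a failed sampling run, and the list of three datasets are all made up for illustration. It only shows which upper bound each dataset ends up with when every sampling run fails, under the current behaviour (infinity) and the proposed one (1).

```python
import math
from typing import List, Optional

# The lowest valid threshold value, as mentioned in the comment above.
LOWEST_VALID_THRESHOLD = 1


def upper_bound_after_sampling(sampled_bound: Optional[float],
                               fallback: float) -> float:
    """Hypothetical helper: upper bound for one (dataset, threshold) pair.

    sampled_bound is None when the sampling run (threshold set to infinity)
    failed, e.g. by running out of memory. Otherwise it stands in for
    whatever bound the tuner derived from the successful run.
    """
    return fallback if sampled_bound is None else sampled_bound


# Assumed scenario: three datasets, and every sampling run fails.
all_failed: List[Optional[float]] = [None, None, None]

# Current behaviour: a failed sampling run yields infinity.
current = [upper_bound_after_sampling(b, math.inf) for b in all_failed]

# Proposed behaviour: a failed sampling run yields the lowest valid threshold.
proposed = [upper_bound_after_sampling(b, LOWEST_VALID_THRESHOLD)
            for b in all_failed]

print(current)
print(proposed)
```

With the current behaviour every bound is infinity, so nothing steers the tuner away from the code version that failed. With the proposed fallback every bound is 1, which selects the other code version.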
As described here, futhark autotune will produce wrong autotuning results if all datasets cause the tuning runs to fail (e.g. by running out of memory). Thanks to Kristian Bøjer Andreasen for pointing this out.