For modeling, we need a data set that spans the dimensionality of the parameters we assume impact the runtime and footprint of ParFlow.
Currently we only have an understanding of performance as X and Y are varied, with Z, the number of time-steps, and the process topology (1,1,1) held constant.
We believe that many (possibly all) of these parameters are important factors in ParFlow runtime and footprint, but so far have little intuition or actual supporting data for that.
We need to sample the product of reasonable problem configurations.
Here, reasonable means that an individual parameter's value would not be unexpected by ParFlow users.
An example of this is our previous assumption that Z is in {5, 10}.
Part of this involves deciding what reasonable values for the other parameters are.
I suggest that X and Y are mostly free of restrictions, though each should have a reasonable upper bound.
Z should stay in {5, 10}.
Previous evaluations determined that NQ (processes in Z) should be 1, since Z is small enough that partitioning along it makes little sense.
Previous evaluations also suggest sweeping X over the widest range and Y over the narrowest, but this needs to be reviewed, as it may depend on the size configuration.
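To make the sweep concrete, here is a minimal sketch of how the configuration grid could be enumerated. All specific values below (the X/Y sweeps, time-step counts, and candidate process topologies) are placeholder assumptions for illustration, not agreed bounds:

```python
import itertools

# Placeholder sweeps -- illustrative assumptions, not agreed values.
X_VALUES = (100, 200, 400, 800)     # mostly unrestricted; assumed upper bound
Y_VALUES = (100, 200, 400)          # narrower sweep than X, per the note above
Z_VALUES = (5, 10)                  # per the restriction above
TIMESTEPS = (10, 100)               # previously held constant; assumed sweep
PROC_XY = ((1, 1), (2, 1), (2, 2))  # assumed X/Y process counts

def configurations():
    """Yield the cross product of reasonable problem configurations."""
    for x, y, z, ts, (px, py) in itertools.product(
            X_VALUES, Y_VALUES, Z_VALUES, TIMESTEPS, PROC_XY):
        # NQ (processes in Z) is fixed to 1: Z is too small to partition.
        yield {"X": x, "Y": y, "Z": z, "timesteps": ts,
               "topology": (px, py, 1)}

if __name__ == "__main__":
    print(sum(1 for _ in configurations()), "runs in the sweep")
```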
Deliverable:
Data package:
Raw outputs from each run.
Collected data in JSON format:
Runtime as measured by time or /bin/time (either is fine).
Per-rank memory as reported by the amps memory allocator.
Note: a true OS-level view of memory occupation is challenging here; see issue #7.
Information about the ParFlow software stack.
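As a sketch of the collected-data format, one record per run might look like the following. The field names and example values are assumptions for illustration, not a fixed schema:

```python
import json

# Hypothetical record shape -- field names and values are assumptions.
record = {
    "config": {"X": 400, "Y": 200, "Z": 10, "timesteps": 100,
               "topology": [2, 2, 1]},
    "runtime_seconds": 123.4,            # from time or /bin/time
    "per_rank_memory_bytes": [52428800,  # as reported by the amps memory
                              52170752,  # allocator, one entry per rank
                              52301824,
                              52254720],
    "software_stack": {"parflow_version": "3.13.0",  # example values
                       "compiler": "gcc 12.2.0",
                       "mpi": "openmpi 4.1.5"},
}
print(json.dumps(record, indent=2))
```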
Done when:
The data package is uploaded here (and likely mirrored elsewhere).
The data is sufficient to provide the model with the accuracy discussed in #5.
Needed for #5; blocked by #1.