abadams opened this issue 2 years ago (Open)
Proposed solution: retrain_cost_model should at least hash the pipeline features of every loaded sample, to verify that all samples with a given pipeline id really do carry the same pipeline features. The autotune_loop.sh script should also take a mandatory base pipeline id as an argument.
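A minimal sketch of such a check, under assumed names: the `Sample` struct, its fields, and `check_sample` are hypothetical, and the real loader in retrain_cost_model is structured differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <vector>

struct Sample {
    int32_t pipeline_id;
    std::vector<float> pipeline_features;
    std::vector<float> schedule_features;
};

// FNV-1a over the raw feature bytes; any stable hash would do.
uint64_t hash_features(const std::vector<float> &f) {
    uint64_t h = 1469598103934665603ULL;
    const uint8_t *p = reinterpret_cast<const uint8_t *>(f.data());
    for (size_t i = 0; i < f.size() * sizeof(float); i++) {
        h = (h ^ p[i]) * 1099511628211ULL;
    }
    return h;
}

// Called once per loaded sample. Fails loudly if two samples that
// claim the same pipeline id carry different pipeline features.
void check_sample(const Sample &s, std::map<int32_t, uint64_t> &seen) {
    const uint64_t h = hash_features(s.pipeline_features);
    auto result = seen.emplace(s.pipeline_id, h);
    if (!result.second && result.first->second != h) {
        fprintf(stderr,
                "Pipeline id %d is shared by samples with different "
                "pipeline features; was the id left at its default?\n",
                s.pipeline_id);
        abort();
    }
}
```

With a check like this in place, mixing samples from different pipelines that all claim id 0 fails at load time instead of silently corrupting training.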
The retrain_cost_model binary in Adams2019 relies on the pipeline_id field to avoid re-reading pipeline features from subsequent samples, which speeds up loading a large number of samples. However, unless you have multiple generator parameter sets, the autotuning script just sets pipeline_id to zero. So if you try to train on samples from multiple pipelines generated with autotune_loop.sh without first binary-patching the pipeline id field, retrain_cost_model treats all of the samples as if they belong to a single pipeline. That forces the cost model to ignore the pipeline features and to try to predict performance from the schedule features alone. Even worse, during training, if the single cached pipeline has more stages than the pipeline the current sample actually came from, reading off the end of that sample's schedule features array feeds the neural network uninitialized memory.
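For concreteness, here is a hedged sketch of that read pattern (hypothetical names again, not the actual loader code): the loop bound comes from the cached pipeline rather than from the sample, so a stage-count mismatch walks past the end of the sample's schedule features.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the cached per-pipeline state that the
// pipeline_id check is meant to guard.
struct PipelineRecord {
    size_t num_stages;            // stage count of the *cached* pipeline
    std::vector<float> features;  // pipeline features, read once
};

float accumulate_features(const PipelineRecord &cached,
                          const std::vector<float> &schedule_features,
                          size_t features_per_stage) {
    float acc = 0;
    // The loop bound is taken from the cached pipeline, not from the
    // sample. If the cached pipeline has more stages than the sample's
    // true pipeline, the index runs off the end of schedule_features;
    // with a raw array this reads uninitialized memory, which is then
    // fed to the network.
    for (size_t s = 0; s < cached.num_stages; s++) {
        for (size_t f = 0; f < features_per_stage; f++) {
            acc += schedule_features[s * features_per_stage + f];  // possible OOB read
        }
    }
    return acc;
}
```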
Annoyingly, neural networks are really good at working around this sort of garbage, so you get something that still trains. It just produces a somewhat crappy cost model.