halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

In Adams2019, training on a mixture of samples from different apps silently gives bad results #7086

Open abadams opened 2 years ago

abadams commented 2 years ago

The retrain_cost_model binary in Adams2019 relies on the pipeline_id field to avoid re-reading pipeline features from subsequent samples. This speeds up loading a large number of samples. However, the autotuning script just sets pipeline_id to zero, unless you have multiple generator parameter sets. So if you try to train on samples from multiple pipelines generated with autotune_loop.sh without first doing a binary patch of the pipeline id field, it will treat the schedule features as if they all belong to a single pipeline. This forces the cost model to ignore pipeline features and to try to predict performance from schedule features alone. Even worse, during training, if that single pipeline has more pipeline stages than the current sample, then it will feed the neural network uninitialized memory when it reads off the end of the schedule features array.

Annoyingly, neural networks are really good at working around this sort of garbage, so you get something that still trains. It just produces a somewhat crappy cost model.

abadams commented 2 years ago

Proposed solution: retrain_cost_model should at least hash the pipeline features of every loaded sample to verify that all samples with a given pipeline id really do have identical pipeline features. The autotune_loop.sh script should take a mandatory base pipeline id as an argument.