Open steven-johnson opened 5 years ago
For --benchmarks=all, it might also be useful to have RunGen be compilable without halide_image_io.h.
From a cursory examination, it appears the problem is compounded by some of the template usage in HalideBuffer.h itself; in particular, for_each_value() instantiates a bunch of templated helper functions which differ if you pass in extra buffers as optional arguments, and halide_image_io::convert_image() instantiates the full cross product of possible buffer types for doing conversions.
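To make the cross-product cost concrete, here is a minimal standalone sketch of the pattern (not Halide's actual code). It assumes a converter that picks a routine from the runtime types of both source and destination. `ElemType`, `convert_elements`, `conversion_table`, `count_conversion_routines`, and `convert` are all hypothetical names invented for illustration. With N element types the compiler has to emit N*N copies of the inner loop, and with debug info on and optimization off each copy is kept in full.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical runtime type tag; the real set of buffer types is larger.
enum class ElemType { U8, U16, F32, Count };
constexpr std::size_t kNumTypes = static_cast<std::size_t>(ElemType::Count);

using ConvertFn = void (*)(const void *src, void *dst, std::size_t n);

// One distinct function is instantiated per (Src, Dst) pair.
template <typename Src, typename Dst>
void convert_elements(const void *src, void *dst, std::size_t n) {
    const Src *s = static_cast<const Src *>(src);
    Dst *d = static_cast<Dst *>(dst);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = static_cast<Dst>(s[i]);
    }
}

// For a fixed source type, instantiate a conversion to every destination type.
template <typename Src>
std::array<ConvertFn, kNumTypes> make_row() {
    return {{&convert_elements<Src, std::uint8_t>,
             &convert_elements<Src, std::uint16_t>,
             &convert_elements<Src, float>}};
}

// Full table: kNumTypes rows of kNumTypes routines each.
inline const std::array<std::array<ConvertFn, kNumTypes>, kNumTypes> &conversion_table() {
    static const std::array<std::array<ConvertFn, kNumTypes>, kNumTypes> table = {{
        make_row<std::uint8_t>(),
        make_row<std::uint16_t>(),
        make_row<float>(),
    }};
    return table;
}

// Number of routines the table forces the compiler to emit.
inline std::size_t count_conversion_routines() {
    std::size_t count = 0;
    for (const auto &row : conversion_table()) {
        for (ConvertFn fn : row) {
            if (fn != nullptr) ++count;
        }
    }
    return count;
}

// Runtime dispatch on both types; this is what pulls in the whole table.
inline void convert(ElemType src_type, ElemType dst_type,
                    const void *src, void *dst, std::size_t n) {
    const std::size_t s = static_cast<std::size_t>(src_type);
    const std::size_t d = static_cast<std::size_t>(dst_type);
    assert(s < kNumTypes && d < kNumTypes);
    conversion_table()[s][d](src, dst, n);
}
```

In this sketch, three element types give nine routines; ten types would give a hundred. One common way to cut that down (again an assumption, not something the thread proposes) is to convert through a single intermediate type, which needs 2N routines instead of N*N at the cost of an extra pass.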
...ugh, and also: RunGenMain.o in the standard Makefile builds with debug info enabled and without optimization enabled.
The likely culprit appears to be rampant template-specialization explosion in halide_image_io.h; RunGenMain.o can be >3MB and take tens of seconds to compile on some systems.