Closed by mcourteaux 2 weeks ago
The for_each_value stuff, at the time it was written at least, autovectorizes nicely so it's a lot more efficient than it looks.
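To illustrate the point about autovectorization, here is a minimal, self-contained sketch of an element-wise strided copy. It is not Halide's implementation: `View2D`, `copy_elementwise`, and `demo_padded_copy` are hypothetical names made up for this example. The idea is only that when the innermost strides are 1, the inner loop is a plain contiguous copy, which compilers can typically vectorize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D view (not a Halide type): a pointer plus per-dimension
// extents and strides, both measured in elements.
struct View2D {
    float *data;
    int extent[2];
    std::ptrdiff_t stride[2];
};

// Element-wise copy from src to dst. When both innermost strides are 1,
// the inner loop is a contiguous copy that compilers can usually vectorize.
inline void copy_elementwise(const View2D &src, const View2D &dst) {
    assert(src.extent[0] == dst.extent[0] && src.extent[1] == dst.extent[1]);
    for (int y = 0; y < dst.extent[1]; y++) {
        const float *s = src.data + y * src.stride[1];
        float *d = dst.data + y * dst.stride[1];
        for (int x = 0; x < dst.extent[0]; x++) {
            d[x * dst.stride[0]] = s[x * src.stride[0]];
        }
    }
}

// Usage: copy a dense 4x3 image into a destination whose rows are padded
// to 5 elements. Padding elements keep their initial value of -1.
inline std::vector<float> demo_padded_copy() {
    std::vector<float> a(12), b(15, -1.0f);
    for (int i = 0; i < 12; i++) a[i] = static_cast<float>(i);
    View2D src{a.data(), {4, 3}, {1, 4}};
    View2D dst{b.data(), {4, 3}, {1, 5}};
    copy_elementwise(src, dst);
    return b;
}
```

Whether a given compiler actually vectorizes the inner loop depends on flags and aliasing assumptions, so this is best checked in the generated assembly.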
Are you using Halide as a JIT compiler? If you're using it as an AOT compiler that function should be linked. If it has been dead-stripped, we may be annotating it incorrectly. In general the global runtime functions are only available from JIT via methods on Internal::JITSharedRuntime. Unfortunately this is both in the internal namespace and is just a few memoization cache functions right now.
The device API functions are available from JIT-land via the functions in DeviceInterface.h.
Halide::Runtime::Buffer may be used from JIT-land or AOT-land, so unfortunately it can't depend on things in the runtime other than those accessed via the device interface pointer.
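As a rough sketch of why going through an interface pointer sidesteps the linking problem: header-only code can call whatever function pointers the buffer carries, without naming any runtime symbol directly. Everything below (`ToyInterface`, `ToyBuffer`, `toy_runtime_copy`, `copy_via_interface`) is hypothetical and far simpler than Halide's real structs. It only shows the dispatch pattern, with a plain loop as the fallback when no interface is attached.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct ToyBuffer;

// Hypothetical stand-in for a device interface: a table of function
// pointers supplied by whichever runtime (JIT or AOT) created the buffer.
struct ToyInterface {
    int (*buffer_copy)(const ToyBuffer *src, ToyBuffer *dst);
};

struct ToyBuffer {
    unsigned char *host;
    std::size_t size_in_bytes;
    const ToyInterface *iface;  // null when no runtime is attached
};

// "Runtime-side" implementation. The header-side code below never
// references this symbol by name.
inline int toy_runtime_copy(const ToyBuffer *src, ToyBuffer *dst) {
    if (src->size_in_bytes != dst->size_in_bytes) return -1;
    std::memcpy(dst->host, src->host, src->size_in_bytes);
    return 0;
}

// "Header-side" code: dispatch through the pointer if one is present,
// otherwise fall back to a simple byte-wise loop.
inline int copy_via_interface(const ToyBuffer &src, ToyBuffer &dst) {
    assert(src.host && dst.host);
    if (dst.iface && dst.iface->buffer_copy) {
        return dst.iface->buffer_copy(&src, &dst);
    }
    if (src.size_in_bytes != dst.size_in_bytes) return -1;
    for (std::size_t i = 0; i < src.size_in_bytes; i++) {
        dst.host[i] = src.host[i];
    }
    return 0;
}
```

The limitation is visible in the fallback branch: a buffer with no interface attached gives the header nothing to call into, so it has to carry its own copy loop.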
Okay, so the answer is: not possible. Thanks for the quick reply!
I'm looking into an optimization that involves calling halide_buffer_copy(). This function is part of the runtime and is defined in the device_interface.cpp file, which gets processed by the LLVM_Runtime_Linker.

However, this function is not always available: the runtime is not always linked at link time, and may only be produced later, when JIT-ing. Is there a nice way to call some of the common "Universal CPP Initmods" functions from within Halide::Runtime::Buffer? It's somewhat counterintuitive that the runtime is not available from within one of the runtime headers.

To avoid an A-B problem, what I was trying to do is replace all of this in Halide::Runtime::Buffer::copy_from(): https://github.com/halide/Halide/blob/9ba1829297bba557f25fb422518252144ccfa225/src/runtime/HalideBuffer.h#L1432-L1454 with one simple call to halide_buffer_copy(), which is actually a quite-optimized procedure for this task (as opposed to the for_each_value([&]....) stuff):