halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

Q: How do I call runtime functions from Halide::Runtime::Buffer? #8462

Closed mcourteaux closed 2 weeks ago

mcourteaux commented 2 weeks ago

I'm looking into an optimization that involves calling halide_buffer_copy(). This function is part of the runtime and is defined in the device_interface.cpp file, which gets processed by the LLVM_Runtime_Linker.

However, this function is not always available: the runtime is not always linked at link time, and may only be produced later, when JIT-ing. Is there a nice way to call some of the common "Universal CPP Initmods" functions from within Halide::Runtime::Buffer? It's somewhat counterintuitive that the runtime is not available from within one of the runtime headers.


To avoid an A-B problem, what I was trying to do is replace all of this in Halide::Runtime::Buffer::copy_from(): https://github.com/halide/Halide/blob/9ba1829297bba557f25fb422518252144ccfa225/src/runtime/HalideBuffer.h#L1432-L1454 with one simple call to halide_buffer_copy(), which is actually quite an optimized procedure for this task (as opposed to the for_each_value([&]....) stuff):

halide_buffer_copy(nullptr, src, dst.buf.device_interface, dst);

abadams commented 2 weeks ago

The for_each_value stuff, at the time it was written at least, autovectorizes nicely so it's a lot more efficient than it looks.
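To illustrate the general point (this is not Halide's actual for_each_value implementation), here is a minimal sketch of a per-element copy over two strided 2-D views. View2D and copy_elementwise are hypothetical names invented for this example. The idea is only that when the innermost dimension is dense in both source and destination, the inner loop is a plain contiguous copy, which compilers can often vectorize on their own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D buffer view (NOT Halide's halide_buffer_t).
struct View2D {
    float *data;
    int width, height;
    std::ptrdiff_t stride_x, stride_y;  // strides measured in elements
};

// Copy src into dst one element at a time. Both views must have the same extents.
inline void copy_elementwise(const View2D &src, const View2D &dst) {
    assert(src.width == dst.width && src.height == dst.height);
    for (int y = 0; y < dst.height; y++) {
        const float *s = src.data + y * src.stride_y;
        float *d = dst.data + y * dst.stride_y;
        if (src.stride_x == 1 && dst.stride_x == 1) {
            // Dense innermost dimension: a simple contiguous loop that an
            // optimizing compiler is typically able to vectorize.
            for (int x = 0; x < dst.width; x++) {
                d[x] = s[x];
            }
        } else {
            // General strided case: still correct, but harder to vectorize.
            for (int x = 0; x < dst.width; x++) {
                d[x * dst.stride_x] = s[x * src.stride_x];
            }
        }
    }
}
```

Note that the destination rows may be padded (stride_y larger than width) without affecting the dense fast path, since only the innermost stride matters for the inner loop.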

Are you using Halide as a JIT compiler? If you're using it as an AOT compiler that function should be linked. If it has been dead-stripped, we may be annotating it incorrectly. In general the global runtime functions are only available from JIT via methods on Internal::JITSharedRuntime. Unfortunately this is both in the internal namespace and is just a few memoization cache functions right now.

The device API functions are available from JIT-land via the functions in DeviceInterface.h.

Halide::Runtime::Buffer may be used from JIT-land or AOT-land, so unfortunately it can't depend on things in the runtime other than those accessed via the device interface pointer.
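The constraint can be pictured with a small sketch. Everything here (MockBuffer, MockDeviceInterface, mock_runtime_buffer_copy, copy_from, g_runtime_calls) is a hypothetical stand-in written for illustration, not Halide's real types or functions. It shows the pattern of a header-only helper that never names a runtime symbol directly: it can only reach runtime code through a function-pointer table carried by the buffer, and otherwise has to do the work itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct MockBuffer;

// Hypothetical stand-in for a device interface: a table of function pointers
// that is filled in by whichever runtime (JIT or AOT) created the buffer.
struct MockDeviceInterface {
    int (*buffer_copy)(const MockBuffer *src, MockBuffer *dst);
};

// Hypothetical stand-in for a buffer that carries an optional interface pointer.
struct MockBuffer {
    unsigned char *host;
    std::size_t size_in_bytes;
    const MockDeviceInterface *device_interface;  // may be nullptr
};

// Counts calls into the mock "runtime" so the dispatch path is observable.
static int g_runtime_calls = 0;

// Plays the role of a function that lives inside the runtime.
inline int mock_runtime_buffer_copy(const MockBuffer *src, MockBuffer *dst) {
    g_runtime_calls++;
    if (src->size_in_bytes != dst->size_in_bytes) return -1;
    std::memcpy(dst->host, src->host, dst->size_in_bytes);
    return 0;
}

// Header-only copy: it has no link-time reference to the runtime function.
// It uses the pointer table if the buffer has one, else a plain loop.
inline int copy_from(const MockBuffer &src, MockBuffer &dst) {
    assert(src.size_in_bytes == dst.size_in_bytes);
    if (dst.device_interface && dst.device_interface->buffer_copy) {
        return dst.device_interface->buffer_copy(&src, &dst);
    }
    for (std::size_t i = 0; i < dst.size_in_bytes; i++) {
        dst.host[i] = src.host[i];
    }
    return 0;
}
```

Because copy_from only dereferences a pointer stored in the buffer, it compiles and links the same way whether or not a runtime is present at link time, which is the property described above.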

mcourteaux commented 2 weeks ago

Okay, so the answer is: not possible. Thanks for the quick reply!