This PR modifies knossos.h to support scenarios where allocation must be prohibited entirely, e.g. for a basic version of GPU support.
When allocation is disabled you get a runtime assertion failure if tensor::create is called.
I would have much preferred to make this a compile-time failure. However, this would have meant that prelude.ks could not be compiled. (prelude.ks contains several build calls, including some generated by gdefs.)
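As a rough illustration of the runtime behaviour described above, here is a minimal sketch. It is not the actual code in this PR: the macro `SKETCH_ALLOW_ALLOCATION`, the class `tensor_sketch`, and the helper functions are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical switch for this sketch; the real mechanism in the header may differ.
#ifndef SKETCH_ALLOW_ALLOCATION
#define SKETCH_ALLOW_ALLOCATION 1
#endif

constexpr bool allocation_allowed = (SKETCH_ALLOW_ALLOCATION != 0);

// Stand-in for a tensor type whose create() is the only allocating entry point.
template <class T>
class tensor_sketch {
public:
    static tensor_sketch create(std::size_t n) {
        // Fails at run time, not at compile time, when allocation is disabled.
        assert(allocation_allowed && "create() called while allocation is disabled");
        tensor_sketch t;
        t.data_.assign(n, T{});
        return t;
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// A prelude-style function that allocates. It still compiles when allocation
// is disabled; the assertion only fires if it is actually called.
inline tensor_sketch<double> build_zeros(std::size_t n) {
    return tensor_sketch<double>::create(n);
}

// A prelude-style function that never allocates and is always safe to call.
inline double max_sketch(double a, double b) { return a > b ? a : b; }
```

Because the check lives inside `create`, functions such as `build_zeros` can stay in the same file as `max_sketch`, which is the property that lets a single prelude keep compiling.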
My initial plan was to provide a separate prelude-cuda.ks with the allocating functions removed. Although this works, I found some problems:
Maintaining prelude.ks and prelude-cuda.ks separately would result in a huge amount of duplication. We could split prelude.ks into the parts that require allocation and the parts that do not, but that would destroy the logical structure of the file. (For example, max does not require allocation, but [rev max] does.) Also, if we split prelude.ks up, most users would need to specify both parts of the prelude on the ksc command line, which is awkward. (We don't have any mechanism to include one .ks file from another.)
The initial version of CUDA support won't support allocation, but we may implement it in the future. If we created prelude-cuda.ks now and changed all the callers to use the correct prelude, we'd end up undoing all of that work once allocation is implemented.
An alternative would be for ksc to strip out any unused defs as the final stage of compilation. Then we could raise a compile-time error if any remaining def involved a build (or another allocating function). I could try implementing this if we think it is a good idea (it might also help reduce code size and C++ compile times). However, nearly all of this PR would still be necessary in that case: the only difference would be that uses of tensor::create would cause a compile-time error rather than an assertion failure.
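The alternative amounts to a reachability pass over the defs followed by a check on what survives. The sketch below only illustrates that idea in C++ and does not describe how ksc is implemented: `DefSketch`, `Program`, `reachable_defs`, and `allocating_defs_in_use` are hypothetical names, and each def is assumed to carry a precomputed flag saying whether its body allocates.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one def: the defs it calls, and whether its own
// body contains an allocating construct such as a build.
struct DefSketch {
    std::vector<std::string> callees;
    bool allocates = false;
};

using Program = std::map<std::string, DefSketch>;

// Collect every def reachable from the given roots (the defs the user asked for).
inline std::set<std::string> reachable_defs(const Program& prog,
                                            const std::vector<std::string>& roots) {
    std::set<std::string> seen;
    std::vector<std::string> stack(roots.begin(), roots.end());
    while (!stack.empty()) {
        std::string name = stack.back();
        stack.pop_back();
        auto it = prog.find(name);
        if (it == prog.end()) continue;           // not a def in this program: skip
        if (!seen.insert(name).second) continue;  // already visited
        for (const auto& callee : it->second.callees) stack.push_back(callee);
    }
    return seen;
}

// After unused defs are dropped, any surviving def that allocates would be
// reported as a compile-time error when allocation is prohibited.
inline std::vector<std::string> allocating_defs_in_use(const Program& prog,
                                                       const std::vector<std::string>& roots) {
    std::vector<std::string> offenders;
    for (const auto& name : reachable_defs(prog, roots)) {
        if (prog.at(name).allocates) offenders.push_back(name);
    }
    return offenders;
}
```

With a program containing a non-allocating `max`, an allocating `rev_max`, and a user def that calls only `max`, the offender list is empty for the user def and contains `rev_max` only when `rev_max` itself is requested.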