Open nandeeka opened 2 weeks ago
Hi Nandeeka! Thanks for your issue. This is a known bug in the allocated kernel lowering procedure that has been fixed internally. The fix will be available in the next compiler release.
In the meantime, the bug is only triggered when the kernel does not use any PSUM tensor. You should be able to temporarily bypass the issue by declaring a dummy PSUM tensor inside the kernel.
Hi @aws-qieqingy, thanks for getting back to me. I may be declaring this dummy tensor incorrectly, but I am seeing the same error even after adding the following to my kernel.
dummy = nl.ndarray((nl.par_dim(nl.tile_size.pmax), 1), dtype=in_tensor.dtype, buffer=nl.psum)
Thanks!
The PSUM tensor also needs to be allocated with nl.psum.allocate(). Please post the error log if you are still encountering an error after this.
I think the issue was that this buffer was never used, so it was never actually initialized. By using it (e.g., with a nki.isa.memset), I seem to have fixed the problem.
For future reference, how do I produce the error log? The issue with the instructions seems to have been deleted. Thanks!
You can pass --logfile to NEURON_CC_FLAGS to store the log file. Here's a link to the detailed documentation on compiler flags at Neuron RTD: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/compiler/neuronx-cc/api-reference-guide/neuron-compiler-cli-reference-guide.html
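For anyone following along, here is a minimal sketch of one way to set this from Python before the kernel is compiled. The helper name add_compiler_logfile and the file name log-neuron-cc.txt are hypothetical, and the --logfile=<path> spelling is an assumption; please check the CLI reference linked above for the exact form your compiler version accepts.

```python
import os
import shlex


def add_compiler_logfile(log_path, env=None):
    """Append a --logfile option to NEURON_CC_FLAGS and return the new value.

    Hypothetical helper: the `--logfile=<path>` form is assumed, not
    confirmed against a specific neuronx-cc release.
    """
    env = os.environ if env is None else env
    # Keep whatever flags are already set so they are not overwritten.
    existing = env.get("NEURON_CC_FLAGS", "").strip()
    new_flag = f"--logfile={shlex.quote(log_path)}"
    env["NEURON_CC_FLAGS"] = f"{existing} {new_flag}".strip()
    return env["NEURON_CC_FLAGS"]


if __name__ == "__main__":
    # Set the flag in this process before the kernel is first compiled.
    print(add_compiler_logfile("log-neuron-cc.txt"))
```

The environment variable presumably needs to be set before the first kernel invocation triggers compilation, so calling the helper at the top of the script is the safest place.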
I am trying to use the Allocation API to manually allocate tensors in my NKI kernel. Unfortunately, even with a simple kernel that exponentiates every element, I am seeing an error. I have confirmed that the kernel finishes successfully with nki.language.sbuf, but it does not work with nki.isa.sbuf.allocate.
I am getting a runtime error:
Environment: I started with the Neuron 2.20 DLAMI and installed the Allocation API using the .deb and .whl files @aws-serina-tan sent me.
Full Kernel: