Open kpet opened 1 year ago
I wonder if there is currently good and handy type conversion function for packed4x8 type in integer_dot_product, like packHalf2x16() unpackHalf2x16() for packed fp16. Shouldn't there be packInt4x8EXT() packUInt4x8EXT() unpackInt4x8EXT() unpackUInt4x8EXT() for packed 8bit integer ?
Existing conversions such as packSnorm4x8() will lose precision due to norm and are not suitable for use in computational tasks.
A common scenario: Some gpu drivers do not support 8bit/16bit storage feature, so cannot read and write with 8bit integer But we can pack 4x8bit into a 32bit and unpack it in shader for calculation, which can significantly reduce the bandwidth overhead of gpu memory
I admit that we could simulate this pack/unpack process with bit shift/or/and operations, but doing so is complex and limited and can be inefficient
I wonder if there is currently good and handy type conversion function for packed4x8 type in integer_dot_product
Currently no, I don't think there is. One could be added but ...
I admit that we could simulate this pack/unpack process with bit shift/or/and operations, but doing so is complex and limited and can be inefficient
I'm not sure why you say that they are limited, so perhaps you have something different in mind, but compilers should already understand that these are pack/unpack operations and be emitting good instruction sequences for them. I don't think we should expect any gain in efficiency from adding functions for this.
Thanks for the fixes. What's the state of implementations on this? Is it ready to be merged now?
Thanks for the fixes. What's the state of implementations on this? Is it ready to be merged now?
Thanks for the review! As far as I'm aware there are two prototypes from two different people, neither of which is complete. Neither author is able to continue work on their implementation. I expect this to be discussed in today's Vulkan ML teleconference and hopefully someone will be able to pick up the work. We could wait until we have a complete implementation to merge just in case implementation work turns up spec issues but I think this spec has been reviewed a good number of times by now so maybe publishing is helpful. I don't feel too strongly either way.
Signed-off-by: Kevin Petit kevin.petit@arm.com Change-Id: Ifc55b44217819cf5c6998eeb4aedcc8d7597e65d