Open sycz00 opened 1 month ago
These do not exist because there is no native tensor core instruction for int16 inputs. You can take multiple paths here:
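One path sometimes used in this situation (sketched here as an illustration, not an official recommendation; the function name is hypothetical) is to emulate the int16 GEMM on top of 8-bit multiplies: split each int16 operand into a high signed byte and a low unsigned byte, run four partial matrix products, and recombine them with the appropriate scaling. A minimal numpy sketch of the arithmetic:

```python
import numpy as np

def int16_gemm_via_bytes(A, B):
    """Emulate an int16 x int16 -> wide-accumulator GEMM with byte-sized
    sub-operands. Accumulation is done in int64 here to avoid overflow in
    this reference sketch; on real hardware each partial product would map
    to an 8-bit (s8/u8) tensor core MMA with int32 accumulation."""
    A = A.astype(np.int64)
    B = B.astype(np.int64)
    # Decompose x == 256 * x_hi + x_lo, with x_lo in [0, 255] and x_hi signed.
    A_lo = A & 0xFF
    A_hi = (A - A_lo) // 256
    B_lo = B & 0xFF
    B_hi = (B - B_lo) // 256
    # (256*Ah + Al)(256*Bh + Bl) = 65536*Ah*Bh + 256*(Ah*Bl + Al*Bh) + Al*Bl
    return (A_hi @ B_hi) * 65536 + (A_hi @ B_lo + A_lo @ B_hi) * 256 + (A_lo @ B_lo)

# The decomposition reproduces the exact wide-integer result:
A = np.array([[1, -300], [32767, -32768]], dtype=np.int16)
B = np.array([[-5, 7], [11, -13]], dtype=np.int16)
assert np.array_equal(int16_gemm_via_bytes(A, B),
                      A.astype(np.int64) @ B.astype(np.int64))
```

The trade-off is that one int16 GEMM becomes four narrower GEMMs (plus recombination), so it is only a win if the 8-bit tensor core path is sufficiently faster than the integer-ALU alternative.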
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
What is your question? Hey folks,
first of all, thanks for this framework. I recently implemented CUDA kernels for a linear layer and a 2D convolution layer using int8 arithmetic, accumulating into int32 tensors. However, implementing a similar kernel with int16 inputs and weights, accumulating into int32/int64, does not work: I found that there is no available kernel configuration for such an operation. Do you think this could be added somehow, or do you know a trick or workaround to accomplish this?
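For context, here is a numpy sketch of the pattern the working int8 kernels compute (the function name is mine, just to illustrate int8 inputs with int32 accumulation):

```python
import numpy as np

def int8_linear(x, w):
    """Reference for the working path: int8 inputs/weights with products
    accumulated into an int32 output tensor, as an s8*s8+s32 tensor core
    MMA would do. Casting up before the matmul makes numpy accumulate
    in int32 instead of overflowing in int8."""
    return x.astype(np.int32) @ w.astype(np.int32)

x = np.array([[127, -128], [3, 5]], dtype=np.int8)
w = np.array([[2, -1], [4, 6]], dtype=np.int8)
y = int8_linear(x, w)
assert y.dtype == np.int32
```

The int16 analogue of this (int16 inputs, int32/int64 accumulator) is exactly the configuration that appears to be missing.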
Thanks for your help!