NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

[QST] Integer 16 support #1841

Open sycz00 opened 1 month ago

sycz00 commented 1 month ago

What is your question? Hey folks,

first of all, thanks for this framework. I recently implemented CUDA kernels for a linear layer and a 2D convolutional layer using int8 arithmetic, with the results stored in int32 tensors. However, when I tried to implement a similar kernel with int16 inputs and weights, accumulating into int32/int64 outputs, it did not work: I found that there is no available kernel configuration for such an operation. Do you think this could be added, or do you know a trick or workaround to accomplish this?

Thanks for your help!
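
For context, here is a minimal sketch of the kind of int8 × int8 → int32 GEMM the question describes, written against the classic CUTLASS 2.x device-level `Gemm` API. The layouts, architecture tag, and leading dimensions are illustrative assumptions, not taken from the original post, and may need adjustment for your CUTLASS version and GPU:

```cpp
#include <cutlass/gemm/device/gemm.h>

// int8 inputs with int32 accumulation, targeting Ampere tensor cores.
// A row-major / B column-major is the layout pair the int8 tensor-core
// path expects in the classic 2.x API; K typically needs to satisfy the
// kernel's alignment requirements (a multiple of 16 for int8).
using GemmS8 = cutlass::gemm::device::Gemm<
    int8_t,  cutlass::layout::RowMajor,     // ElementA, LayoutA
    int8_t,  cutlass::layout::ColumnMajor,  // ElementB, LayoutB
    int32_t, cutlass::layout::RowMajor,     // ElementC, LayoutC
    int32_t,                                // ElementAccumulator
    cutlass::arch::OpClassTensorOp,         // use tensor cores
    cutlass::arch::Sm80>;                   // target architecture

cutlass::Status run_int8_gemm(int M, int N, int K,
                              int8_t const* A, int8_t const* B, int32_t* C) {
  GemmS8 gemm_op;
  GemmS8::Arguments args({M, N, K},
                         {A, K},   // lda = K (row-major A)
                         {B, K},   // ldb = K (column-major B)
                         {C, N},   // source C, ldc = N
                         {C, N},   // destination D
                         {1, 0});  // alpha = 1, beta = 0
  return gemm_op(args);            // initializes and launches the kernel
}
```

No int16 analogue of this configuration exists, which is what the question is asking about.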

thakkarV commented 1 month ago

These configurations do not exist because there is no native tensor core instruction for int16 inputs. You can take multiple paths here:
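
The concrete paths are not preserved in this excerpt. Purely as an illustration of one workaround that is sometimes discussed (an assumption on my part, not necessarily one of the paths meant above): since int8 tensor core instructions do exist, an int16 multiply can be emulated by splitting each operand into a signed high byte and an unsigned low byte and combining four partial products with the appropriate weights. The standalone C++ below only checks the arithmetic identity at scalar level; mapping it onto int8 tensor-core GEMMs, and using a wide enough accumulator over the reduction dimension, is left out.

```cpp
#include <cassert>
#include <cstdint>

// Identity behind the emulation: a = a_hi * 256 + a_lo and
// b = b_hi * 256 + b_lo, with signed high bytes and unsigned low bytes, so
// a * b = a_hi*b_hi * 65536 + (a_hi*b_lo + a_lo*b_hi) * 256 + a_lo*b_lo.
int64_t int16_mul_via_int8_parts(int16_t a, int16_t b) {
  int32_t a_hi = a >> 8;    // signed high byte (arithmetic shift), range [-128, 127]
  int32_t a_lo = a & 0xFF;  // unsigned low byte, range [0, 255]
  int32_t b_hi = b >> 8;
  int32_t b_lo = b & 0xFF;

  int64_t hh = int64_t(a_hi) * b_hi;  // weight 2^16
  int64_t hl = int64_t(a_hi) * b_lo;  // weight 2^8
  int64_t lh = int64_t(a_lo) * b_hi;  // weight 2^8
  int64_t ll = int64_t(a_lo) * b_lo;  // weight 2^0

  return hh * 65536 + (hl + lh) * 256 + ll;
}

int main() {
  // Spot-check the identity against a direct widened multiply.
  for (int a = -32768; a < 32768; a += 257) {
    for (int b = -32768; b < 32768; b += 263) {
      assert(int16_mul_via_int8_parts(int16_t(a), int16_t(b)) ==
             int64_t(a) * int64_t(b));
    }
  }
  return 0;
}
```

In a real kernel this turns one int16 GEMM into several int8-style GEMMs (the hh, hl, lh, and ll terms), so it trades the missing precision for extra work; widening the inputs and falling back to a non-tensor-core path would be a simpler but slower alternative.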

github-actions[bot] commented 4 weeks ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.