**Open** · TimDettmers opened this issue 3 years ago
The Tesla P40 and P100 have become popular options for homelab AI builds. While I'd completely understand if these Pascal architectures are not a priority, I just wanted to share that their spike in popularity is being driven by the cards reaching the $200 mark (for 24GB of GDDR5 on the P40, or 16GB of HBM2 on the P100).
Given that milestone, and the fact that the P40 in particular has strong native INT8 performance but abysmal FP16 performance, supporting these cards could be a real boost to the broader open-source community, coming from everyone's favorite quantization framework :).
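For context on where the P40's INT8 strength comes from: sm_61 exposes the `__dp4a` intrinsic, a 4-way int8 dot product with 32-bit accumulation. Below is a minimal illustrative sketch of an INT8 dot-product kernel built on it; the kernel name and layout are my own invention, not anything from the bitsandbytes source:

```cuda
#include <cuda_runtime.h>

// Illustrative INT8 dot product. a and b hold packed int8x4 values
// (each 32-bit word carries four int8 lanes); n_words counts those words.
__global__ void dot_int8(const int* a, const int* b, int* out, int n_words) {
    int acc = 0;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n_words;
         i += gridDim.x * blockDim.x) {
        // acc += sum of the four int8*int8 lane products in a[i], b[i]
        acc = __dp4a(a[i], b[i], acc);
    }
    // Integer atomicAdd is supported on all relevant architectures.
    atomicAdd(out, acc);
}
```

Note that `__dp4a` requires compute capability 6.1 or higher, so it covers the P40 (sm_61) but not the P100 (sm_60), which would need a separate fallback path.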
The following tests fail on Pascal:
My guess is this is probably due to `atomicAdd` for floats working differently on Pascal.
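If the failures do trace back to atomics, one likely culprit: `atomicAdd(float*)` has existed since sm_20, but `atomicAdd(__half*)` is only available from sm_70, so Pascal builds of half-precision kernels typically need a CAS-based emulation. A hedged sketch of the common workaround (my own sketch, not the bitsandbytes implementation), using a 32-bit compare-and-swap on the aligned word containing the 16-bit target:

```cuda
#include <cuda_fp16.h>

// Emulated atomicAdd for __half on pre-Volta (sm < 70) parts like the
// P40/P100, where no native 16-bit atomics exist. Works on the aligned
// 32-bit word that contains the target half and retries via atomicCAS.
__device__ __half atomicAddHalf(__half* address, __half val) {
#if __CUDA_ARCH__ >= 700
    return atomicAdd(address, val);  // native half atomicAdd on sm_70+
#else
    unsigned int* base = reinterpret_cast<unsigned int*>(
        reinterpret_cast<size_t>(address) & ~size_t(2));
    bool high = reinterpret_cast<size_t>(address) & 2;  // upper half-word?
    unsigned int old = *base, assumed;
    do {
        assumed = old;
        unsigned short h = high ? (assumed >> 16) : (assumed & 0xffffu);
        unsigned short s =
            __half_as_ushort(__hadd(__ushort_as_half(h), val));
        unsigned int next = high
            ? (assumed & 0x0000ffffu) | (static_cast<unsigned int>(s) << 16)
            : (assumed & 0xffff0000u) | s;
        old = atomicCAS(base, assumed, next);
    } while (assumed != old);  // retry if another thread raced us
    return __ushort_as_half(high ? (old >> 16) : (old & 0xffffu));
#endif
}
```

The 32-bit CAS is needed because 16-bit `atomicCAS` itself only appears on sm_70; on Pascal the only portable option is to read-modify-write the containing word.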