Open Fred-cell opened 9 months ago
As we discussed offline, INT4 may not be suitable for small models like Whisper tiny and base. INT5 (sym or asym) might be a good alternative considering the wtf and WER. Please check whether INT5 meets your requirements.
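To illustrate why one extra bit can matter, here is a minimal sketch that round-trips a synthetic weight vector through 4-bit and 5-bit uniform grids, in both symmetric and asymmetric form. It assumes simple per-tensor min/max scaling and Gaussian random weights; it is not the quantization scheme used by any particular library, and the helper names (`quantize_dequantize`, `rms_error`) are hypothetical.

```python
import numpy as np


def quantize_dequantize(w, bits, symmetric=True):
    """Round-trip `w` through a uniform integer grid of the given bit width."""
    if symmetric:
        # Symmetric: grid is centered on zero, range [-qmax, qmax].
        qmax = 2 ** (bits - 1) - 1
        scale = np.max(np.abs(w)) / qmax
        q = np.clip(np.round(w / scale), -qmax, qmax)
        return q * scale
    # Asymmetric: grid spans [min(w), max(w)] using all 2**bits levels.
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / levels
    q = np.clip(np.round((w - lo) / scale), 0, levels)
    return q * scale + lo


def rms_error(w, bits, symmetric):
    """Root-mean-square reconstruction error after the round trip."""
    w_hat = quantize_dequantize(w, bits, symmetric)
    return float(np.sqrt(np.mean((w - w_hat) ** 2)))


# Synthetic stand-in for a weight tensor (assumption: NOT real Whisper weights).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=4096)

errors = {
    (bits, mode): rms_error(w, bits, symmetric=(mode == "sym"))
    for bits in (4, 5)
    for mode in ("sym", "asym")
}

for (bits, mode), err in sorted(errors.items()):
    print(f"int{bits} {mode:<4} RMS error: {err:.6f}")
```

On this toy input the 5-bit grids give a smaller reconstruction error than the 4-bit ones, since the step size roughly halves. Whether that translates into an acceptable WER for Whisper tiny/base still has to be measured on the real models.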
Fred will test INT5/FP8 and give feedback.
The current issue is the precision of Whisper with INT4. I have synced this with Kai.