Open NLPV2011 opened 3 months ago
Yes, at least at small model sizes, BitNet b1.58 performs worse than a standard bfloat16-precision model.