Closed Eddie-Wang1120 closed 3 months ago
This PR extends llm_build_ffn() to support _scale tensors, so that BitNet can reuse llm_build_ffn(). The change may also suit other models that use per-tensor quantization.
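To illustrate the idea, here is a minimal, self-contained sketch of a gated FFN in which each projection carries an optional per-tensor scale. It does not use ggml and is not the code in this PR: `ScaledWeight`, `matvec_scaled`, and `ffn` are hypothetical names, and the SiLU-gated layout and the placement of the scale (right after each matrix product) are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a weight tensor with an optional per-tensor scale.
struct ScaledWeight {
    std::vector<float> w;   // row-major weights, rows x cols
    std::size_t rows;
    std::size_t cols;
    const float * scale;    // nullptr when the model ships no _scale tensor
};

// y = (W * x) * scale; the scale step is skipped when no scale is present.
std::vector<float> matvec_scaled(const ScaledWeight & m, const std::vector<float> & x) {
    assert(x.size() == m.cols);
    std::vector<float> y(m.rows, 0.0f);
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            y[r] += m.w[r * m.cols + c] * x[c];
        }
    }
    if (m.scale != nullptr) {
        for (float & v : y) {
            v *= *m.scale;
        }
    }
    return y;
}

static float silu(float v) {
    return v / (1.0f + std::exp(-v));
}

// Gated FFN (assumed layout): down(silu(gate(x)) * up(x)).
// Every projection goes through the same scale-aware helper, so models with
// and without scales share one code path.
std::vector<float> ffn(const ScaledWeight & up, const ScaledWeight & gate,
                       const ScaledWeight & down, const std::vector<float> & x) {
    std::vector<float> u = matvec_scaled(up, x);
    std::vector<float> g = matvec_scaled(gate, x);
    assert(u.size() == g.size());
    for (std::size_t i = 0; i < g.size(); ++i) {
        g[i] = silu(g[i]) * u[i];
    }
    return matvec_scaled(down, g);
}
```

Passing `nullptr` as the scale leaves the plain path unchanged, which is what would let a model without scales and a model with per-tensor scales share a single FFN builder.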