ggerganov / llama.cpp

LLM inference in C/C++
MIT License
65.59k stars 9.41k forks

Extend llm_build_ffn() to support _scale tensors #8103

Closed Eddie-Wang1120 closed 3 months ago

Eddie-Wang1120 commented 3 months ago

This PR extends llm_build_ffn() to support _scale tensors, which lets BitNet reuse llm_build_ffn() and may also suit other models that use per-tensor quantization.
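
To illustrate what a per-tensor _scale means for a feed-forward projection, here is a minimal standalone sketch in plain C++. It does not use the ggml API and is not the code from this PR. The helper name `linear_scaled` and its parameters are hypothetical, and it assumes the scale is a single float multiplied into the matmul output, with a null pointer standing for "this weight has no _scale tensor".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of llama.cpp): computes y = (W * x) * scale.
// W is stored row-major with shape [rows x cols].
// scale == nullptr models a weight that has no accompanying _scale tensor,
// in which case the plain matmul result is returned unchanged.
std::vector<float> linear_scaled(const std::vector<float>& w,
                                 std::size_t rows,
                                 std::size_t cols,
                                 const std::vector<float>& x,
                                 const float* scale) {
    assert(w.size() == rows * cols);
    assert(x.size() == cols);

    std::vector<float> y(rows, 0.0f);

    // Plain matrix-vector product.
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            y[r] += w[r * cols + c] * x[c];
        }
    }

    // Optional per-tensor scale: one scalar applied to every output element.
    if (scale != nullptr) {
        for (float& v : y) {
            v *= *scale;
        }
    }
    return y;
}
```

With the 2x2 weight `{1, -1, 0, 1}` and input `{2, 3}`, the unscaled call yields `{-1, 3}`, and passing a scale of `0.5` yields `{-0.5, 1.5}`. Keeping the scale optional is what would let one code path serve both scaled and unscaled models.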