tomsanbear / bitnet-rs

Implementing the BitNet model in Rust
MIT License
22 stars, 1 fork

The implementation of bitlinear is wrong #1

Closed: suzuke closed this issue 6 months ago

suzuke commented 6 months ago

The implementation of bitlinear in kyegomez/BitNet is totally wrong.

Gamma, beta, and alpha are calculated from the weights and the input before quantization. These parameters are then used to binarize the weights and quantize the input. The binarized weights and quantized input go through the linear operation to produce the output, which is then dequantized using the previously calculated gamma and beta. It is not meaningful to calculate gamma and beta separately for the quantization and dequantization stages, and the implementation of grouping here makes no sense at all.
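
The flow described above can be sketched in plain Rust. This is only an illustration, not the code from either repository: the names `Scales`, `compute_scales`, and `bit_linear` are hypothetical, and the formulas (alpha as the mean weight, beta as the mean absolute weight, gamma as the maximum absolute input, clipping to a range of roughly ±2^(bits-1)) are assumptions based on my reading of the BitNet paper. LayerNorm and grouping are left out, and the small floor on gamma is my own guard against division by zero. The point is that the scales are computed once, before quantization, and the same beta and gamma are reused for dequantization.

```rust
/// Scales computed once, before quantization, and reused for dequantization.
/// (Hypothetical type, for illustration only.)
struct Scales {
    alpha: f32, // mean of the weights, used to center them before taking the sign
    beta: f32,  // mean absolute weight, used to rescale the output
    gamma: f32, // max absolute input, used to scale the activations
}

const EPS: f32 = 1e-5;

fn compute_scales(weights: &[Vec<f32>], input: &[f32]) -> Scales {
    let count = weights.iter().map(|row| row.len()).sum::<usize>() as f32;
    let alpha = weights.iter().flatten().sum::<f32>() / count;
    let beta = weights.iter().flatten().map(|w| w.abs()).sum::<f32>() / count;
    // Assumption: floor gamma at EPS so an all-zero input does not divide by zero.
    let gamma = input.iter().fold(0.0_f32, |m, x| m.max(x.abs())).max(EPS);
    Scales { alpha, beta, gamma }
}

/// One output value per weight row; `bits` is the activation bit width.
fn bit_linear(weights: &[Vec<f32>], input: &[f32], bits: u32) -> Vec<f32> {
    // 1. Compute alpha, beta, gamma from the unquantized weights and input.
    let s = compute_scales(weights, input);
    let q_b = 2f32.powi(bits as i32 - 1);

    // 2. Quantize the input: scale by q_b / gamma, then clip.
    let x_q: Vec<f32> = input
        .iter()
        .map(|x| (x * q_b / s.gamma).clamp(-q_b + EPS, q_b - EPS))
        .collect();

    weights
        .iter()
        .map(|row| {
            // 3. Binarize each weight to +1 or -1 around alpha, then accumulate.
            let acc: f32 = row
                .iter()
                .zip(&x_q)
                .map(|(w, x)| {
                    let w_b = if *w - s.alpha > 0.0 { 1.0 } else { -1.0 };
                    w_b * x
                })
                .sum();
            // 4. Dequantize with the same beta and gamma computed in step 1.
            acc * s.beta * s.gamma / q_b
        })
        .collect()
}
```

For example, with weights `[[1.0, -1.0], [0.5, -0.5]]`, input `[2.0, -1.0]`, and `bits = 8`, the sketch gives alpha = 0, beta = 0.75, and gamma = 2, and both outputs come out at about 2.25, against exact products of 3.0 and 1.5.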

suzuke commented 6 months ago

The issues I mentioned have been addressed in the commit 6cdb2ea998e843b454f2fbaaef73bc6bf92c305f.

tomsanbear commented 6 months ago

Thanks for the info, I'll take a look at that change and adjust this implementation to match 👍