-
Microsoft released a new paper, which contains details and tips on training a ternary LLM. Might be useful!
- https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tip…
-
I am trying to reproduce your experimental results, so I am testing with the llama.cpp you provided. However, when I enable CUDA compute capability using the following command, the following error occ…
-
https://github.com/kyegomez/BitNet/blob/914bad9ba188dfc32e34a0a0a9ee042d7962e604/bitnet/bitbnet_b158.py#L52
I noticed that you're attempting to implement 1.58-bit quantization, but it seems you onl…
-
Thank you for such an inspiring work!
Could you please prepare a single example of the BitNet transformer performing language translation?
For instance, an adapted version of the following demo:
https://…
-
This issue tracks papers that will be added to this repo in the future.
-
## Describe the bug
Instantiating `BitFeedForward()` with `post_act_ln=False` will result in `TypeError: 'NoneType' object is not callable` in the `torch.nn` module. (Full traceback shown in “To Repr…
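For readers hitting the same message, here is a minimal sketch of one common way it arises in PyTorch. It assumes (the truncated report above does not confirm this) that the optional post-activation LayerNorm ends up as `None` inside an `nn.Sequential`. `build_ffn` and its `safe` flag are hypothetical stand-ins, not the repo's code.

```python
import torch
from torch import nn


def build_ffn(dim: int, hidden: int, post_act_ln: bool, safe: bool) -> nn.Sequential:
    """Hypothetical stand-in for a feed-forward block with an optional LayerNorm."""
    if post_act_ln:
        norm = nn.LayerNorm(hidden)
    else:
        # `None` is accepted when the container is built but fails when called;
        # `nn.Identity()` is a no-op module that is safe to call.
        norm = nn.Identity() if safe else None
    return nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), norm, nn.Linear(hidden, dim))


x = torch.randn(2, 8)

# Unsafe variant: the forward pass tries to call the `None` entry.
try:
    build_ffn(8, 16, post_act_ln=False, safe=False)(x)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# Safe variant: `nn.Identity()` passes the activations through unchanged.
y = build_ffn(8, 16, post_act_ln=False, safe=True)(x)

print(error_message)
print(y.shape)
```

If the actual cause matches this assumption, swapping the `None` for `nn.Identity()` keeps the layer list intact without adding a branch to the forward pass.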
-
## Upvote & Fund
- We're using [Polar.sh](https://polar.sh/kyegomez) so you can upvote and help fund this issue.
- We receive the funding once the issue is completed & confirmed by you.
- Thank y…
-
Shouldn't the beta and gamma sizes be `(1, weight.shape[0])`, not `(weight.shape[0], 1)`?
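As a rough illustration of why the two shapes differ, the sketch below assumes a per-output-channel scale and a weight laid out as `(out_features, in_features)`, as in `torch.nn.Linear`. The `scale` tensor and the sizes are hypothetical and are not taken from the repo. It does not settle which shape the repo intends; it only shows what each shape broadcasts against.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
out_features, in_features, batch = 4, 3, 2
weight = torch.randn(out_features, in_features)
x = torch.randn(batch, in_features)
scale = torch.rand(out_features) + 0.5  # hypothetical per-output-channel scale

# Shape (out_features, 1) broadcasts across the rows of the weight matrix.
scaled_weight = weight * scale.view(out_features, 1)
y_from_weight = F.linear(x, scaled_weight)

# Shape (1, out_features) broadcasts across the columns of the layer output.
y_from_output = F.linear(x, weight) * scale.view(1, out_features)

# Scaling the weight rows before the matmul matches scaling the output columns after it.
same = torch.allclose(y_from_weight, y_from_output, atol=1e-6)
print(same, y_from_weight.shape)
```

Under these assumptions, `(weight.shape[0], 1)` is the shape that multiplies the weight itself, while `(1, weight.shape[0])` is the shape that multiplies the output of the linear layer.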
-
**Describe the bug**
I printed the mean and variance of the tensor `y` in `example.py`.
Both values are abnormal, as follows:
> mean and var of BitLinear output:
-0.567935049533844
1149.996…
-
model_id = "h2oai/h2o-danube-1.8b-chat"
![image](https://github.com/kyegomez/BitNet/assets/123802672/e2dcdad7-470e-4202-ad05-ac71c3be3ac6)